Sunday, March 15, 2020

Apache Spark 2.4.5: Installation on Ubuntu on AWS.



  • Download the Spark release from the Apache Spark downloads page (https://spark.apache.org/downloads.html); this walkthrough uses spark-2.4.5-bin-hadoop2.7.
  • Unpack the archive.
  •  tar -xvf spark-2.4.5-bin-hadoop2.7.tgz
  •  Move the resulting folder and create a symbolic link so that you can have multiple versions of Spark installed.
  • sudo mv spark-2.4.5-bin-hadoop2.7 /usr/local 
  • sudo ln -s /usr/local/spark-2.4.5-bin-hadoop2.7/ /usr/local/spark
  • cd /usr/local/spark/
  • Also add SPARK_HOME to your environment (a sketch for making this survive new shells follows this list).
  • export SPARK_HOME=/usr/local/spark
  • Start a standalone master server. At this point you can browse to http://localhost:8080/ to view its status.
  • $SPARK_HOME/sbin/start-master.sh 
starting org.apache.spark.deploy.master.Master, logging to /usr/local/spark/logs/spark-osboxes-org.apache.spark.deploy.master.Master-1-osboxes.out 
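
To make SPARK_HOME stick across sessions, one common approach (a minimal sketch, assuming a bash shell and the /usr/local/spark symlink created above) is to append it to ~/.bashrc, together with Spark's bin and sbin directories on the PATH. The curl line is just an optional check that the master web UI on port 8080 is answering.

    # Persist SPARK_HOME and put the Spark scripts on the PATH for future sessions
    echo 'export SPARK_HOME=/usr/local/spark' >> ~/.bashrc
    echo 'export PATH=$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH' >> ~/.bashrc
    source ~/.bashrc

    # Optional: confirm the standalone master web UI is up on port 8080
    curl -s http://localhost:8080/ | grep -i "spark master"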




Start a Slave Process

$SPARK_HOME/sbin/start-slave.sh spark://osboxes:7077

For the worker to resolve the master's hostname (osboxes here), make sure /etc/hosts has entries like:


127.0.0.1       localhost
127.0.1.1       osboxes
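
Before starting the worker it is worth a quick check that the hostname resolves and that the master is listening on port 7077; this is an optional sketch (nc comes from the netcat package on Ubuntu).

    # Confirm the hostname resolves to the expected address
    getent hosts osboxes

    # Confirm the master's RPC port is reachable
    nc -zv osboxes 7077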

To check the logs:

vi /usr/local/spark/logs/spark-osboxes-org.apache.spark.deploy.worker.Worker-1-osboxes.out 
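
A quicker way to confirm the worker actually registered with the master is to grep the same log for the registration message (the log file name follows the hostname, osboxes in this example, so yours will differ):

    # Look for the worker's "registered with master" line in its log
    grep -i "registered with master" \
        /usr/local/spark/logs/spark-osboxes-org.apache.spark.deploy.worker.Worker-1-osboxes.out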

Test out the Spark shell. You’ll note that this exposes the native Scala interface to Spark. 
    $SPARK_HOME/bin/spark-shell
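
By default spark-shell runs with a local master; to exercise the standalone cluster started above, pass the master URL explicitly. The one-liner below is a small sanity check (hostname osboxes assumed from the earlier steps) that pipes a single Scala expression into the shell and should print 1000:

    # Run the Scala shell against the standalone master and do a trivial computation
    echo 'println(spark.range(1000).count())' | \
        $SPARK_HOME/bin/spark-shell --master spark://osboxes:7077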



  • To use PySpark
$SPARK_HOME/bin/pyspark
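
Like the Scala shell, pyspark defaults to a local master, so point it at the standalone master to use the worker started earlier. Alternatively, submitting the bundled SparkPi example makes a quick end-to-end smoke test (the examples jar name varies with the Scala version of the build, hence the glob):

    # Attach the Python shell to the standalone master instead of local mode
    $SPARK_HOME/bin/pyspark --master spark://osboxes:7077

    # Or submit the bundled SparkPi example against the cluster
    $SPARK_HOME/bin/spark-submit --master spark://osboxes:7077 \
        --class org.apache.spark.examples.SparkPi \
        $SPARK_HOME/examples/jars/spark-examples_*.jar 10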




To Stop the Slave:
$SPARK_HOME/sbin/stop-slave.sh

To Stop the Master:

$SPARK_HOME/sbin/stop-master.sh
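
When the master and worker run on the same box, the bundled helper scripts can stop (or start) both in one step. Note that stop-all.sh reaches the hosts listed in conf/slaves (localhost if the file is absent) over SSH, so it assumes SSH access to those hosts:

    # Stop the worker(s) and then the master in one go
    $SPARK_HOME/sbin/stop-all.sh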
