Friday, May 5, 2017

Installation of Apache Spark on Windows 10

Apache Spark is an open-source cluster-computing framework. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

Please follow the instructions below to install Apache Spark on Windows 10.

Prerequisites:

Please ensure that you have JDK 1.8 or above installed in your environment.

Steps:

Installation of Scala 2.12.2
  • Scala can be downloaded from here.
  • The download will give you a .msi file. Follow the instructions and install Scala.
  • Once installed, you can verify Scala from a new Command Prompt, as shown below.
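A quick way to confirm the install, assuming the .msi added Scala's bin folder to your PATH (it does by default). Open a new Command Prompt and run:

    REM Print the installed Scala version
    scala -version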

Installation of Spark


  • Spark can be downloaded from here.
  • I am choosing version 2.1.1, prebuilt for Hadoop. Please note, I shall be running this without a Hadoop installation.
  • Extract the downloaded .tgz file into a folder called c:\Spark (a command-line sketch follows below).
  • The extracted folder will contain subfolders such as bin, conf, examples, and jars.
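A minimal extraction sketch, assuming 7-Zip is installed and the archive is named spark-2.1.1-bin-hadoop2.7.tgz (the exact name depends on the package you selected); any tool that can unpack .tgz files works equally well:

    REM First pass removes the gzip compression, second pass unpacks the tar
    7z x spark-2.1.1-bin-hadoop2.7.tgz
    7z x spark-2.1.1-bin-hadoop2.7.tar -oC:\Spark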

Download Winutils


  • Download winutils.exe from this link: 64 bits
  • Create a folder c:\Spark\Winutils\bin and copy winutils.exe there (see the sketch below)
  • The resulting file should sit at c:\Spark\Winutils\bin\winutils.exe
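The same step from a Command Prompt, assuming winutils.exe landed in your Downloads folder:

    REM Create the folder and move winutils.exe into place
    mkdir C:\Spark\Winutils\bin
    copy %USERPROFILE%\Downloads\winutils.exe C:\Spark\Winutils\bin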
Set Up Environment Variables


  • The following environment variables will need to be set up (a command-line sketch follows this list):
    • JAVA_HOME: C:\jdk1.8.0_91
    • SCALA_HOME: C:\Program Files (x86)\scala\bin
    • _JAVA_OPTIONS: -Xms128m -Xmx256m
    • HADOOP_HOME: C:\Spark\WinUtils
    • SPARK_HOME: the folder you extracted Spark into, e.g. C:\Spark\spark-2.1.1-bin-hadoop2.7
  • Create a folder c:\tmp\hive and give it read/write/execute privileges for all (winutils can do this, as shown below)
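A minimal sketch of both steps, assuming the paths above (setx persists user-level variables; open a new Command Prompt afterwards so they take effect):

    REM Persist the environment variables for the current user
    setx JAVA_HOME "C:\jdk1.8.0_91"
    setx SCALA_HOME "C:\Program Files (x86)\scala\bin"
    setx _JAVA_OPTIONS "-Xms128m -Xmx256m"
    setx HADOOP_HOME "C:\Spark\WinUtils"
    setx SPARK_HOME "C:\Spark\spark-2.1.1-bin-hadoop2.7"

    REM Create the Hive scratch folder and open its permissions to everyone
    mkdir C:\tmp\hive
    C:\Spark\WinUtils\bin\winutils.exe chmod 777 \tmp\hive
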
Test Spark Environment

  • Navigate to %SPARK_HOME%\bin and execute the command spark-shell
You should be ready to use Spark; a short smoke test is sketched below.
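A minimal check, assuming everything above is in place. spark-shell opens a Scala prompt with a preconfigured SparkSession named spark; counting a small generated dataset confirms the installation works end to end:

    cd %SPARK_HOME%\bin
    spark-shell

    scala> spark.range(1000).count()
    res0: Long = 1000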
