How to Install Apache Spark on a Local Machine using Windows
Prerequisites
Before you start, make sure you have the following software installed on your Windows machine:
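Java Development Kit (JDK): Apache Spark requires a JDK. Version 8 is the minimum for many releases, but the exact set of supported Java versions depends on the Spark release, so check the Spark documentation for the release you plan to install. Download a supported JDK from Oracle's website and follow the installation instructions.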
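Python: Current Apache Spark releases require Python 3; support for Python 2.7 has been removed. The minimum Python 3 version depends on the Spark release, so check the Spark documentation. Download Python from Python's website and follow the installation instructions.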
Winutils: Download the version of winutils.exe that corresponds to your installed Hadoop version from this GitHub repository. Create a directory named hadoop on your C:\ drive and place the downloaded winutils.exe file in the bin subdirectory (i.e., C:\hadoop\bin\).
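If you prefer to script the winutils step, the following is a minimal sketch in Python. The function name place_winutils is hypothetical, and the default C:\hadoop location simply mirrors the directory described above.

```python
import shutil
from pathlib import Path


def place_winutils(downloaded_exe, hadoop_home=r"C:\hadoop"):
    """Copy winutils.exe into <hadoop_home>\\bin, creating the folders if needed."""
    bin_dir = Path(hadoop_home) / "bin"
    bin_dir.mkdir(parents=True, exist_ok=True)  # creates C:\hadoop and C:\hadoop\bin
    target = bin_dir / "winutils.exe"
    shutil.copy2(downloaded_exe, target)
    return target
```

For example, place_winutils(Path.home() / "Downloads" / "winutils.exe") would copy the file from an assumed Downloads folder; adjust the source path to wherever your browser saved it.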
Step 1: Download Apache Spark
- Visit the Apache Spark official website.
- Select the latest stable release of Apache Spark.
- Choose the package type as "Pre-built for Apache Hadoop".
- Click the "Download" button to download the Spark package.
Step 2: Extract Apache Spark
- Navigate to the downloaded Spark package (usually in the "Downloads" folder).
- Extract the contents of the package using a tool like 7-Zip.
- Move the extracted folder to a desired location (e.g., C:\Spark).
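As an alternative to 7-Zip, the extraction can be sketched with Python's standard tarfile module. This assumes the downloaded package is a gzip-compressed tar archive (.tgz); the function name extract_spark and the archive path you pass in are illustrative, not part of Spark.

```python
import tarfile
from pathlib import Path


def extract_spark(archive, dest=r"C:\Spark"):
    """Unpack a Spark .tgz archive into dest and return the extracted top-level folders."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Collect the top-level folder names so the caller knows what was created.
        top_level = sorted({member.name.split("/")[0] for member in tar.getmembers()})
        tar.extractall(dest_path)
    return [dest_path / name for name in top_level]
```

Note that the archive unpacks into a versioned subfolder, so with this sketch Spark's bin directory ends up one level below C:\Spark. Either move the contents up one level or use the returned folder as your Spark installation path.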
Step 3: Set Environment Variables
- Right-click on "This PC" or "Computer" and select "Properties".
- Click on "Advanced system settings" and then the "Environment Variables" button.
- Click "New" under "System variables" to add the following variables:
Variable name: JAVA_HOME
Variable value: (Path to your JDK installation, e.g., C:\Program Files\Java\jdk1.8.0_291)
If Java is installed under C:\Program Files (x86)\, try the short form of the path, which contains no spaces, e.g., C:\progra~2\Java\jre1.8.0_361. Check that progra~2 is the short name of that folder on your system (running dir /x C:\ lists the short names).
Variable name: HADOOP_HOME
Variable value: C:\hadoop
Variable name: SPARK_HOME
Variable value: (Path to your Spark installation, e.g., C:\Spark)
- Edit the "Path" system variable and append the following paths:
%JAVA_HOME%\bin
%HADOOP_HOME%\bin
%SPARK_HOME%\bin
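Mistakes in these variables are a common reason for a failed first launch, so a quick check can help. The sketch below is an illustration only: check_spark_env and the REQUIRED table are hypothetical names, and the script assumes each bin folder contains the executable listed. Run it from a newly opened Command Prompt so the updated variables are visible.

```python
import os
from pathlib import Path

# Variable name -> a file expected inside its bin folder (assumed layout).
REQUIRED = {
    "JAVA_HOME": "java.exe",
    "HADOOP_HOME": "winutils.exe",
    "SPARK_HOME": "spark-shell.cmd",
}


def check_spark_env(env=None):
    """Return a list of problems found; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []
    for name, exe in REQUIRED.items():
        home = env.get(name)
        if not home:
            problems.append(f"{name} is not set")
        elif not (Path(home) / "bin" / exe).is_file():
            problems.append(f"{name} is set, but bin\\{exe} was not found under {home}")
    return problems


# Print each problem, or a single confirmation line when none were found.
for line in check_spark_env() or ["All three variables look correct."]:
    print(line)
```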
Step 4: Test Apache Spark Installation
- Open a new Command Prompt.
- Type spark-shell and press "Enter". If the installation is successful, you will see the Spark shell starting.
- To exit the Spark shell, type :quit and press "Enter".
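If the spark-shell command is not recognized, the Path entries are the first thing to check. As a rough sketch, the Python snippet below looks up spark-submit (which lives in the same bin folder) on the Path and runs it with --version. The helper name spark_version_output is hypothetical, and the snippet must be run from a newly opened Command Prompt.

```python
import shutil
import subprocess


def spark_version_output(command="spark-submit"):
    """Run `<command> --version` and return its combined output, or None if not on the Path."""
    resolved = shutil.which(command)  # on Windows this also matches spark-submit.cmd
    if resolved is None:
        return None
    result = subprocess.run(
        [resolved, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the version banner may be written to stderr
        text=True,
    )
    return result.stdout
```

A return value of None suggests that %SPARK_HOME%\bin is missing from the Path. Otherwise the returned text should contain the Spark version banner.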
Congratulations! You have successfully installed Apache Spark on your local Windows machine.