Apache Spark standalone cluster on Windows

A Spark standalone cluster consists of the following components:

  1. Master
  2. Worker
  3. Resource Manager

[Image: Spark cluster overview]
Before you begin, keep two things in mind:

  1. Avoid having spaces in the installation folders of Hadoop and Spark.
  2. Always start the Command Prompt with administrator rights, i.e., with the Run as administrator option.
To set up the cluster, follow these steps (a sketch of the corresponding commands appears after the list):

  1. Download the JDK and add JAVA_HOME=<path_to_jdk> as an environment variable.
  2. Download Spark and add SPARK_HOME=<path_to_spark>. If you download Spark pre-built for a particular Hadoop version, there is no need to download Hadoop explicitly in step 3.
  3. Download Hadoop, add HADOOP_HOME=<path_to_hadoop>, and add %HADOOP_HOME%\bin to the PATH variable.
  4. Download winutils.exe (built for the same Hadoop version as above) and place it under %HADOOP_HOME%\bin.
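
As a minimal sketch, assuming the JDK, Spark, and Hadoop were extracted to C:\jdk, C:\spark, and C:\hadoop (placeholder paths, substitute your own), the variables can be set from an elevated Command Prompt with setx:

rem Set machine-wide environment variables (requires Run as administrator).
setx JAVA_HOME "C:\jdk" /M
setx SPARK_HOME "C:\spark" /M
setx HADOOP_HOME "C:\hadoop" /M
rem Append Hadoop's bin folder so winutils.exe is found; note that setx
rem truncates values longer than 1024 characters, so editing PATH through
rem the System Properties dialog is safer for long paths.
setx PATH "%PATH%;C:\hadoop\bin" /M

Open a new Command Prompt afterwards (setx does not affect the current session) and verify the setup:

java -version
%SPARK_HOME%\bin\spark-shell --version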
Start the master by running the following command from the Spark installation folder; the master binds to the given IP address and listens on port 7077 by default:

bin\spark-class org.apache.spark.deploy.master.Master --host <master_ip>

Then start one or more workers, pointing each one at the master's URL:

bin\spark-class org.apache.spark.deploy.worker.Worker spark://<master_ip>:<port> --host <worker_ip>
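
To confirm that the cluster accepts applications, one quick check (a sketch; the exact examples jar name depends on your Spark and Scala versions, so <spark_examples_jar> is a placeholder) is to submit the bundled SparkPi example against the master URL:

rem Submit the example application to the standalone master (default port 7077).
bin\spark-submit --master spark://<master_ip>:7077 --class org.apache.spark.examples.SparkPi examples\jars\<spark_examples_jar> 100

On success, the driver output includes a line such as "Pi is roughly 3.14...".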
Once the master and workers are up, the master's web UI is available at:

http://<master_ip>:8080

[Image: Spark UI]
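
For an interactive check, spark-shell can be pointed at the same master URL (again assuming the default port 7077); the shell should then appear under Running Applications in the master UI:

rem Start an interactive shell that runs against the standalone cluster.
bin\spark-shell --master spark://<master_ip>:7077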
