Launch Spark on Windows - Simplified
using Kubernetes on Docker
Estimated time: 15 minutes to read this article, plus about an hour for the lab setup
A couple of years back, setting up Spark on Windows was a big task. Thanks to the new world of containers (Docker) and orchestration (Kubernetes), it has become a piece of cake.
In the last couple of weeks, a few dear friends asked me whether there is a way to launch the Spark shell using Kubernetes or Docker, and that prompted me to write this article.
My answer is YES, there is. How? Simply by running Kubernetes on Docker.
Let's get started.
Step 1:
The first and most important step is to have Docker Desktop (which includes Docker Engine) installed and running on your machine. I tried this setup on Windows 8 and Windows 10 and it worked exactly the same way. Of course, you will need to sign up for an account on https://hub.docker.com/
See here for setup instructions: https://docs.docker.com/docker-for-windows/install/
Once you are done with the setup, open the Windows Command Prompt to run the commands given in the following steps.
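Before moving on, it can help to confirm that the command-line tools used in the following steps are actually on your PATH. Here is a minimal sketch, assuming a Bash-like shell such as Git Bash or WSL rather than the plain Command Prompt. The helper name check_tool is my own and is not part of Docker, minikube, or kubectl.

```shell
#!/usr/bin/env bash
# check_tool is a hypothetical helper: it reports whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Check the three CLIs this walkthrough relies on; keep going even if one is missing.
for tool in docker minikube kubectl; do
  check_tool "$tool" || true
done
```

If you stay in the plain Command Prompt, running where docker (and likewise for the other two tools) gives a similar answer.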
Step 2: docker ps
By default, no container is up and running, so the list should be empty. Our goal is to launch minikube on Docker and then use it to set up Spark.
Note that minikube is installed separately from Docker. If you want to learn about minikube, here is the link: https://kubernetes.io/docs/setup/learning-environment/minikube/
Step 3: minikube start
Step 4: docker ps
You should see the minikube container running.
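If you would rather check this from a script than read the docker ps table by eye, something like the following may work. It is only a sketch, assuming a Bash-like shell (Git Bash or WSL) and assuming the container carries minikube's default name, minikube. The function name container_running is made up for this example.

```shell
#!/usr/bin/env bash
# container_running is a hypothetical helper: it succeeds only if a running
# container has exactly the given name (-x whole line, -F literal match).
container_running() {
  docker ps --format '{{.Names}}' | grep -qxF -- "$1"
}

# Assumes the default container name "minikube".
if container_running minikube; then
  echo "minikube container is up"
else
  echo "minikube container not found; try running: minikube start"
fi
```

The --format option prints just the container names, one per line, which makes the match much simpler than parsing the full table.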
Step 5: kubectl cluster-info
Step 6: Create deployment and services for Spark master and worker
kubectl apply -f https://raw.githubusercontent.com/big-data-europe/docker-spark/master/k8s-spark-cluster.yaml
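The pods need a little while to pull their images and start, so it may be worth waiting for them before launching the shell. A minimal sketch, assuming a Bash-like shell and assuming the Spark pods were created in the current (default) namespace with nothing else running there. The function name wait_for_spark and the 180s default timeout are my own choices, not something the YAML file defines.

```shell
#!/usr/bin/env bash
# wait_for_spark is a hypothetical helper: it blocks until every pod in the
# current namespace reports Ready (or the timeout expires), then lists the
# pods and services so you can confirm the spark-master service exists.
wait_for_spark() {
  local timeout="${1:-180s}"   # assumed default; pass another value to override
  kubectl wait --for=condition=Ready pod --all --timeout="$timeout" &&
    kubectl get pods,svc
}
```

Call it as wait_for_spark, or as wait_for_spark 300s on a slow connection. In the plain Command Prompt, running kubectl get pods a few times until every pod shows Running does the same job.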
Step 7: Launch Spark Shell
kubectl run spark-base --rm -it --labels="app=spark-client" --image bde2020/spark-base:2.4.5-hadoop2.7 -- bash ./spark/bin/spark-shell --master spark://spark-master:7077 --conf spark.driver.host=spark-client
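Step 8: Try out some RDD commands
For example, typing sc.parallelize(1 to 100).sum() at the scala> prompt should return 5050.0.
Press Ctrl + C to exit the Spark shell. Because the pod was started with --rm, it is deleted when the shell exits.
Enjoy!!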