
Spark and Python for Big Data with PySpark. Distributed Machine Learning


sambilkar/Spark-and-Python-for-Big-Data-with-PySpark


Spark-and-Python-for-Big-Data-with-PySpark

Python + Spark = PySpark

Why Learn Python and Spark?

Indeed.com has reported Spark to be one of the most valuable tech skills to learn. Demand for Spark and Big Data skills has exploded, and Spark is quickly becoming one of the most powerful Big Data tools. When working in memory, it can run programs up to 100x faster than Hadoop MapReduce.

  1. Setting Up Python [VM + Ubuntu]
  2. Databricks Setup
  3. Local Virtual Setup [Python + Spark = PySpark]
  4. AWS EC2 PySpark
  5. AWS EMR Cluster Setup
  6. Crash Course Python
  7. Learn Spark basics
  8. Learn Spark's machine learning library (MLlib) using the latest DataFrame API
  9. Spark Streaming!
