This 8-day hands-on workshop introduces Apache Spark, the open-source cluster computing framework whose in-memory processing can make analytics applications up to 100 times faster than technologies in wide deployment today. Developed in the AMPLab at UC Berkeley, Spark can help reduce the complexity of working with data, increase processing speed and enhance data-intensive, near-real-time applications with deep intelligence. Versatile across many environments and built on a strong foundation in functional programming, Spark is known for its ease of use in creating algorithms that extract insight from complex data. Spark became a top-level Apache project in 2014 and continues to expand today.
Between: 3-25 September 2016, every Saturday and Sunday (the workshop dates are 3, 4, 10, 11, 17, 18, 24 and 25 September). Each session lasts 4 hours and will be held within the 12:00-18:00 interval. The exact schedule will be decided at the end of August.
Selected participants will be contacted privately with further details once registration is complete and the list of participants is finalized.
Introduction to Data Analysis with Spark
Programming with RDDs
Working with Key-Value Pairs
Running on a Cluster
Structured Data with Spark SQL
Building Interactive Data Analytics Apps with Flask and Spark
Advanced Spark Programming
Machine Learning with MLlib
Parallel Graph Processing with GraphX
This workshop requires a solid background in functional programming. Knowledge of Python is nice-to-have, but not mandatory.
You can register for the workshop using the online registration form.
Registration deadline: 2 September 2016, 23:59.
If you have any questions, please ask them here.