Workshop: Building Recommender Systems w/ Apache Spark 2.x

Location: Cyril Magnin III

Duration: 9:00am - 4:00pm

Day of week: Monday

Level: Beginner

Prerequisites

Apache Spark has become one of the must-know big data technologies due to its speed, ease of use, and versatility. Spark can be used for performing data analysis and building big data applications. Increasingly, companies are leveraging Apache Spark to build intelligent applications that use machine learning techniques. This workshop will start by covering the major features in Spark 2.x and then focus on building a recommendation system using the Spark MLlib library. It will include focused, interactive hands-on exercises.

Here is what you can expect to learn from this tutorial:

  • Spark architecture and execution model
  • Structured data processing with Spark SQL, DataFrames, and Datasets
  • Stream processing with Structured Streaming
  • Major concepts and utilities in the Spark ML library for building intelligent applications
  • Building a recommender system using the Spark ML library

Speaker: Hien Luu

Engineering Manager @LinkedIn focused on Big Data

Hien Luu is an engineering manager at LinkedIn and a big data enthusiast. He is particularly passionate about the intersection of Big Data and Artificial Intelligence. Teaching is one of his passions, and he currently teaches an Apache Spark course at the UCSC Silicon Valley Extension school. He has given presentations at various conferences, including QCon SF, QCon London, Hadoop Summit, JavaOne, ArchSummit, and Lucene/Solr Revolution.

Proposed Tracks

  • Real-World Data Engineering

    Showcasing DataEng tech and highlighting the strengths of each in real-world applications.

  • Deep Learning Applications & Practices

    Deep learning lessons using TensorFlow, Keras, PyTorch, and Caffe across machine translation and computer vision.

  • AI Meets the Physical World

    The track where AI touches the physical world: think drones, ROS, NVIDIA, TPU, and more.

  • Data Architectures You've Always Wondered About

    How did they do that? Real-time predictive pipelines at places like Uber, self-driving cars at Google, and robotic warehouses from Ocado in the UK are all possible examples.

  • Applied ML for Software

    Practical machine learning inside the data centers and on software engineering teams.

  • Time Series Patterns & Practices

    Stocks, ad tech and real-time bidding, and anomaly detection. Patterns and practices for more effective time series work.