Workshop: Building Recommender Systems w/ Apache Spark 2.x

Location: Cyril Magnin III

Duration: 9:00am - 4:00pm

Day of week: Monday

Level: Beginner

Prerequisites

Apache Spark has become one of the must-know big data technologies due to its speed, ease of use, and versatility. It can be used both for data analysis and for building big data applications. Increasingly, companies are using Apache Spark to build intelligent applications based on machine learning techniques. This workshop will start by covering the major features of Spark 2.x and then focus on building a recommendation system using the Spark MLlib library. It will include focused, interactive hands-on exercises.

Here is what you can expect to learn from this workshop:

  • Spark architecture and execution model
  • Structured data processing with Spark SQL, DataFrames, and Datasets
  • Stream processing with Structured Streaming
  • Major concepts and utilities in the Spark ML library for building intelligent applications
  • Building a recommender system using the Spark ML library

Speaker: Hien Luu

Engineering Manager @LinkedIn focused on Big Data

Hien Luu is an engineering manager at LinkedIn and a big data enthusiast. He is particularly passionate about the intersection of Big Data and Artificial Intelligence. Teaching is one of his passions, and he currently teaches an Apache Spark course at the UCSC Silicon Valley Extension school. He has given presentations at various conferences, including QCon SF, QCon London, Hadoop Summit, JavaOne, ArchSummit, and Lucene/Solr Revolution.

Tracks

  • Deep Learning Applications & Practices

    Deep learning lessons using tooling such as TensorFlow & PyTorch, across domains like large-scale cloud-native apps and fintech, and tackling concerns around the interpretability of ML models.

  • Predictive Data Pipelines & Architectures

    Best practices for building real-world data pipelines doing interesting things like predictions, recommender systems, fraud prevention, ranking systems, and more.

  • ML in Action

    Applied track demonstrating how to train, score, and handle common machine learning use cases, with a heavy concentration on security and fraud.

  • Real-world Data Engineering

    Showcasing DataEng tech and highlighting the strengths of each in real-world applications.