Speaker: Holden Karau

Spark Committer & Open Source Developer Advocate

Holden is a transgender Canadian open source developer advocate with a focus on Apache Spark, BEAM, and related "big data" tools. She is the co-author of Learning Spark, High Performance Spark, and another Spark book that's a bit more out of date. She is a committer on the Apache Spark, SystemML, and Mahout projects.  Prior to joining Google as a Developer Advocate she worked at IBM, Alpine, Databricks, Google (yes this is her second time), Foursquare, and Amazon. When not in San Francisco, Holden speaks internationally about different big data technologies (mostly Spark). She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Outside of work she enjoys playing with fire, riding scooters, and dancing.

Find Holden Karau at

Proposed Tracks

  • Real-World Data Engineering

    Showcasing DataEng tech and highlighting the strengths of each in real-world applications.

  • Deep Learning Applications & Practices

    Deep learning lessons using Tensorflow, Keras, PyTorch, Caffe across machine translation, computer vision.

  • AI Meets the Physical World

    The track where AI touches the physical world, think drones, ROS, NVidea, TPU and more.

  • Data Architectures You've Always Wondered About

    How did they do that? Real-time predictive pipelines at places like Uber, Self-Driving Cars at Google, Robotic Warehouses from Ocado in the UK, are all possible examples.

  • Applied ML for Software

    Practical machine learning inside the data centers and on software engineering teams.

  • Time Series Patterns & Practices

    Stocks, ad tech/real-time bidding, and anomaly detection. Patterns and practices for more effective Time Series work.