Presentation: Petastorm: A Light-Weight Approach to Building ML Pipelines

Track: Papers in Production: Modern CS in the Real World

Location: Cyril Magnin III

Duration: 10:40am - 11:20am

Day of week: Tuesday

Abstract

Data produced and managed by Big Data systems such as Apache Spark and Hive cannot be consumed directly by Deep Learning frameworks such as TensorFlow and PyTorch. Petastorm bridges this gap by letting TensorFlow and PyTorch read data stored in Apache Parquet format directly. In this talk, we describe how Petastorm facilitates tighter integration between the Big Data and Deep Learning worlds, simplifies data management and data pipelines, and speeds up model experimentation.

Speaker: Yevgeni Litvin

Tech Lead @Uber

Yevgeni Litvin is a senior software engineer on the Perception team at Uber Advanced Technology Group (ATG). He builds machine learning infrastructure used to train and deploy models on autonomous vehicles.
