Track: Papers in Production: Modern CS in the Real World

Location: Cyril Magnin III

Day of week: Tuesday

What are the papers making a real-world impact today? This track looks at important papers that are influencing and changing software today. We're exploring topics around speech, infrastructure, self-driving cars, GANs, probabilistic data structures, and more in deep learning. The Papers in Production track aims to show research that is being used in production.

Track Host: Sid Anand

Chief Data Engineer @PayPal

Sid Anand currently serves as PayPal's Chief Data Engineer, focusing on ways to realize the value of data. Prior to joining PayPal, he held several positions including Agari's Data Architect, a Technical Lead in Search @ LinkedIn, Netflix’s Cloud Data Architect, Etsy’s VP of Engineering, and several technical roles at eBay. Sid earned his BS and MS degrees in CS from Cornell University, where he focused on Distributed Systems. In his spare time, he is a maintainer/committer on Apache Airflow, a co-chair for QCon, and a frequent speaker at conferences. When not working, Sid spends time with his wife, Shalini, and their 2 kids.

11:40am - 12:20pm

Scaling Emerging AI Applications with Ray

The next generation of AI applications will continuously interact with the environment and learn from these interactions. To develop these applications, data scientists and engineers will need to seamlessly scale their work from running interactively to production clusters. In this talk, I’ll cover some major open source AI + Data Science libraries my collaborators and I at the RISELab have been working on.
At a high level, I’ll talk about my work on the following: Ray, a distributed execution framework for emerging AI applications; Tune, a scalable hyperparameter optimization framework for reinforcement learning and deep learning; RLlib, an open-source library for reinforcement learning that offers both a collection of reference algorithms and scalable primitives for composing new ones; and Modin, an open-source dataframe library for scaling pandas workflows by changing one line of code.

Peter Schafhalter, Research Assistant @ucbrise

1:20pm - 2:00pm

Scaling Deep Learning

Prabhat, Data and Analytics Group Lead @NERSC

2:20pm - 3:00pm

FAIR: Advances in Speech at Facebook

Vitaliy Liptchinsky, Research Engineering Manager @Facebook

4:20pm - 5:00pm

Building Data Products for Social Good

Facebook partners with humanitarian and academic organizations, as well as community-driven projects, like OpenStreetMap, on a number of Data for Good efforts. Examples of the outputs are

  • the High-Resolution Settlement Layer, for which we identify the locations of human-built structures from high-resolution satellite images and add population data to it in collaboration with Columbia University,
  • our Disaster Maps, which contain aggregated, anonymized information about network coverage and power availability, as well as human mobility in the context of natural disasters,
  • our large-scale input into OpenStreetMap, for which we detect roads from high-resolution satellite images, prepare them for human review, and feed the results into OpenStreetMap.

We will present details about the methods, challenges, and community feedback involved in producing these datasets, as well as the impact they've each had over the last two years.

Andreas Gros, Data Scientist @Facebook
Shankar Iyer, Data Scientist @Facebook

2019 Tracks

  • Grokking Timeseries & Sequential Data

    Techniques, practices, and approaches around time series and sequential data. Expect topics including image recognition, NLP/NLU, preprocessing, and crunching of related algorithms.

  • Deep Learning in Practice

    Deep learning use cases around edge computing, deep learning for search, explainability, fairness, and perception.