You are viewing content from a past/completed QCon

Track: Handling Sequential Data Like an Expert / ML Applied to Operations

Location: Cyril Magnin II

Day of week: Wednesday

Discussing the complexities of time, including HyperLogLog, Count-Min Sketch, and more / Machine learning in the data center. Exploring topics like dynamic rebalancing in Dataflow, predictive auto-scaling, and fault prediction.

Track Host: Brad Klingenberg

VP Data Science @StitchFix

Brad Klingenberg leads a team of 20+ data scientists working on human-in-the-loop machine learning at Stitch Fix. His team develops the recommendation algorithms that guide Stitch Fix stylists, the human experts who curate the items selected for clients. The team also matches clients with stylists and measures, monitors, and optimizes the role of human selection in the recommendation system.

9:00am - 9:10am

Introduction to Forecasting

Franziska Bell, Senior Data Science Manager @Uber

9:20am - 10:10am

Understanding Software System Behavior With ML and Time Series Data

Powered by the rise of cloud technology and ubiquitous mobile connectivity, software systems have utterly transformed daily life and the global economy. However, the reliable operation of these systems has been made increasingly difficult by their sheer scale, complexity, and rapid pace of evolution.
In this talk, we discuss how time series datasets collected from running software can be combined with machine learning techniques to help understand system behavior and improve performance and uptime.

David Andrzejewski, Engineering Manager @SumoLogic

10:35am - 10:45am

Deep Learning for Language Understanding (at Google Scale)

Anjuli Kannan, Software Engineer @GoogleBrain

10:55am - 11:45am

Counting is Hard: Probabilistic Algorithms for View Counting at Reddit

While counting votes has always been a core feature of Reddit's platform, only recently did we begin counting and displaying view numbers. In this talk, we explain the challenges of building a view counting system at scale, and how we used probabilistic counting algorithms to make scaling easier.

Krishnan Chandra, Data Engineer @Reddit

12:45pm - 12:55pm

Serverless for Data Science

Mike Lee Williams, Research Engineer @Cloudera Fast Forward Labs

1:05pm - 1:55pm

A Cost-Sensitive Approach for Resource Allocation in Virtual Machines

Dor Kedem, Senior Data Scientist @ING Nederland

2:20pm - 2:30pm

A/B Testing for Logistics: It All Depends

At Instacart, we deliver lots of groceries. To make sure customers receive their deliveries on time, we continuously improve our dispatching engine, which decides which orders each shopper should fulfill. A big challenge is measuring the outcome caused by a particular algorithm change.

In this talk, we will explain how we A/B test changes in our logistics system, where neither customers nor shoppers are independent. We will also discuss how multivariate regression is used to speed up our pace of innovation.

Jingjie Xiao, Data Scientist @Instacart

2:40pm - 3:30pm

Demand Modeling @StitchFix

Stephanie Yee, Data Scientist @StitchFix
