Track: Sequential Data: Natural Language, Time Series, and Sound

Location: Cyril Magnin I

Day of week: Wednesday

Techniques, practices, and approaches around time series and sequential data. Expect topics including image recognition, NLP/NLU, preprocessing, and crunching of related algorithms.

A time series is a series of data points indexed (or listed) in time order. Most commonly, a time series is a sequence taken at successive, equally spaced points in time. Sequential data covers problems where the ordering of the data matters. Image processing and NLP/NLU fall in this space. The Grokking Time Series and Sequential Data track looks at the role of time series and sequential data in modern application development.

Track Host: Jendrik Jördening

Data Scientist @Nooxit

Jendrik is Head of Data Science at a stealth startup. He formerly worked at Aurubis and Akka Germany on Data Science and Deep Learning in the field of industry 4.0 and autonomous machines.

At the same time, he took part in the Udacity Self-Driving Car Nanodegree, participating with a group of other Udacity students in the Self-Racing Cars event at the Thunderhill racetrack in California.

9:00am - 9:40am

Practical NLP for the Real World

Most companies have an abundance of text data from which they could extract information. Whether you'd like to classify incoming support ticket requests, detect when a candidate is available for an interview, or suggest useful replies to emails, leveraging NLP techniques can often improve your products and allow you to build entirely new ones.

The field of NLP is often not the most approachable, however. The field ranges from linguistics to cutting edge deep learning, so it can be hard to find tools to build a practical product in a reasonable timeframe.

In this talk, we will cover concrete examples of how to build practical applications using NLP. In the real world, most gains come from improvements to the pipeline, not necessarily the model. For this reason, we will dive into data visualization and labelling, as well as model validation.

We will walk through code, plots, and leave ample room for practical questions.

Emmanuel Ameisen, Head of AI @InsightDataSci

10:00am - 10:40am

Forecasting with Prophet

Forecasting is a common data science task that helps organizations with capacity planning, goal setting, and anomaly detection. Despite its importance, there are serious challenges associated with producing reliable, high-quality forecasts, especially when there is a wide variety of time series and analysts with expertise in time series modeling are relatively rare. To address these challenges, we describe a practical, modular approach to forecasting “at scale” based on a flexible curve-fitting procedure that produces high-quality forecasts across a wide variety of business time series.

Sean Taylor, Core Statistics Team @Facebook

11:00am - 11:40am

Uber's Spatio-Temporal Data

Uber’s Marketplace is the algorithmic brains and decision engine behind our ride-sharing services. Marketplace Forecasting builds and deploys spatio-temporal models and forecasts to enable hyperlocal decision making. Modeling the physical world requires us to reimagine how we look at the basic problem of forecasting.

We will discuss the challenges of modeling the influence of external signals, such as global news events and holidays, on our Marketplace. In the majority of cases, there is limited historical data, and in cases where cities have just launched, there is no data at all. We will briefly cover how different techniques, ranging from linear models to deep learning, generalized embeddings, and cutting-edge AI, help us forecast the future states of the Marketplace and even predict the onset of extreme events before they occur!

Chintan Turakhia, Engineering Manager @Uber

12:00pm - 12:40pm

On a Deep Journey Towards Five Nines

At PayPal, achieving four nines of availability is the norm. In the pursuit of exponentially complex additional nines, the company has recently embarked on applying deep learning to forecasting datacenter metrics. Seq2Seq networks are ripe for application to this difficult problem, but little has been shared with the open community.

Aashish Sheshadri shines a light on how PayPal applies Seq2Seq networks to forecasting CPU and memory metrics at scale. Forecasting gives alerting flows a head start, which reduces MTTD, augments auto-remediation, and consequently improves MTTR. In doing so, Aashish describes the ecosystem and tooling that enable developers at PayPal to experiment with, build, and train ML models while creating reusable, reproducible, and shareable work in the Jupyter and Kubernetes ecosystem.

Aashish Sheshadri, Staff ML Research Engineer @PayPal

1:40pm - 2:20pm

Deep Learning with Audio Signals: Prepare, Process, Design, Expect

Is deep learning alchemy? No! But it relies heavily on tips and tricks, a set of common wisdom that probably works for similar problems. In this talk, I’ll introduce what the audio and music research communities have discovered while playing with deep learning for audio classification and regression: how to prepare and preprocess the audio data, how to design the networks (or choose which one to steal from), and what we can expect as a result.

Keunwoo Choi, Research Scientist @Spotify

2:40pm - 3:20pm

Panel: Sequential Data

In this panel, we will discuss how the different fields within sequential data processing can benefit from each other and which future trends we expect, and we will take your questions.

Joel Grus, Senior Research Engineer @allen_ai
Keunwoo Choi, Research Scientist @Spotify
Emmanuel Ameisen, Head of AI @InsightDataSci
Sean Taylor, Core Statistics Team @Facebook
