
Presentation: Papers in Production Lightning Talks

Track: Papers in Production: Modern CS in the Real World

Location: Cyril Magnin III

Duration: 3:20pm - 4:00pm



In this slot, we offer three lightning talks, each presented by an expert in the field, covering papers on topics ranging from data engineering to ML.


"Towards a Solution to the Red Wedding Problem" 

Gwen Shapira will present the paper by Christopher S. Meiklejohn, Heather Miller, and Zeeshan Lakhani.


One of the more interesting trends in the AI field is "AI on the edge". There is a lot of value in pushing computation to the edge, because much of the time the edge is where the data comes from. But there are also real challenges. Gwen Shapira will present the paper "Towards a Solution to the Red Wedding Problem", in which the authors describe one of the biggest challenges in edge computing, concurrent updates of shared data, and propose a solution to it.


"A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise"

Roland Meertens will present the paper by Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu.


This paper describes how you can group data points together by looking at the number of other data points around each one. Besides being an efficient way of grouping data, the algorithm is also able to flag points that are likely noise. Roland will explain the algorithm and show a possible application for autonomous vehicles.
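The density idea described above can be illustrated with a minimal sketch. It is an assumption-laden toy, not the paper's optimized implementation: the names `region_query`, `eps`, and `min_pts` are chosen here for illustration, and neighbors are found by a brute-force scan rather than the spatial index a large database would need.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels


# Hypothetical 2D points: two dense groups and one isolated outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, -1]
```

The isolated point at (50, 50) has too few neighbors to start or join a cluster, so it keeps the noise label.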


"The Case for Learned Indexes"

Randy Shoup will present the paper by Alex Beutel, Tim Kraska, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis.


When we think of the "best" data structure for a workload, we classically mean the least worst - the data structure that performs least poorly given a mostly unknown data distribution. We could custom-design an optimal structure if we knew the data distribution, of course, but that involves significant programming effort. Enter machine learning, and Google's research paper "The Case for Learned Indexes". By replacing a standard B-Tree with a machine-learned model, this paper demonstrates up to a 70% improvement in speed and a 10x improvement in space on real-world workloads. That in itself is revolutionary. Far more importantly, though, it introduces us to a whole new class of techniques for building software.
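To make the idea of replacing an index with a model concrete, here is a minimal sketch under simplifying assumptions. It fits a single linear regression from key to position in a sorted array, which is a stand-in chosen for brevity and not the model architecture evaluated in the paper. The class name `LearnedIndex` and its methods are hypothetical, and keys are assumed to be unique numbers.

```python
import bisect


class LearnedIndex:
    """Toy index: a linear model predicts a key's position in a sorted list."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("keys must be non-empty")
        self.keys = sorted(keys)
        n = len(self.keys)
        # Least-squares fit of position ~ slope * key + intercept.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error over the stored keys bounds the search.
        self.max_err = max(abs(p - self._predict(k))
                           for p, k in enumerate(self.keys))

    def _predict(self, key):
        guess = round(self.slope * key + self.intercept)
        return min(max(guess, 0), len(self.keys) - 1)  # clamp to valid range

    def lookup(self, key):
        """Return the position of key, or None if it is not stored."""
        guess = self._predict(key)
        lo = max(0, guess - self.max_err)
        hi = min(len(self.keys), guess + self.max_err + 1)
        # Binary search only inside the model's error window.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


index = LearnedIndex([3, 8, 15, 16, 23, 42, 57, 91])
print(index.lookup(23))  # 4
print(index.lookup(20))  # None
```

The model only narrows the search; the recorded maximum error guarantees that every stored key is still found, so a poor fit costs speed rather than correctness.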

Speaker: Gwen Shapira

Principal Data Architect @Confluent, PMC Member @Kafka, & Committer Apache Sqoop

Gwen is a principal data architect at Confluent, helping customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating microservices, relational, and big data technologies. She currently specializes in building real-time, reliable data processing pipelines using Apache Kafka. Gwen is an author of "Kafka: The Definitive Guide" and "Hadoop Application Architectures", and a frequent presenter at industry conferences. Gwen is also a committer on the Apache Kafka and Apache Sqoop projects. When Gwen isn't coding or building data pipelines, you can find her pedaling on her bike, exploring the roads and trails of California and beyond.


Speaker: Randy Shoup

VP Engineering @WeWork

Randy is a 25-year veteran of Silicon Valley, and has worked as a senior technology leader and executive at companies ranging from small startups, to mid-sized places, to eBay and Google. Randy is currently VP Engineering at WeWork in San Francisco. He is particularly passionate about the nexus of culture, technology, and organization.


Speaker: Roland Meertens

Machine Learning Engineer @Autonomous Intelligent Driving

Roland Meertens is a Machine Learning Engineer at Autonomous Intelligent Driving. He works on the machine learning side of the perception software stack that will be deployed to the autonomous vehicles that will soon roam urban environments in Germany.
