Presentation: Streaming SQL to Unify Batch & Stream Processing W/ Apache Flink @Uber

Track: Real-world Data Engineering

Location: Cyril Magnin III

Duration: 2:40pm - 3:30pm

Day of week: Wednesday


SQL is the lingua franca for querying and processing data. To this day, it provides nonprogrammers with a powerful tool for analyzing and manipulating data. But with the emergence of stream processing as a core technology for data infrastructures, can you still use SQL and bring real-time data analysis to a broader audience?

The answer is yes, you can. SQL fits into the streaming world very well and forms an intuitive and powerful abstraction for streaming analytics. More importantly, you can use SQL as an abstraction to unify batch and streaming data processing. Viewing streams as dynamic tables, you can obtain consistent results from SQL evaluated over static tables and streams alike and use SQL to build materialized views as a data integration tool.
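The "streams as dynamic tables" idea can be illustrated with a small sketch. This is plain Python, not Flink's API: the `TRIPS` data, the function names, and the grouped-count query are all hypothetical, chosen only to show that an incrementally maintained result converges to the same answer as a one-shot batch evaluation.

```python
from collections import Counter

# Hypothetical input: (trip_id, city) rows. Illustrative data only.
TRIPS = [(1, "SF"), (2, "NYC"), (3, "SF"), (4, "LA"), (5, "SF")]


def batch_count(rows):
    """Evaluate a grouped count once over a complete, static table.

    Roughly: SELECT city, COUNT(*) FROM trips GROUP BY city
    """
    return dict(Counter(city for _, city in rows))


def streaming_count(rows):
    """Evaluate the same query incrementally over rows arriving one by one.

    Each input row yields one update (city, new_count) to the result,
    which plays the role of a changelog for a dynamic table.
    """
    counts = {}
    for _, city in rows:
        counts[city] = counts.get(city, 0) + 1
        yield city, counts[city]


def materialize(changelog):
    """Apply the updates in order to build a materialized view."""
    view = {}
    for key, value in changelog:
        view[key] = value  # a later update overwrites the earlier one
    return view


changelog = list(streaming_count(TRIPS))
# The view built from the stream matches the batch result.
assert materialize(changelog) == batch_count(TRIPS)
```

After all rows have been consumed, the materialized view and the batch result agree, which is the consistency property the abstract describes. A real engine would also have to handle event time, late data, and state cleanup, which this sketch leaves out.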

Fabian Hueske and Shuyi Chen explore SQL’s role in the world of streaming data and its implementation in Apache Flink, covering fundamental concepts such as streaming semantics, event time, and incremental results. They also share their experience using Flink SQL in production at Uber, explaining how Uber leverages Flink SQL to solve its unique business challenges and how the unified stream and batch processing platform enables both technical and nontechnical users to process real-time and batch data reliably, using the same SQL, at Uber scale.

Speaker: Fabian Hueske

Apache Flink PMC Member & Co-Founder @dataArtisans

Fabian Hueske is a committer and PMC member of the Apache Flink project. He was one of the three original authors of the Stratosphere research system, from which Apache Flink was forked in 2014. Fabian is a cofounder of data Artisans, a Berlin-based startup devoted to fostering Flink, where he works as a software engineer and contributes to Apache Flink. He holds a PhD in computer science from TU Berlin and is currently spending a lot of his time writing a book on "Stream Processing with Apache Flink".

Speaker: Shuyi Chen

Senior Software Engineer II @Uber

Shuyi Chen is a senior software engineer at Uber, where he builds scalable real-time data platforms and solutions. He is the tech lead of Uber’s stream processing platform team. Previously, he built Uber’s real-time complex event processing platform for marketplace, which powers 100+ production real-time use cases. Shuyi has years of experience in storage infrastructure, data infrastructure, and Android and iOS development at both Google and Uber.
