Presentation: Papers in Production Lightning Talks
This presentation is now available to view on InfoQ.com
Abstract
This slot offers three lightning talks, each presented by an expert in the field, covering papers of interest on topics ranging from data engineering to ML.
"Towards a Solution to the Red Wedding Problem"
Gwen Shapira will present the paper by Christopher S. Meiklejohn, Heather Miller, and Zeeshan Lakhani.
One of the more interesting trends in the AI field is "AI on the edge". There is a lot of value in pushing computation to the edge, because much of the time the edge is where the data comes from. But there are also real challenges. Gwen Shapira will present the paper "Towards a Solution to the Red Wedding Problem", in which the authors introduce and propose a solution to one of the biggest challenges in edge computing: concurrent updates of shared data.
"A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise"
Roland Meerten will present the paper by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu.
This paper describes how you can group data points together by looking at the number of data points around each one. Besides being an efficient way of grouping data, the algorithm can also flag points that are likely noise. Roland will explain the algorithm and show a possible application for autonomous vehicles.
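To make the idea of "looking at the number of data points around each one" concrete, here is a minimal sketch of density-based clustering in the spirit of the paper. It is an illustration under stated assumptions, not the authors' implementation: the function names `dbscan` and `region_query` are chosen here for readability, the parameters `eps` and `min_pts` stand in for the paper's neighborhood radius and minimum point count, and the neighbor search is a brute-force scan rather than a spatial index.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    Assumption: brute-force neighbor search, O(n^2) overall.
    """
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Too few points nearby: tentatively noise. A later cluster
            # may still claim this point as a border point.
            labels[i] = NOISE
            continue
        # points[i] is a core point: start a new cluster and grow it.
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                # j is itself a core point, so the cluster keeps expanding.
                queue.extend(j_neighbors)
    return labels
```

Called on two tight groups of points plus one isolated point, with `eps` large enough to connect each group and `min_pts` of 3, this sketch assigns one label per group and marks the isolated point as `NOISE`.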
"A Machine Learning Approach to Databases Indexes"
Randy Shoup will present the paper written by Alex Beutel, Tim Kraska, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis.
When we think of the "best" data structure for a workload, we classically mean the least worst: the data structure that performs least poorly given a mostly unknown data distribution. We could custom-design an optimal structure if we knew the data distribution, of course, but that involves significant programming effort. Enter machine learning, and Google's research paper "The Case for Learned Indexes" (https://ai.google/research/pubs/pub46518). By replacing a standard B-Tree with a machine-learned model, this paper demonstrates up to a 70% improvement in speed and a 10x improvement in space on real-world workloads. That in itself is revolutionary. Far more importantly, though, it introduces us to a whole new class of techniques for building software.
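As a rough illustration of what "replacing a B-Tree with a model" can mean, the sketch below fits a single straight line that maps a key to its approximate position in a sorted array, records the worst prediction error seen while building, and then searches only within that error window. This is a deliberate simplification and an assumption of this example: the class name `LinearLearnedIndex` is hypothetical, the keys are assumed to be numeric, and one least-squares line stands in for whatever models the paper actually uses.

```python
from bisect import bisect_left


class LinearLearnedIndex:
    """Toy learned index: position is approximated as slope * key + intercept."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("keys must be non-empty")
        self.keys = sorted(keys)
        n = len(self.keys)
        # Ordinary least-squares fit of array position against key value.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case distance between predicted and true position; it bounds
        # the window that lookup() has to search.
        self.max_err = max(
            abs(p - self._predict(k)) for p, k in enumerate(self.keys)
        )

    def _predict(self, key):
        return round(self.slope * key + self.intercept)

    def lookup(self, key):
        """Return an index i with keys[i] == key, or None if key is absent."""
        n = len(self.keys)
        guess = self._predict(key)
        # Clamp the search window to the array bounds.
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        i = bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None
```

The closer the keys are to the fitted line, the smaller `max_err` becomes and the less searching each lookup needs; on evenly spaced keys the window shrinks to a single slot. Data that a line fits poorly would widen the window, and handling that case is outside the scope of this sketch.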