ML in Action

Location: Cyril Magnin III

Day of week: Tuesday

Applied Machine Learning track demonstrating how to train, score, and handle security and fraud use cases.

Track Host:
Soups Ranjan
Director of Data Science @Coinbase

Soups Ranjan is the Director of Data Science at Coinbase, one of the largest bitcoin exchanges in the world. He manages the Risk & Data Science team that is chartered with preventing avoidable losses to the company due to payment fraud or account takeovers. Soups has a PhD in ECE on network security from Rice University. He has previously led the development of Machine Learning pipelines to improve performance advertising at Yelp and Flurry. He is the founder of, a round-table forum for risk professionals in San Francisco to share ideas on stopping bad actors.


10:40am - 10:50am

When Do You Use ML vs. a Rules Based System?

Soups Ranjan, Director of Data Science @Coinbase

11:00am - 11:50am

Counterfactual Evaluation of Machine Learning Models

Stripe processes billions of dollars in payments a year and uses machine learning to detect and stop fraudulent transactions. Like models used for ad and search ranking, Stripe's models don't just score—they dictate actions that directly change outcomes. High-scoring transactions are blocked before they can ever get refunded or disputed by the card holder. Deploying an initial model that successfully blocks a substantial amount of fraud is a great first step, but since your model is altering outcomes, subsequent parts of the modeling process become more difficult:

  • How do you evaluate the model? You can't observe the eventual outcomes of the transactions you block (would they have been refunded or disputed?) or the ads you didn't show (would they have been clicked?). In general, how do you quantify the difference between the world with the model and the world without it?
  • How do you train new models? If your current model is blocking a lot of transactions, you have substantially fewer samples of fraud for your new training set. Furthermore, if your current model detects and blocks some types of fraud more than others, any new model you train will be biased towards detecting that residual fraud. Ideally, new models would be trained on the "unconditional" distribution that exists in the absence of the original model.

In this talk, I'll describe how injecting a small amount of randomness in the production scoring environment allows you to answer these questions. We'll see how to obtain estimates of precision and recall (standard measures of model performance) from production data and how to approximate the distribution of samples that would exist in a world without the original model so that new models can be trained soundly.
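One way to picture the idea is inverse-propensity weighting: a small random fraction of transactions that the model would block is allowed through, and each of those observed outcomes is weighted by the inverse of its probability of being allowed. The sketch below is a minimal illustration under that assumption, not the speaker's implementation; the names `Observed`, `simulate_production`, `estimate_precision_recall`, and `p_allow` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Observed:
    """A transaction that was allowed, so its outcome is known (hypothetical type)."""
    score: float
    is_fraud: bool
    weight: float  # inverse of the probability that this transaction was allowed


def simulate_production(transactions, threshold, p_allow, rng):
    """Apply a block policy that lets a random fraction p_allow of flagged
    transactions through. `transactions` is an iterable of (score, is_fraud)."""
    observed = []
    for score, is_fraud in transactions:
        if score >= threshold:
            # Flagged: normally blocked, but allowed with probability p_allow.
            if rng.random() < p_allow:
                observed.append(Observed(score, is_fraud, 1.0 / p_allow))
        else:
            # Not flagged: always allowed, so the weight is 1.
            observed.append(Observed(score, is_fraud, 1.0))
    return observed


def estimate_precision_recall(observed, threshold):
    """Weighted estimates of precision and recall at `threshold`."""
    flagged_fraud = sum(o.weight for o in observed
                        if o.score >= threshold and o.is_fraud)
    flagged_total = sum(o.weight for o in observed if o.score >= threshold)
    all_fraud = sum(o.weight for o in observed if o.is_fraud)
    precision = flagged_fraud / flagged_total if flagged_total else float("nan")
    recall = flagged_fraud / all_fraud if all_fraud else float("nan")
    return precision, recall


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data for illustration only: fraud is likelier at high scores.
    data = []
    for _ in range(100_000):
        s = rng.random()
        data.append((s, rng.random() < 0.3 * s * s))
    obs = simulate_production(data, threshold=0.8, p_allow=0.05, rng=rng)
    print(estimate_precision_recall(obs, threshold=0.8))
```

The same weights could plausibly be passed as sample weights when training a new model, so that the training set approximates the distribution that would exist without the original model. A smaller `p_allow` lets less fraud through but makes the estimates noisier.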

Michael Manapat, Head of Conversion Products @Stripe

12:50pm - 1:00pm

JupyterLab: The Next Generation Jupyter Web Interface

Jason Grout, Scientific Software Developer @Bloomberg & JupyterLab / Sage Core Contributor

1:10pm - 2:00pm

Measuring Business Impact of Machine Learning System

I will provide an overview of how to do metric-driven ML system development, primarily to answer the following questions:

  • How do you bootstrap a machine learning system for new objectives?
  • How do you tie the ML system's performance to business goals?
  • Do precision & recall always work as primary metrics?
  • How do you measure the effectiveness of the entire ML system in the context of product success?

These questions will be answered in the context of a fraud detection case study.

Jevin Bhorania, Cash Data Science Lead @Square

2:25pm - 2:35pm

Machine Learning: Predicting Demand in Fashion

Ritesh Madan, VP Engineering @celect

2:45pm - 3:35pm

ML in Action [Talk Moved to 4:20 pm Codelab Room]

More details to come soon.


4:00pm - 4:10pm

Optimizing Fraud Model Thresholds @Airbnb

Dave Press, Data Science Manager @Airbnb

4:20pm - 5:10pm

Machine-Learning for Trust & Safety at Airbnb

In this talk, I will review some of the Trust & Safety challenges faced by Airbnb and other peer-to-peer marketplaces. Getting a deep understanding of the user’s identity is the foundation of trust for such marketplaces, where transactions are born online, but transition to offline and often intimate interactions. We shall cover the three crucial stages of establishing trustworthiness of a user:

(1) “verification” of the user’s identity;
(2) “screening” the past of the user;
(3) “predicting” the future risk in the behavior of this user.

We shall focus on the machine-learning challenges in each of these stages, and some of the solutions that have proven successful at Airbnb and Trooly.

Anish Das Sarma, Engineering Manager @Airbnb


  • Deep Learning Applications & Practices

    Deep learning lessons using tooling such as TensorFlow & PyTorch, across domains like large-scale cloud-native apps and fintech, and tackling concerns around interpretability of ML models.

  • Predictive Data Pipelines & Architectures

    Best practices for building real-world data pipelines doing interesting things like predictions, recommender systems, fraud prevention, ranking systems, and more.

  • ML in Action

    Applied track demonstrating how to train, score, and handle common machine learning use cases, including a heavy concentration in the space of security and fraud.

  • Real-world Data Engineering

    Showcasing DataEng tech and highlighting the strengths of each in real-world applications.