You are viewing content from a past/completed QCon -

Presentation: Document Digitization: Rethinking OCR with Machine Learning

Track: Solving Software Engineering Problems with Machine Learning

Location: Cyril Magnin III

Duration: 10:00am - 10:40am


When you think about document digitisation from a business process optimization perspective, performing OCR alone does not truly solve the problem. We at omni:us are building AI systems that support the insurance industry in handling claims. To achieve this, we perform various human-esque activities on many different types of documents, such as page and document classification, information extraction, and semantic understanding, to name a few. These activities help deliver structured information from highly unstructured documents. This structured information is then used for activities such as fraud detection, validation, and automated claims settlement.


This talk will outline:

  • The problems we faced, and the approaches we took, when building deep learning networks for the information extraction process
  • The thought process behind why and how we chose certain deep learning strategies
  • The requirement for supervised learning
  • Limitations of deep learning networks
  • Planning and executing research activities in short cycles
  • Evolution of team structures to support AI product building
  • Engineering practices required for building AI systems


Speaker: Nischal Harohalli Padmanabha

VP of Engineering and Data Science at @Omnius

Nischal HP is currently the VP of Engineering and Data Science at the Berlin-based AI startup omni:us, which builds AI products for the insurance industry. Previously, he was a cofounder and data scientist at Unnati Data Labs, where he worked on building end-to-end data science systems in the fields of fintech, marketing analytics, event management, and the medical domain. Nischal is also a mentor for data science on Springboard. During his tenure at former companies like Redmart and SAP, he was involved in architecting and building software for ecommerce systems in catalog management, recommendation engines, sentiment analyzers, data crawling frameworks, intention mining systems, and gamification of technical indicators for algorithmic trading platforms. Nischal has conducted workshops in the field of deep learning and has spoken at a number of data science conferences, including PyCon Canada 2018, O'Reilly Strata San Jose 2017, PyData London 2016, PyCon Czech Republic 2015, Fifth Elephant India (2015 and 2016), and Anthill, Bangalore 2016. He is a strong believer in open source and loves to architect big, fast, and reliable AI systems. In his free time, he enjoys traveling with his significant other, music, and grokking the web.
