Presentation: Advanced Topics in Autonomous Driving using Deep Learning
Abstract
Autonomous vehicles need to perceive their surroundings and analyze them to make decisions and act in an environment. More specifically, an autonomous vehicle detects objects on the road and maneuvers through traffic using smart functional modules. In recent years, artificial intelligence, and in particular deep neural networks, has been used widely to build these smart functional modules.
While object detection (putting bounding boxes around objects) and semantic segmentation (labeling each pixel in an image) have been the focus of many researchers in autonomous driving, these methods may fall short when it comes to forming a richer social understanding of pedestrian intent. In this talk, we present our approach to pedestrian intent prediction and communication, which leverages more complex computer vision algorithms that estimate human pose rather than bounding boxes or pixel labels.
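To make the idea concrete, the sketch below shows (in a toy, illustrative form, not the talk's actual model) why pose keypoints carry intent cues that a bounding box cannot: given a dictionary of hypothetical 2D keypoints, a simple heuristic can combine stride width and head orientation to flag a pedestrian who may be about to cross. The keypoint names and the crossing rule are assumptions made for this example.

```python
# Toy pose-based intent heuristic (illustrative only; keypoint names and
# the crossing rule are assumptions, not the approach presented in the talk).
# Each keypoint is an (x, y) pair in normalized image coordinates.

def head_orientation(keypoints):
    """Rough gaze proxy: horizontal offset of the nose from the shoulder midpoint."""
    nose_x = keypoints["nose"][0]
    mid_shoulder_x = (keypoints["l_shoulder"][0] + keypoints["r_shoulder"][0]) / 2
    return nose_x - mid_shoulder_x

def likely_to_cross(keypoints, step_margin=0.15):
    """Toy rule: mid-stride stance plus head turned toward the road (+x side)."""
    stride = abs(keypoints["l_ankle"][0] - keypoints["r_ankle"][0])
    hip_width = abs(keypoints["l_hip"][0] - keypoints["r_hip"][0])
    mid_stride = stride > hip_width + step_margin  # legs wider apart than hips
    facing_road = head_orientation(keypoints) > 0  # assumes road is to the right
    return mid_stride and facing_road

pedestrian = {
    "nose": (0.70, 1.20),
    "l_shoulder": (0.40, 1.00), "r_shoulder": (0.60, 1.00),
    "l_hip": (0.45, 0.50), "r_hip": (0.55, 0.50),
    "l_ankle": (0.20, 0.00), "r_ankle": (0.80, 0.00),
}
crossing = likely_to_cross(pedestrian)
```

A real intent predictor would of course learn such cues from data rather than hand-code them, but the example illustrates the kind of body-configuration signal that pose estimation exposes.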
Increasingly sophisticated models, like the pose estimation networks we describe, show tremendous promise, proving robust at approximating the complex, non-linear mapping functions from images to outputs. However, these models are typically large, with a huge number of parameters, which makes training and inference costly in both time and resources. This makes deploying these networks on resource- and power-constrained embedded systems challenging. In this talk, we also show that compressing neural networks yields smaller models and faster predictions.
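One widely used family of compression techniques is magnitude pruning: zeroing out the smallest-magnitude weights so a layer can be stored and executed sparsely. The sketch below is a generic, minimal illustration of that idea in NumPy; it is not the specific compression method presented in the talk.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.9):
    """Return a copy of `weights` with the smallest `sparsity` fraction zeroed.

    Magnitude pruning keeps only the largest-magnitude weights, on the
    premise that small weights contribute least to the layer's output.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(weights) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
layer = rng.standard_normal((64, 64))          # stand-in for one weight matrix
sparse_layer = prune_by_magnitude(layer, 0.9)  # ~90% of entries are now zero
```

In practice, pruning is usually followed by fine-tuning to recover accuracy, and the sparse weights are paired with a sparse storage format or hardware support to realize the speed and memory savings.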