Advances of Computer Vision for Urban Transportation

Author: The FourthBrain Team • August 31, 2020

Are collision avoidance systems getting any better?

Urban transportation and autonomous driving (AD) have been hot topics in industrial and academic research for over four years now. This research field has not only highlighted the importance of computer vision for the transportation and automotive industries, but has also ushered in a paradigm shift in which Original Equipment Manufacturers (OEMs), Tier 1 suppliers, startups, national and state governments, and academia have all begun sharing resources to help reach the goals of Level 5 automation.

Level 5 Automation is out of reach with current research limitations

The Society of Automotive Engineers (SAE) has established five levels of automation: level 1 (hands on), level 2 (hands off), level 3 (eyes off), level 4 (mind off) and level 5 (steering wheel optional). With current automotive systems sitting between level 2 and level 4, urban transportation has surely come a long way! For instance, a well-known Level 2+ feature is adaptive cruise control, where the radar-camera unit senses the surrounding traffic and then automatically accelerates or decelerates to maintain the vehicle's speed and its distance from adjoining vehicles.

A car measures the distance to the next vehicle
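
To make the adaptive cruise control idea concrete, here is a minimal sketch of the decision it makes on every control cycle: hold the driver's set speed when the road is clear, and fall back to keeping a safe gap when a slower vehicle is ahead. Everything in it is an illustrative assumption rather than any manufacturer's implementation: the function name acc_command, the 2-second time gap, the controller gains and the acceleration limits are all hypothetical values chosen for readability.

```python
def acc_command(ego_speed, set_speed, gap=None, lead_speed=None,
                time_gap=2.0, standstill_gap=5.0,
                k_speed=0.5, k_gap=0.2, k_rel=0.6,
                max_accel=2.0, max_decel=-3.5):
    """Return a commanded acceleration in m/s^2 (toy model, assumed gains).

    ego_speed, set_speed, lead_speed are in m/s; gap is in metres.
    gap and lead_speed are None when the sensor sees no vehicle ahead.
    """
    # Cruise mode: steer the speed toward the driver's set speed.
    accel = k_speed * (set_speed - ego_speed)

    if gap is not None and lead_speed is not None:
        # Follow mode: the desired gap grows with speed (constant time gap).
        desired_gap = standstill_gap + time_gap * ego_speed
        follow_accel = (k_gap * (gap - desired_gap)
                        + k_rel * (lead_speed - ego_speed))
        # Take the more cautious of the two requests.
        accel = min(accel, follow_accel)

    # Respect comfort and braking limits.
    return max(max_decel, min(max_accel, accel))


# Clear road, driving below the set speed: accelerate (limited to max_accel).
print(acc_command(ego_speed=25.0, set_speed=30.0))
# Slower vehicle only 30 m ahead: brake (limited to max_decel).
print(acc_command(ego_speed=30.0, set_speed=30.0, gap=30.0, lead_speed=25.0))
```

Taking the minimum of the cruise and follow requests is one simple way to let the gap constraint override the set speed. A production system would add sensor filtering, smoother mode switching and many safety checks that this sketch leaves out.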


While such functionalities significantly improve the quality of driving for highways and sparsely populated roads, the complexities posed by urban scenarios call for more innovative solutions.

As an active contributor to the automotive domain, I can fairly say that AD level 5 is highly visionary, since it calls for painting a picture of a future with multiple environmental participants (such as automotive, weather, road-user intentions, road-based restrictions, road-condition restrictions, etc.), for which an optimal solution can never be engineered by any one participant in its own silo!

The “Guardian” System 

This calls for collaborative research initiatives between the environmental participants, much like the research at Toyota Research Institute (TRI).

In TRI’s Technical Report, Section 8 demonstrates the notion of a “Guardian” system that could override a driver’s commands if the vehicle computes an impending collision, especially at urban intersections, when the driver’s own efforts may be untimely or unsuccessful. Such systems, however, would have high computational complexity if the burden of continuous environment sensing and inferencing were to fall on each vehicle individually. In an effort to relax the environment sensing requirements for vehicles, Section 8.21 further presents the idea of leveraging smart bicycle sensors in mixed traffic scenarios to avoid collisions between vehicles and bicycles, which may even be occluded by larger vehicles. Other advances that offload environment-sensing compute from individual vehicles include object detection, road-user tracking and intention prediction using traffic cameras. In the image below, notice that the bounding box for each vehicle is numbered to indicate a tracked instance, which can then help predict road-user trajectories and intentions.

Vehicles on the road are tracked with a unique number
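
The numbered bounding boxes can be illustrated with a small sketch of how a tracker might keep the same number on the same vehicle from one frame to the next. The version below is a deliberately simple greedy matcher based on intersection-over-union (IoU), not the method used by any particular traffic-camera system. The class name IouTracker, the 0.3 threshold and the (x1, y1, x2, y2) box format are assumptions made for illustration.

```python
from itertools import count


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class IouTracker:
    """Toy tracker: a detection inherits the ID of the best-overlapping track."""

    def __init__(self, iou_threshold=0.3):  # threshold is an assumed value
        self.iou_threshold = iou_threshold
        self.tracks = {}            # track ID -> box from the previous frame
        self._next_id = count(1)    # IDs are handed out as 1, 2, 3, ...

    def update(self, detections):
        """Assign an ID to every box detected in the current frame."""
        unmatched = dict(self.tracks)
        assigned = {}
        for box in detections:
            best_id, best_score = None, self.iou_threshold
            for track_id, previous_box in unmatched.items():
                score = iou(box, previous_box)
                if score >= best_score:
                    best_id, best_score = track_id, score
            if best_id is None:
                best_id = next(self._next_id)   # a new road user appeared
            else:
                del unmatched[best_id]          # each track is used only once
            assigned[best_id] = box
        # Simplification: tracks with no match this frame are dropped at once.
        self.tracks = assigned
        return assigned


tracker = IouTracker()
frame_1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame_2 = tracker.update([(52, 50, 62, 60), (1, 0, 11, 10)])
print(frame_1)
print(frame_2)
```

Once each road user carries a stable ID, the sequence of its box positions over time gives a trajectory, which is the raw material for the intention prediction mentioned above. Real trackers typically add motion models and tolerate missed detections, which this sketch does not.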


Although traffic cameras are limited in field of view and range, the information they provide can be securely transmitted to individual vehicles to warn them of pedestrians or bicyclists that may be occluded by larger vehicles, such as a truck.

Long-range Lidar

Another significant advancement for the automotive perception stack has come from long-range Lidars, such as the Luminar Lidar, where a moving-object detection range of up to 250 m implies a reaction time of about 8 seconds for a vehicle traveling at 70 mph (roughly 31 m/s). This is more than three times the reaction time afforded by typical automotive cameras, which can be crucial for avoiding collisions on highways. Although there is significant debate in the urban transportation realm today regarding the utility of Lidar, its importance for long-range target detection and tracking is clear. The fact that long-range perception (from sensors such as a Lidar) is aided by sensing from a height is both intuitive and historically accepted. Take for example the illustrations from the almost 50-year-old Indian comic strip Chacha Chaudhary, where tall Sabu sits on top of Chacha’s truck to look for impending perils further down the road... much like the job a Lidar seeks to perform for cars and trucks in urban scenarios!

Sabu sits on top of Chacha Chaudhary's car


A lidar system on top of a modern car
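
The reaction-time figure quoted for long-range Lidar can be checked with a few lines of arithmetic. The sketch below assumes the simplest possible case, a vehicle at constant speed approaching a stationary object, and the helper name time_to_reach is made up for this example. For 250 m at 70 mph it works out to roughly 8 seconds, in the same ballpark as the figure above. It also derives the sensing range that a "three times shorter" reaction time would correspond to; that number is inferred from the ratio in the text, not taken from any camera specification.

```python
MPH_TO_MPS = 0.44704  # exact conversion: 1 mile = 1609.344 m, 1 hour = 3600 s


def time_to_reach(range_m, speed_mph):
    """Seconds for a vehicle at constant speed to cover range_m metres.

    Assumes the detected object is stationary and the vehicle does not brake.
    """
    if speed_mph <= 0:
        raise ValueError("speed_mph must be positive")
    return range_m / (speed_mph * MPH_TO_MPS)


lidar_range_m = 250.0
speed_mph = 70.0

lidar_time_s = time_to_reach(lidar_range_m, speed_mph)
print(f"Lidar: {lidar_time_s:.1f} s to cover {lidar_range_m:.0f} m")

# Range implied by a reaction time three times shorter (derived, not measured).
implied_range_m = lidar_range_m / 3
implied_time_s = time_to_reach(implied_range_m, speed_mph)
print(f"One third: {implied_time_s:.1f} s to cover {implied_range_m:.0f} m")
```

Because time scales linearly with range at a fixed speed, tripling the detection range triples the time available to react, which is why long-range sensing matters most at highway speeds.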


As a creative visionary myself, I like to think that though the final form of level 5 urban transportation may differ from our vision today, the sensors and algorithms in the perception stack for vehicles and their environment will have a significant role to play in it. As the Director of ML Curriculum at FourthBrain, I am thrilled to introduce and mentor such relevant project topics with our cohorts going forward.

Cheers to continuing collaborative research initiatives to attain this vision collectively!