Data plays a crucial role in training and improving algorithms in the fast-evolving field of autonomous driving. However, the abundance and complexity of the data involved present significant challenges. In our previous blogs, “Data Challenges in Autonomous Driving and the Need for Data Tags” and “Challenges in Building up the Driving Scene Tags,” we discussed the hurdles faced in the autonomous driving domain in extracting value from large-scale data and highlighted the critical need for data tags. In this blog, we introduce our innovative solution, “DriveTag AI”, which addresses these challenges and allows us to capture valuable and diverse driving scene information.
Maximizing Driving Scene Insights
With DriveTag AI, our primary goal is to maximize scene information extraction and derive meaningful insights from minimal sensor input. By intelligently leveraging various sources of sensor data, we have identified five key areas of interest to focus on when deriving valuable driving scene information. Let us dive into each of these five key areas in detail.
Map Infrastructure Tags
Understanding the surrounding map infrastructure is essential for autonomous vehicles. DriveTag AI incorporates algorithms that analyze GPS sensor data to extract relevant information such as road type (urban, highway), the presence of critical infrastructure (tunnels, roundabouts, intersections), and other essential map features, allowing our system to create accurate and comprehensive map infrastructure tags. The list of map infrastructure classes of interest is quite exhaustive and can be scaled in line with OSM (OpenStreetMap) features. With the map infrastructure tags, users can quickly surface insights about the map classes encountered during data collection drives: how much data was collected in urban, rural, and highway regions, or how densely infrastructure points such as roundabouts and intersections occur along a selected route. Features like ISA (Intelligent Speed Assistance) must be validated and certified over a required test mileage across different road infrastructure types before vehicle homologation and SOP. Thus, an efficient mechanism/framework is required to collect map infrastructure tags during drive collection.
Further use cases are supported, such as building a database that lets developers and testers quickly query and filter down to specific challenging infrastructure data points like highway entries and exits, tunnels, bridges, railway crossings, and motorway intersections. Additionally, we employ an added layer of post-processing checks to eliminate faulty GPS recordings, which would otherwise impact the localization of the infrastructure tags derived from them.
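As a rough illustration of how a GPS fix can be enriched with OSM features, the sketch below queries the public Overpass API around a single coordinate. The search radius, feature list, and tag names are illustrative assumptions, not DriveTag AI's actual class configuration.

```python
# Sketch: deriving map-infrastructure tags for a single GPS fix via the
# public Overpass API (OpenStreetMap). Radius and feature list are
# illustrative choices, not a production configuration.
import requests

OVERPASS_URL = "https://overpass-api.de/api/interpreter"

def map_infrastructure_tags(lat: float, lon: float, radius_m: int = 50) -> set:
    """Return OSM-derived infrastructure tags near a GPS fix."""
    query = f"""
    [out:json][timeout:10];
    (
      way(around:{radius_m},{lat},{lon})["junction"="roundabout"];
      way(around:{radius_m},{lat},{lon})["tunnel"="yes"];
      way(around:{radius_m},{lat},{lon})["bridge"="yes"];
      node(around:{radius_m},{lat},{lon})["railway"="level_crossing"];
      way(around:{radius_m},{lat},{lon})["highway"~"motorway|trunk|residential"];
    );
    out tags;
    """
    elements = requests.post(
        OVERPASS_URL, data={"data": query}, timeout=30
    ).json()["elements"]
    tags = set()
    for el in elements:
        t = el.get("tags", {})
        if t.get("junction") == "roundabout":
            tags.add("roundabout")
        if t.get("tunnel") == "yes":
            tags.add("tunnel")
        if t.get("bridge") == "yes":
            tags.add("bridge")
        if t.get("railway") == "level_crossing":
            tags.add("railway_crossing")
        if "highway" in t:
            tags.add(f"road:{t['highway']}")
    return tags
```

In practice, batch map matching against a local OSM extract would avoid per-point API calls when tagging whole drives.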
Computer Vision Class Tags
The visual perception of the driving environment is crucial for autonomous vehicles. Our solution employs cutting-edge computer vision techniques to analyze camera data and maximize the extraction of valuable insights within the vision scope. By detecting and classifying objects and obstacles and recognizing the different traffic participants in the vehicle's periphery, DriveTag AI generates precise computer vision class tags that enhance the understanding of the driving scene.
The diverse nature of the classes of interest along an AD vehicle's driving path presents enormous challenges as well as opportunities for information extraction. Take traffic signs (stop signs, road works signs, speed limit signs): the perception algorithms' ability to understand them directly influences the AD path planning and motion control systems and, in turn, the drive of the AD vehicle. Likewise, insights that not only flag the presence of traffic lights but also classify their status provide rich information for training perception algorithms. As seen in the earlier blog, many AD test vehicles deployed in the field report system disengagements, with examples such as an AD vehicle failing to stop at a stop sign, failing to proceed when the traffic light has turned green, or, in a recent case, failing to detect a crosswalk and stop.
Let us take the example of a perception algorithm developer who wants to filter a large dataset and curate only the frames matching “a drive scene environment involving a pedestrian under a traffic light with a stop sign and a crosswalk on a sunny, clear day in an urban region” (a combination of six different tags). Such a query can fetch exactly the frames carrying all of these tags and send them on for downstream activities, or fetch the segment of the remaining sensor information leading up to that data point (for example, the GPS, lidar, and radar data corresponding to the queried image or video frames). This not only provides insight into the richness of diverse classes across the collected datasets but also saves costs on data hosting and on transferring low-value datasets for labeling efforts.
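As a minimal sketch of such a combined query, the snippet below filters a frame-level tag table with pandas. The column names and boolean tag encoding are assumptions about how a tag store could be laid out, not DriveTag AI's actual schema.

```python
# Sketch: filtering a frame-level tag table for the six-tag query above.
import pandas as pd

frames = pd.DataFrame({
    "frame_id":      ["f001", "f002", "f003"],
    "pedestrian":    [True,  True,  False],
    "traffic_light": [True,  False, True],
    "stop_sign":     [True,  False, True],
    "crosswalk":     [True,  True,  False],
    "weather":       ["sunny", "rain", "sunny"],
    "map_region":    ["urban", "urban", "highway"],
})

# All six tags must hold for a frame to match the query.
match = (
    frames["pedestrian"] & frames["traffic_light"] & frames["stop_sign"]
    & frames["crosswalk"]
    & (frames["weather"] == "sunny") & (frames["map_region"] == "urban")
)
hits = frames[match]
print(hits["frame_id"].tolist())  # -> ['f001']
```

The matched frame IDs could then key into the rest of the sensor record (GPS, lidar, radar) for the same timestamps.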
The diverse class information that can be gleaned from a driving scene is practically limitless, much as the human mind assimilates visual sensory information while driving: detecting pedestrians, cyclists, other vehicle types, traffic signs of different classes, and visual cues that diverge from the routine drive path, such as construction zones or lane closures, which require the ego vehicle to disengage its AD systems or prepare alternate route planning in real time.
Considering the vast range of class information available and needed for training and validating AD features, we have developed a lean architecture backbone that can be scaled to incorporate new classes from detection, classification, and segmentation vision techniques, and that reliably fetches class information without requiring large-scale processing power to ingest data and produce tags. With our state-of-the-art CV techniques fused with localization insights on the detections, the solution lets users easily query and narrow down interesting combinations of vision classes that surface the challenging and critical cases for AD perception stack training and development.
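One way such a scalable backbone could look is sketched below: a small registry of pluggable taggers, where each new detection, classification, or segmentation model only has to implement a common interface. The class names and registry mechanics are illustrative assumptions, not the actual DriveTag AI API.

```python
# Sketch: a lean, pluggable tagger backbone. New vision models register
# themselves and are run over each frame to contribute tags.
from abc import ABC, abstractmethod

TAGGERS = []

class Tagger(ABC):
    @abstractmethod
    def tag(self, frame) -> set:
        """Return the set of tags this model finds in a frame."""

def register(tagger_cls):
    """Class decorator adding a tagger instance to the global registry."""
    TAGGERS.append(tagger_cls())
    return tagger_cls

@register
class TrafficSignTagger(Tagger):
    def tag(self, frame) -> set:
        # Placeholder for a lightweight sign detector.
        return {"stop_sign"} if frame.get("has_stop_sign") else set()

def tag_frame(frame) -> set:
    """Union the tags produced by every registered tagger."""
    tags = set()
    for tagger in TAGGERS:
        tags |= tagger.tag(frame)
    return tags

print(tag_frame({"has_stop_sign": True}))  # -> {'stop_sign'}
```

Adding a new class then means registering one more tagger rather than touching the ingestion pipeline.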
Event-Based Vehicle Dynamics Tags
The dynamics of vehicles around the AD vehicle play a pivotal role in critical motion control decisions, as the dynamics of actor vehicles influence and constrain the ego vehicle's maneuver options. DriveTag AI leverages data from various sensors, such as the camera, GPS, accelerometers, and gyroscopes, to capture critical vehicle dynamics information. By analyzing event-based data points such as acceleration, braking, and steering behavior, our system generates detailed vehicle dynamics event tags, contributing to a comprehensive understanding of the driving scene.
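As an illustration of how an event tag can be derived from a raw sensor trace, the sketch below flags harsh-braking spans in a longitudinal-acceleration signal. The threshold and minimum duration are placeholder values, not calibrated parameters.

```python
# Sketch: flagging harsh-braking events from a longitudinal-acceleration
# trace. -3.5 m/s^2 and 0.3 s are illustrative, uncalibrated values.
def harsh_braking_events(accel_long, timestamps,
                         threshold=-3.5, min_duration=0.3):
    """Yield (start, end) time spans where deceleration exceeds the threshold."""
    start = None
    for t, a in zip(timestamps, accel_long):
        if a <= threshold and start is None:
            start = t                      # braking span begins
        elif a > threshold and start is not None:
            if t - start >= min_duration:  # keep only sustained events
                yield (start, t)
            start = None

ts = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
ax = [-1.0, -4.0, -4.2, -4.1, -3.9, -1.0, -0.5]
print(list(harsh_braking_events(ax, ts)))  # -> [(0.1, 0.5)]
```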
To learn more about how we derive the events from the drive data, you can follow these blogs. Each vehicle-dynamics behavior tag from the drive scene is associated with a risk score that factors in different risk elements and weights: TTC (time to collision), THW (time headway), the number of traffic participants, and the dynamics of surrounding traffic participants such as velocity, acceleration, and deceleration relative to the ego vehicle (know more about the risk scoring factor on the blog). These tags are geo-spatially localized, which makes it easy to query from event tags to other keywords associated with the events, from vehicle types to speeds, and filter the needed events out of the vast drive data.
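A drastically simplified version of such a weighted score might look like the sketch below; the normalizations and weights are illustrative assumptions, and the actual scoring model is described in the blog referenced above.

```python
# Sketch: a weighted risk score over event features such as TTC, THW,
# traffic density, and surrounding-traffic dynamics. All constants are
# illustrative assumptions, not the production model.
def risk_score(ttc_s, thw_s, n_participants, rel_speed_mps,
               w_ttc=0.4, w_thw=0.3, w_density=0.15, w_dyn=0.15):
    """Combine normalized risk factors into a 0..1 score (higher = riskier)."""
    r_ttc = min(1.0, 3.0 / max(ttc_s, 0.1))      # short time-to-collision
    r_thw = min(1.0, 1.5 / max(thw_s, 0.1))      # short time-headway
    r_density = min(1.0, n_participants / 10.0)  # crowded scene
    r_dyn = min(1.0, abs(rel_speed_mps) / 15.0)  # large relative speed
    return (w_ttc * r_ttc + w_thw * r_thw
            + w_density * r_density + w_dyn * r_dyn)

print(round(risk_score(ttc_s=1.2, thw_s=0.8,
                       n_participants=6, rel_speed_mps=8.0), 2))  # -> 0.87
```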
Weather Tags
Weather conditions greatly impact driving scenarios and vehicle behavior. DriveTag AI incorporates weather data, such as temperature, humidity, and precipitation, to create accurate weather tags. These tags expose the diversity of the collected data density and help in efficiently planning future data collection for the required geographic regions. Weather and lighting conditions strongly influence perception algorithm performance and impact the path planning and motion control behavior of AD vehicles. As seen in the earlier blog, with disengagements and fatal collisions reported under challenging lighting conditions such as night drives, it is important that data collection is balanced across different weather and lighting conditions and that a measure of this balance is available across the datasets curated for perception algorithm development. Our weather tags are aligned with the OWM (Open Weather Map) solution scope, with the grading and segmentation of different weather conditions and parameters available for a given geographic location at the time of the drive.
This easy and reliable way of obtaining weather tags provides an efficient mechanism for attaching them to vehicle collection data, gathering weather diversity information, and planning future drives around gaps in the weather coverage of the data.
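A minimal sketch of such a weather lookup is shown below. It uses OWM's public current-weather endpoint; tagging recorded drives by their timestamp would instead go through OWM's historical endpoints, and the API key and tag mapping here are placeholders.

```python
# Sketch: attaching an OWM-derived weather tag to a drive sample using the
# public current-weather endpoint. API key and tag mapping are placeholders.
import requests

OWM_URL = "https://api.openweathermap.org/data/2.5/weather"

def weather_tags(lat: float, lon: float, api_key: str) -> dict:
    """Fetch a compact weather-tag dict for a location."""
    resp = requests.get(
        OWM_URL,
        params={"lat": lat, "lon": lon, "appid": api_key, "units": "metric"},
        timeout=10,
    ).json()
    return {
        "condition": resp["weather"][0]["main"],  # e.g. "Rain", "Clear"
        "temperature_c": resp["main"]["temp"],
        "humidity_pct": resp["main"]["humidity"],
    }

# tags = weather_tags(48.137, 11.575, api_key="YOUR_OWM_KEY")
```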
Contextual Drive Scene Tags
Context is key to understanding complex driving scenarios. DriveTag AI integrates a novel approach to generating contextual information about the drive scene, providing a comprehensive understanding of it. By incorporating the visual, temporal, and spatial aspects observed in the driving environment, our system generates contextual drive scene tags that enrich the captured data, enabling more robust analysis and learning.
Consider how the human brain comprehends information while driving: it observes various elements in the surrounding environment and forms a contextual insight before any decision is put into action. Take the same list of classes observed through vision: pedestrian, drivable road surface, and the car's drive path/lane. On their own, these three classes provide limited context for driving decisions. But, as with the human brain, the visual cues from all of them can be put together to form a context such as “a pedestrian is present on the drivable road surface but is not in the drive path of the car, hence the decision is to continue driving along the drive path,” or “a pedestrian is present on the drivable road surface and is in the drive path of the car, obstructing its motion, hence the car must brake and halt so as not to hit the pedestrian.”
The above example demonstrates that classes of tags without any context relating them to each other provide no meaningful insight into the complexity and criticality of an event. Thus, developing AD features with critical and challenging datasets needs a way to apply contextual information to datasets, yielding meaningful data for training algorithms and building systems.
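A toy version of such contextual fusion, following the pedestrian example above, might look like the sketch below; the detection structure and the drive-path overlap flags are simplified assumptions.

```python
# Sketch: fusing per-frame class tags and geometry into one contextual tag,
# mirroring the pedestrian example in the text. Inputs are simplified:
# each detection carries a class name plus precomputed overlap flags.
def contextual_tag(detections: list) -> str:
    """Derive a drive-scene context tag from per-frame detections."""
    peds = [d for d in detections
            if d["cls"] == "pedestrian" and d["on_road"]]
    if any(d["in_drive_path"] for d in peds):
        return "pedestrian_obstructing_drive_path"  # brake and halt
    if peds:
        return "pedestrian_on_road_clear_path"      # continue along path
    return "clear_scene"

frame = [
    {"cls": "pedestrian", "on_road": True, "in_drive_path": True},
    {"cls": "vehicle", "on_road": True, "in_drive_path": False},
]
print(contextual_tag(frame))  # -> pedestrian_obstructing_drive_path
```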
With recent developments in the field of advanced AI, and specifically in the deployment of CV techniques, we are now closer to gathering contextual information from vision data and associating meaningful contextual tags with the AD drive scene.
We are excited about the results seen along this development path and are curious about the next releases in the solution pipeline. Does this sound like something of interest, or a solution your data needs could use? We would be happy to hear your feedback!
Conclusion
Simply stated, no one solution fits all data tagging problems. That said, with the DriveTag AI solution our goal remains to maximize the extraction of diverse data points from minimal sensor information. There are limits on how much information can be extracted from video, GPS, and timestamp data alone, so we also look to the future to scale the solution's scope by integrating other sensor information from the vehicle. Vehicle CAN bus data, for instance, offers another set of vehicle dynamics information that can be tagged and explored.
To recap from our previous blogs, “Data Challenges in Autonomous Driving and the Need for Data Tags” and “Challenges in Building up the Driving Scene Tags,” we delved into the challenges faced in the autonomous driving domain and emphasized the critical need for data tags to capture valuable information from diverse driving scenes. DriveTag AI, our innovative solution, tackles these challenges head-on, maximizing driving scene insights with minimal information.
By focusing on five key areas—map infrastructure tags, computer vision class tags, event-based vehicle dynamics tags, weather tags, and contextual drive scene tags—DriveTag AI offers a comprehensive approach to understanding the driving environment.
We value your thoughts and feedback on DriveTag AI and its potential impact on the autonomous driving industry. As we continue to refine and innovate our solution, we look forward to creating a safer, more efficient, and reliable autonomous driving experience for all. Join us in shaping the future of autonomous vehicles and let’s drive towards a better tomorrow.