Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation

Dense depth estimation is essential to scene understanding for autonomous driving. However, recent self-supervised approaches trained on monocular videos suffer from scale inconsistency across long sequences. Using data from global positioning systems (GPS), which are almost always available alongside the camera, we tackle this challenge by proposing a dynamically weighted GPS-to-Scale (g2s) loss to complement the appearance-based losses. We emphasize that GPS is needed only during multimodal training, not at inference. The relative distance between frames, measured through GPS, provides a scale signal that is independent of the camera setup and scene distribution, resulting in richer learned feature representations. Through extensive evaluation on multiple datasets, we demonstrate scale-consistent and scale-aware depth estimation at inference, with improved performance even when training with low-frequency GPS data.
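
To make the idea of a GPS-derived scale signal concrete, here is a minimal sketch of how such a loss term could be wired up. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the loss formula or the weighting schedule, so the squared-difference form, the haversine distance, the exponential ramp, and all function names (`gps_distance_m`, `g2s_loss`, `g2s_weight`, `total_loss`) are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def gps_distance_m(fix_a, fix_b):
    """Great-circle (haversine) distance in metres between two (lat, lon) fixes in degrees.

    Assumption: haversine stands in for whatever GPS-to-metric conversion is actually used.
    """
    lat1, lon1 = map(math.radians, fix_a)
    lat2, lon2 = map(math.radians, fix_b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def g2s_loss(pred_translation, fix_a, fix_b):
    """Squared gap between the predicted inter-frame translation magnitude and the GPS distance.

    Assumption: pred_translation is the (x, y, z) translation a pose network predicts
    between the two frames; the squared-difference form is a guess.
    """
    pred_norm = math.sqrt(sum(c * c for c in pred_translation))
    return (pred_norm - gps_distance_m(fix_a, fix_b)) ** 2


def g2s_weight(epoch, ramp=0.5):
    """Hypothetical dynamic weight: 0 at epoch 0, approaching 1 as training proceeds."""
    return 1.0 - math.exp(-ramp * epoch)


def total_loss(appearance_loss, pred_translation, fix_a, fix_b, epoch):
    """Appearance-based loss complemented by the dynamically weighted scale term."""
    return appearance_loss + g2s_weight(epoch) * g2s_loss(pred_translation, fix_a, fix_b)
```

Because the scale term depends only on the predicted translation and the GPS fixes, it is evaluated only during training; at inference the depth network runs on images alone, consistent with the claim above that GPS is not needed at inference.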
