3D perception is an essential task for autonomous driving, so building models that are accurate, computationally efficient, fast, and label-efficient is of great interest. My research is aimed at building detection models in the label-efficient (semi-/self-supervised) and offline (auto-labeling) settings, both of which have been under-explored in the literature. To improve 3D detection in these settings, I leverage sensor fusion (especially camera and LiDAR) and temporal fusion. In the offline setting, detection in a particular frame can be greatly enhanced by drawing on the entire video sequence: a) the sequence provides different geometric views of each object of interest, allowing it to be characterized more completely, and b) tracked object trajectories aid localization. Meanwhile, camera-LiDAR sensor fusion allows objects to be tracked across both modalities, so that occluded or distant objects can be associated through time to further improve detection.
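As a concrete illustration of the temporal-fusion idea, the sketch below aggregates LiDAR sweeps from a whole sequence into the coordinate frame of a single target frame using ego poses, which is one common way to obtain multiple geometric views of an object when auto-labeling a frame offline. This is a minimal, assumed sketch rather than the project's actual pipeline; the pose format (4x4 world-from-ego matrices) and function names are my own for illustration.

```python
import numpy as np


def transform_points(points: np.ndarray, tf: np.ndarray) -> np.ndarray:
    """Apply a 4x4 homogeneous transform to an (N, 3) point array."""
    homog = np.hstack([points, np.ones((points.shape[0], 1))])
    return (homog @ tf.T)[:, :3]


def aggregate_sweeps(sweeps, poses, target_idx):
    """Map every LiDAR sweep into the target frame's ego coordinates.

    sweeps:     list of (N_i, 3) LiDAR point arrays, one per frame
    poses:      list of 4x4 world-from-ego transforms, one per frame
    target_idx: index of the frame being auto-labeled
    """
    target_from_world = np.linalg.inv(poses[target_idx])
    fused = []
    for pts, world_from_ego in zip(sweeps, poses):
        # ego_i -> world -> target ego: accumulates multiple geometric
        # views of static structure in a single coordinate frame.
        target_from_ego = target_from_world @ world_from_ego
        fused.append(transform_points(pts, target_from_ego))
    return np.vstack(fused)
```

Note that naive aggregation like this only sharpens static structure; moving objects additionally require tracked trajectories (point b above) so that their points can be associated and aligned across time.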
Project currently funded by: Federal