*Equal contribution, the order of authorship is interchangeable
Pre-explored semantic maps, constructed through prior exploration using visual language models (VLMs), have proven effective as a foundation for training-free robotic applications. However, existing approaches assume the map is accurate and provide no effective mechanism for revising decisions that were based on an incorrect map. This work introduces Context-Aware Replanning (CARe), which estimates map uncertainty through confidence scores and multi-view consistency, enabling the agent to revise erroneous decisions stemming from inaccurate maps without additional labels. We demonstrate the effectiveness of our proposed method using two modern map backbones, VLMaps and OpenMask3D, and show significant improvements in performance on object navigation tasks.
We demonstrate a robot navigating in a simulated environment using the CARe framework. Equipped with a camera and a map of the environment, the robot is tasked with finding a target object.
We focus on a scenario where the first attempt fails. We show that a naive approach, which selects the candidate with the second-highest score, also fails to find the target object. In contrast, our proposed method, CARe, succeeds.
The video below compares navigation results using different strategies for selecting the navigation target from the map. The first column shows the initial attempt, the second column shows the second attempt with the naive strategy, and the third, fourth, and fifth columns display the second attempt using our proposed method. Our method leverages map uncertainty metrics such as classification entropy, channel average standard error, and mean pairwise KL divergence.
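As a rough illustration of the three uncertainty metrics named above, the sketch below computes them for a single map candidate observed from several views. The exact definitions used by CARe are not given on this page, so everything here is an assumption: `view_probs` is a hypothetical array of per-view class probabilities with shape (views, classes), entropy is taken over the view-averaged distribution, the standard error is computed per class channel across views and then averaged, and the KL divergence is averaged over all ordered pairs of views.

```python
import numpy as np

EPS = 1e-12  # small constant to avoid log(0); an implementation choice, not from the source


def classification_entropy(view_probs: np.ndarray) -> float:
    """Shannon entropy of the view-averaged class distribution (assumed definition)."""
    p = view_probs.mean(axis=0)
    return float(-(p * np.log(p + EPS)).sum())


def channel_avg_standard_error(view_probs: np.ndarray) -> float:
    """Standard error across views for each class channel, averaged over channels.

    Requires at least two views because the sample standard deviation uses ddof=1.
    """
    n_views = view_probs.shape[0]
    std_err = view_probs.std(axis=0, ddof=1) / np.sqrt(n_views)
    return float(std_err.mean())


def mean_pairwise_kl(view_probs: np.ndarray) -> float:
    """Mean KL divergence over all ordered pairs of distinct views (assumed definition)."""
    n_views = view_probs.shape[0]
    log_p = np.log(view_probs + EPS)
    total = 0.0
    for i in range(n_views):
        for j in range(n_views):
            if i != j:
                total += float((view_probs[i] * (log_p[i] - log_p[j])).sum())
    return total / (n_views * (n_views - 1))


if __name__ == "__main__":
    # Hypothetical candidate seen from three views over four classes.
    probs = np.array([
        [0.70, 0.10, 0.10, 0.10],
        [0.60, 0.20, 0.10, 0.10],
        [0.20, 0.60, 0.10, 0.10],
    ])
    print("entropy:", classification_entropy(probs))
    print("channel avg. standard error:", channel_avg_standard_error(probs))
    print("mean pairwise KL:", mean_pairwise_kl(probs))
```

Under these assumed definitions, views that agree perfectly give a standard error and a pairwise KL divergence of zero, which is why both are read as multi-view consistency measures.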
Task: Find bed.
We provide further explanation of the different candidate selection strategies and the corresponding navigation results below.
The first attempt fails. The robot stops at a chair-like object instead of a bed.
Intuitively, the robot should then pick the candidate with the second-highest score. However, in this case the robot finds another chair-like object. One possible reason is persistent bias in the pretrained model.
We propose Context-Aware Replanning (CARe) to address this problem and improve navigation performance by leveraging map uncertainty measures such as classification entropy and multi-view consistency. Our method avoids the persistent model bias and successfully finds the bed on the second attempt.
Leveraged map uncertainty: classification entropy, the higher the better
Leveraged map uncertainty: channel average standard error, the lower the better
Leveraged map uncertainty: mean pairwise KL divergence, the lower the better
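The difference between the naive second attempt and an uncertainty-based second attempt can be sketched as follows. This is a simplified reading rather than the CARe implementation: the `Candidate` class, its field names, and the rule of ranking the remaining candidates by a single uncertainty value are all hypothetical, and the numbers are invented purely to mirror the "find bed" scenario described above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str           # hypothetical label for a map region
    score: float        # map similarity score for the target query
    uncertainty: float  # one uncertainty metric, e.g. mean pairwise KL divergence


def naive_second_attempt(candidates, failed):
    """Pick the highest-scoring candidate that has not already failed."""
    remaining = [c for c in candidates if c.name not in failed]
    return max(remaining, key=lambda c: c.score)


def uncertainty_second_attempt(candidates, failed, higher_is_better):
    """Pick the remaining candidate with the most favorable uncertainty value.

    higher_is_better=True matches the entropy caption above; False matches the
    standard error and KL divergence captions.
    """
    remaining = [c for c in candidates if c.name not in failed]
    pick = max if higher_is_better else min
    return pick(remaining, key=lambda c: c.uncertainty)


if __name__ == "__main__":
    # Invented values: two chair-like regions score highest for "bed".
    candidates = [
        Candidate("chair_like_a", score=0.91, uncertainty=0.40),
        Candidate("chair_like_b", score=0.88, uncertainty=0.35),
        Candidate("bed", score=0.80, uncertainty=0.10),
    ]
    failed = {"chair_like_a"}  # the first attempt stopped here
    print(naive_second_attempt(candidates, failed).name)
    print(uncertainty_second_attempt(candidates, failed, higher_is_better=False).name)
```

With these invented values, score ranking moves to the second chair-like region, while ranking by a lower-is-better uncertainty value selects the bed.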