IEEE Transactions on Neural Networks and Learning Systems

Unseen From Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation.

Ziming Wei, Bingqian Lin, Yunshuang Nie, Jiaqi Chen, Shikui Ma, Hang Xu, Xiaodan Liang

Published: 2025
DOI: 10.1109/TNNLS.2025.3624691

Abstract

Data scarcity is a long-standing challenge in vision-language navigation (VLN), and it severely hinders the generalization of agents to unseen environments. Previous works primarily rely on additional simulator data or web-collected images/…
