IEEE transactions on neural networks and learning systems
Unseen From Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation.
Ziming Wei, Bingqian Lin, Yunshuang Nie, Jiaqi Chen, Shikui Ma, Hang Xu, Xiaodan Liang
Published: 2025
DOI: 10.1109/TNNLS.2025.3624691
Abstract
Data scarcity is a long-standing challenge in vision-language navigation (VLN) that severely hinders the generalization of agents to unseen environments. Previous works primarily rely on additional simulator data or web-collected images/…