Neural networks : the official journal of the International Neural Network Society
RCVQA: Visual question answering model based on reading comprehension.
Deguang Chen, Jianrui Chen, Zhongshi Shao, Maoguo Gong
Published: 2025
DOI: 10.1016/j.neunet.2025.108365
Abstract
Visual Question Answering (VQA) combines computer vision and natural language processing to answer image-related questions. Current models have three shortcomings: (1) Their limited interaction with other established fields hinders deep semantic anal…