论文详情
- 英文标题
- Rethinking Radiology Report Generation: From Narrative Flow to Topic-Guided Findings
- 作者
- Sheng Cheng, Devika Subramanian
- 期刊/会议
- ICLR 2026 Poster
- 发表年份
- 2026 年
- 研究方向
- clinical NLP
ICLR 2026 Poster accepted paper at ICLR 2026. Vision-Language Models (VLMs) for radiology report generation are typically trained to mimic the narrative flow of human experts. However, we identify a potential limitation in this conventional paradigm. We hypothesize that optimizing for narrative coherence encourages models to rely on linguistic priors and inter-sentence correlations, which can weaken their grounding in direct visual evidence and lead to factual inaccuracies. To investigate this, we design a controlled experiment demonstrating that as textual context increases, a model's reliance on the input image systematically decays. We propose LLaVA-TA (Topic-guided and Anatomy-aware), a new fine-tuning framework that directly addresses this challenge by re-engineering the generation process.
