Paper · ICLR 2026 Poster · 2026 · clinical NLP

Boosting Medical Visual Understanding From Multi-Granular Language Learning

Accepted at ICLR 2026 as a poster. Recent advances in image-text pretraining have significantly enhanced visual understanding by aligning visual and textual representations. Contrastive Language-Image Pretraining (CLIP) has played a pivotal role in multimodal learning. However, its focus on single-label, single-granularity alignment limits its effectiveness in complex domains such as medical imaging, where images often correspond to multiple labels across different levels of granularity. To address this, we propose Multi-Granular Language Learning (MGLL), a contrastive learning framework designed to improve both multi-label and cross-granularity alignment. Code/project link: https://github.com/HUANGLIZI/MGLL
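To make the single-label limitation concrete, here is a minimal NumPy sketch, not the MGLL objective itself. `clip_loss` is the standard symmetric CLIP-style loss, in which each image has exactly one matching text. `multi_positive_loss` is a hypothetical illustration of how one image could be aligned with several texts by spreading the target over all of its positives. The function names, the `positives` matrix, and the temperature value are assumptions made for this example; the authors' actual formulation is in the linked repository.

```python
import numpy as np

def log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def normalize(x):
    # L2-normalize each row so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def clip_loss(img, txt, temperature=0.07):
    """CLIP-style symmetric loss: image i matches text i and nothing else."""
    logits = normalize(img) @ normalize(txt).T / temperature
    idx = np.arange(len(img))
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)

def multi_positive_loss(img, txt, positives, temperature=0.07):
    """Illustrative image-to-text loss with several positives per image.

    positives is an (n_images, n_texts) 0/1 matrix; every row needs
    at least one 1. This is an assumption for illustration only.
    """
    logits = normalize(img) @ normalize(txt).T / temperature
    targets = positives / positives.sum(axis=1, keepdims=True)
    return -(targets * log_softmax(logits, axis=1)).sum(axis=1).mean()

# Toy usage: 4 images, 6 text labels, each image tagged with 1 to 2 labels.
rng = np.random.default_rng(0)
images = rng.normal(size=(4, 8))
texts = rng.normal(size=(6, 8))
positives = np.array([
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1],
], dtype=float)
loss = multi_positive_loss(images, texts, positives)
```

With an identity `positives` matrix the second function reduces to the image-to-text half of `clip_loss`, which shows that the single-label case is a special case of the multi-label target.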

Medical Image Computing · Medical Multimodal · Clinical Language Intelligence Papers · Multi-Granular Language Learning · Medical Image Analysis · Multimodal Learning · ICLR 2026 · ICLR 2026 Poster

Paper Details

Title: Boosting Medical Visual Understanding From Multi-Granular Language Learning
Authors: Zihan Li, Yiqing Wang, Sina Farsiu, Paul Kinahan
Journal/Conference: ICLR 2026 Poster
Publication year: 2026
Research area: clinical NLP