Paper details
- English title
- Boosting Medical Visual Understanding From Multi-Granular Language Learning
- Authors
- Zihan Li, Yiqing Wang, Sina Farsiu, Paul Kinahan
- Journal/Conference
- ICLR 2026 Poster
- Publication year
- 2026
- Research area
- clinical NLP
Accepted as a poster at ICLR 2026. Recent advances in image-text pretraining have significantly enhanced visual understanding by aligning visual and textual representations. Contrastive Language-Image Pretraining (CLIP) has played a pivotal role in multimodal learning. However, its focus on single-label, single-granularity alignment limits its effectiveness in complex domains such as medical imaging, where images often correspond to multiple labels across different levels of granularity. To address this, we propose Multi-Granular Language Learning (MGLL), a contrastive learning framework designed to improve both multi-label and cross-granularity alignment. Code/project link: https://github.com/HUANGLIZI/MGLL
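To make the single-label versus multi-label distinction in the abstract concrete, the following is a minimal sketch in plain Python. It is not the MGLL objective: the abstract does not specify the loss, and nothing here is taken from the linked repository. The function names, the temperature value, the toy similarity matrix, and the choice of a uniform target over all matching texts are assumptions made for illustration only.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over a list of floats."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def clip_style_loss(sim, temperature=0.07):
    """Single-positive image-to-text contrastive loss.

    sim[i][j] is the similarity between image i and text j.
    Image i is assumed to match text i only (one label, one granularity).
    """
    total = 0.0
    for i, row in enumerate(sim):
        logp = log_softmax([s / temperature for s in row])
        total -= logp[i]
    return total / len(sim)


def multi_positive_loss(sim, positives, temperature=0.07):
    """Illustrative multi-label variant (an assumption, not the paper's loss).

    positives[i] is the set of text indices that describe image i, for
    example a coarse label and a fine-grained label. The target is a
    uniform distribution over those indices.
    """
    total = 0.0
    for i, row in enumerate(sim):
        logp = log_softmax([s / temperature for s in row])
        total -= sum(logp[j] for j in positives[i]) / len(positives[i])
    return total / len(sim)


# Hypothetical toy data: 2 images, 3 text descriptions.
sim = [
    [0.9, 0.1, 0.8],  # image 0 is close to texts 0 and 2
    [0.2, 0.7, 0.1],  # image 1 is close to text 1
]
positives = [{0, 2}, {1}]

print("single-positive loss:", clip_style_loss(sim))
print("multi-positive loss: ", multi_positive_loss(sim, positives))
```

When every image has exactly one positive (`positives[i] == {i}`), the second function reduces to the first, which is one way to see the single-label setting as a special case of the multi-label one.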
