AI4Meder
返回论文列表
论文ICLR 2026 Poster2026 年clinical NLP

迈向医学图像分割中的文本-掩膜一致性

ICLR 2026 Poster accepted paper at ICLR 2026. Vision-language models for medical image segmentation often produce masks that conflict with the accompanying text, especially under multi-site/multi-lesion descriptions. We trace this failure to two factors: (i) highly templated and repetitive clinical language causes one-to-one hard contrastive learning to yield numerous false negatives, weakening cross-modal alignment; and (ii) predominantly vision-driven, one-way cross-attention lacks a language-dominant, spatially aware pathway, hindering effective injection of textual semantics into the spatial visual domain. To this end, we propose Consistency-enhanced Two-stage Segmentation (C2Seg). In the pretraining stage, Cluster-aware Contrastive Learning uses a frozen strong baseline to construct an intra-batch text similarity matrix as soft labels, thereby alleviating false negative conflicts and producing more discriminative visual representations.

论文默认配图 - 医学影像计算

论文详情

英文标题
Towards Text-Mask Consistency in Medical Image Segmentation
作者
Jie Gui, HangTu, Wen Sha, Xiuquan Du
期刊/会议
ICLR 2026 Poster
发表年份
2026 年
研究方向
clinical NLP