Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI | Dyslexify: A Mechanistic Defense Against Typographic Attacks in CLIP
Accepted as a poster at ICLR 2026. Typographic attacks exploit multi-modal systems by injecting text into images, leading to targeted misclassifications, malicious content generation, and even Vision-Language Model jailbreaks. In this work, we analyze how CLIP vision encoders behave under typographic attacks, locating specialized attention heads in the latter half of the model's layers that causally extract and transmit typographic information to the CLS token. Building on these insights, we introduce Dyslexify, a method that defends CLIP models against typographic attacks by selectively ablating a typographic circuit consisting of attention heads. Without requiring finetuning, Dyslexify improves performance by up to 22.06% on a typographic variant of ImageNet-100 while reducing standard ImageNet-100 accuracy by less than 1%. We also demonstrate its utility in a medical foundation model for skin lesion diagnosis.
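To make the idea of head ablation concrete, here is a minimal sketch in plain Python. It assumes that a layer's attention output is the sum of per-head contributions and that ablating a head means zeroing what it writes to the CLS token (taken to be token index 0). The function name `ablate_heads`, the zero-ablation choice, and the toy numbers are illustrative assumptions, not the paper's actual procedure or head selection.

```python
def ablate_heads(head_outputs, heads_to_ablate, token_index=0):
    """Sum per-head contributions, skipping selected heads at one token.

    head_outputs[h][t] is the vector (list of floats) that head h writes
    to token t. Zero-ablation at the CLS position is an assumption made
    for illustration only.
    """
    num_tokens = len(head_outputs[0])
    dim = len(head_outputs[0][0])
    ablate = set(heads_to_ablate)
    combined = [[0.0] * dim for _ in range(num_tokens)]
    for h, per_token in enumerate(head_outputs):
        for t, vec in enumerate(per_token):
            if h in ablate and t == token_index:
                continue  # drop this head's write to the CLS token
            for d, value in enumerate(vec):
                combined[t][d] += value
    return combined


# Toy example: 3 heads, 2 tokens (CLS + one patch), 2-dimensional outputs.
toy_heads = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.0, 4.0], [1.0, 1.0]],  # pretend this head carries typographic signal
    [[2.0, 2.0], [0.0, 1.0]],
]
full_output = ablate_heads(toy_heads, heads_to_ablate=[])
defended_output = ablate_heads(toy_heads, heads_to_ablate=[1])
```

Only the CLS row changes between `full_output` and `defended_output`; the other tokens are left untouched, which reflects the goal of removing typographic information from the CLS token while leaving the rest of the computation intact.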
Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI | COMPASS: Robust Feature Conformal Prediction for Medical Segmentation Metrics
Accepted as a poster at ICLR 2026. In clinical applications, the utility of segmentation models is often judged by the accuracy of derived downstream metrics such as organ size, rather than by the pixel-level accuracy of the segmentation masks themselves. Thus, uncertainty quantification for such metrics is crucial for decision-making. Conformal prediction (CP) is a popular framework for deriving such principled uncertainty guarantees, but applying CP naively to the final scalar metric is inefficient because it treats the complex, non-linear segmentation-to-metric pipeline as a black box. We introduce COMPASS, a practical framework that generates efficient, metric-based CP intervals for image segmentation models by leveraging the inductive biases of their underlying deep neural networks.
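For context, the naive baseline mentioned above (CP applied directly to the final scalar metric) can be sketched as standard split conformal prediction with absolute-residual scores. This is not COMPASS itself. The function name `split_conformal_interval` and the organ-volume numbers are hypothetical and serve only to illustrate the black-box baseline.

```python
import math


def split_conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    """Split conformal interval for a scalar metric (e.g. organ volume).

    Uses absolute residuals on a held-out calibration set as
    nonconformity scores and returns (lower, upper) for the test
    prediction with target coverage 1 - alpha.
    """
    scores = sorted(abs(t - p) for p, t in zip(cal_pred, cal_true))
    n = len(scores)
    # Finite-sample corrected rank of the calibration quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: interval is unbounded.
        return -math.inf, math.inf
    q = scores[k - 1]
    return test_pred - q, test_pred + q


# Hypothetical calibration data: predicted vs. reference volumes (mL).
cal_pred = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0]
cal_true = [11.0, 18.0, 33.0, 36.0, 55.0, 54.0, 77.0, 72.0, 99.0]
lower, upper = split_conformal_interval(cal_pred, cal_true, 100.0, alpha=0.5)
```

The interval width here depends only on the calibration residuals of the final metric, which is why this baseline ignores any structure inside the segmentation network.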
Data resource | dermoscopic and clinical skin lesion images | dermatology image archive | Large public ISIC dermatology image archive | Open access | ISIC Archive dermatology image dataset
The ISIC Archive is a large public dermatology image repository for skin lesion analysis. It is widely used for melanoma classification, lesion segmentation, dermoscopic image retrieval, bias and domain shift analysis, and clinical imaging benchmark development.