Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Bridging Explainability and Embeddings: Helping BEE Recognize Spurious Correlations

Accepted as a poster at ICLR 2026. Current methods for detecting spurious correlations rely on data splits or error patterns, leaving many harmful shortcuts invisible when counterexamples are absent. We introduce BEE (Bridging Explainability and Embeddings), a framework that shifts the focus from model predictions to the weight space and embedding geometry underlying decisions. By analyzing how fine-tuning perturbs pretrained representations, BEE uncovers spurious correlations that remain hidden from conventional evaluation pipelines. We use linear probing as a transparent diagnostic lens, revealing spurious features that not only persist after full fine-tuning but also transfer across diverse state-of-the-art models. Code/project link: https://github.com/bit-ml/bee
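To make the idea of reading spurious features off a linear probe more concrete, here is a minimal sketch on synthetic data. It is not the authors' implementation and does not use the code in the linked repository. Everything in it is an assumption made for illustration: the helper names `train_linear_probe` and `rank_concepts_by_shift`, the toy concept directions ("shape", "background", "texture"), the use of basis vectors as stand-ins for concept embeddings, and the choice to score a concept by how far a class weight moved toward it during training.

```python
import numpy as np

def train_linear_probe(X, y, W0, lr=0.5, steps=300):
    """Softmax-regression probe on frozen embeddings X, started from W0.

    Hypothetical helper: plain full-batch gradient descent, no regularization.
    """
    W = W0.copy()
    onehot = np.eye(W0.shape[0])[y]
    for _ in range(steps):
        logits = X @ W.T
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot).T @ X / len(X)       # cross-entropy gradient
        W -= lr * grad
    return W

def rank_concepts_by_shift(W0, W, concepts, names, cls):
    """Rank concepts by how far the weight of class `cls` moved toward them.

    Hypothetical scoring rule: project the weight change onto each
    unit-normalized concept direction and sort in descending order.
    """
    delta = W[cls] - W0[cls]
    unit = concepts / np.linalg.norm(concepts, axis=1, keepdims=True)
    scores = unit @ delta
    order = np.argsort(-scores)
    return [(names[i], float(scores[i])) for i in order]

# Toy setup (all values are made up): 8-dimensional "embeddings" in which
# the label is weakly tied to a core direction ("shape") and strongly tied
# to a shortcut direction ("background"). "texture" is unrelated noise.
rng = np.random.default_rng(0)
dim, n = 8, 200
names = ["shape", "background", "texture"]
concepts = np.eye(dim)[:3]                 # stand-ins for concept embeddings
y = rng.integers(0, 2, size=n)
sign = 2.0 * y - 1.0                       # -1 for class 0, +1 for class 1
signal = 0.5 * concepts[0] + 2.0 * concepts[1]
X = sign[:, None] * signal + 0.1 * rng.normal(size=(n, dim))

# Initial weights point only along the core direction, standing in for a
# pretrained starting point before the probe is trained.
W0 = np.stack([-concepts[0], concepts[0]])
W = train_linear_probe(X, y, W0)

ranking = rank_concepts_by_shift(W0, W, concepts, names, cls=1)
print(ranking)
```

In this toy setting the probe's class weight drifts mostly toward the shortcut direction, so "background" ranks first even though no counterexamples exist in the data to expose it through prediction errors.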

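Medical multimodal · Clinical language intelligence · Trustworthiness, safety, fairness, and privacy · Paper · spurious correlation · interpretability · clip · foundation models · ICLR 2026 · ICLR 2026 Poster · remaining batch · clinical_translation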

Paper details

English title: Bridging Explainability and Embeddings: BEE Aware of Spuriousness
Authors: Cristian Daniel Paduraru, Antonio Barbalau, Radu Filipescu, Andrei Liviu Nicolicioiu, Elena Burceanu
Journal/conference: ICLR 2026 Poster
Publication year: 2026
Research area: trustworthy medical AI