AI4Meder
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Medical Interpretability and Knowledge Maps of Large Language Models

Accepted as a poster at ICLR 2026. We present a systematic study of medical-domain interpretability in Large Language Models (LLMs). We study how LLMs represent and process medical knowledge using four interpretability techniques: (1) UMAP projections of intermediate activations, (2) gradient-based saliency with respect to the model weights, (3) layer lesioning/removal, and (4) activation patching. We present knowledge maps of five LLMs that show, at a coarse resolution, where knowledge about patients' ages, medical symptoms, diseases, and drugs is stored in the models. In particular, for Llama3.3-70B, we find that most medical knowledge is processed in the first half of the model's layers.
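
To make technique (3) concrete, here is a minimal sketch of layer lesioning on a toy residual network in pure Python. It is not the paper's code: the model, the function names (`make_toy_model`, `forward`, `lesion_scores`), and the distance-based score are all assumptions chosen for illustration. The idea is to skip one layer at a time and measure how far the output moves from the unlesioned baseline; layers whose removal moves the output most are the ones the computation depends on most.

```python
import math
import random


def make_toy_model(n_layers, width, seed=0):
    """Hypothetical toy model: one random width x width weight matrix per layer."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 0.5) for _ in range(width)] for _ in range(width)]
        for _ in range(n_layers)
    ]


def forward(layers, x, skip=frozenset()):
    """Residual forward pass: h <- h + tanh(h @ W) for every layer not in `skip`."""
    h = list(x)
    for idx, W in enumerate(layers):
        if idx in skip:
            continue  # lesioned layer: the residual stream passes through unchanged
        update = [
            math.tanh(sum(h[i] * W[i][j] for i in range(len(h))))
            for j in range(len(W[0]))
        ]
        h = [a + b for a, b in zip(h, update)]
    return h


def lesion_scores(layers, x):
    """Euclidean distance between the baseline output and each single-layer lesion."""
    baseline = forward(layers, x)
    return [
        math.dist(baseline, forward(layers, x, skip={idx}))
        for idx in range(len(layers))
    ]


if __name__ == "__main__":
    model = make_toy_model(n_layers=6, width=4)
    scores = lesion_scores(model, [1.0, -0.5, 0.25, 0.0])
    for idx, score in enumerate(scores):
        print(f"layer {idx}: output shift {score:.3f}")
```

In a real LLM the same loop would skip transformer blocks and score the change in a task metric (for example, accuracy on medical prompts) rather than a raw output distance; that substitution is an assumption here, not a detail taken from the abstract.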


Paper Details

English Title
Medical Interpretability and Knowledge Maps of Large Language Models
Authors
Razvan Marinescu, Victoria-Elisabeth Gruber, Diego Fajardo V.
Journal/Conference
ICLR 2026 Poster
Publication Year
2026
Research Area
trustworthy medical AI