Site Search - AI4Meder

Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI

Cancer-Myth: Evaluating Large Language Models on Patient Questions with False Presuppositions

Accepted as a poster at ICLR 2026. Cancer patients are increasingly turning to large language models (LLMs) for medical information, making it critical to assess how well these models handle complex, personalized questions. However, current medical benchmarks focus on medical exams or consumer-searched questions and do not evaluate LLMs on real patient questions that include patient details. In this paper, we first have three hematology-oncology physicians evaluate LLM responses to cancer-related questions drawn from real patients. While the responses are generally accurate, the models frequently fail to recognize or address false presuppositions in the questions, posing risks to safe medical decision-making.

Medical Image Computing | Clinical Language Intelligence | Trustworthiness, Safety, Fairness, and Privacy | Paper | Medical benchmark | LLM evaluation

