AI4Meder
Paper | ICLR 2026 Poster | 2026 | medical LLM agent

Can Large Language Models Match the Conclusions of Systematic Reviews?

Accepted as a poster at ICLR 2026. Systematic reviews (SR), in which experts summarize and analyze evidence across individual studies to provide insights on a specialized topic, are a cornerstone for evidence-based clinical decision-making, research, and policy. Given the exponential growth of scientific articles, there is growing interest in using large language models (LLMs) to automate SR generation. However, the ability of LLMs to critically assess evidence and reason across multiple documents to provide recommendations at the same proficiency as domain experts remains poorly characterized. We therefore ask: **Can LLMs match the conclusions of systematic reviews written by clinical experts when given access to the same studies?** To explore this question, we present MedEvidence, a benchmark pairing findings from 100 medical SRs with the studies they are based on.


Paper details

English title
Can Large Language Models Match the Conclusions of Systematic Reviews?
Authors
Christopher Polzak, Alejandro Lozano, Min Woo Sun, James Burgess, Yuhui Zhang, Kevin Wu, Chia-Chun Chiang, Jeffrey J Nirschl, Serena Yeung-Levy
Journal/Conference
ICLR 2026 Poster
Publication year
2026
Research area
medical LLM agent