
From Conversation to Query Execution: Benchmarking User and Tool Interactions for EHR Database Agents

Accepted as a poster at ICLR 2026. Despite the impressive performance of LLM-powered agents, their adoption for Electronic Health Record (EHR) data access remains limited by the absence of benchmarks that adequately capture real-world clinical data access flows. In practice, two core challenges hinder deployment: query ambiguity arising from vague user questions, and value mismatch between user terminology and database entries. To address this, we introduce EHR-ChatQA, an interactive database question answering benchmark that evaluates the end-to-end workflow of database agents: clarifying user questions, using tools to resolve value mismatches, and generating correct SQL to deliver accurate answers. To cover diverse patterns of query ambiguity and value mismatch, EHR-ChatQA assesses agents in a simulated environment with an LLM-based user across two interaction flows: Incremental Query Refinement (IncreQA), where users add constraints to existing queries, and Adaptive Query Refinement (AdaptQA), where users adjust their search goals mid-conversation. Code/project link: https://github.com/glee4810/EHR-ChatQA
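The value-mismatch problem named in the abstract can be made concrete with a small sketch. This is not the EHR-ChatQA implementation: the `prescriptions` table, its rows, and the `search_values` helper are hypothetical, chosen only to show why a query built from the user's wording can return nothing, and how a value-lookup tool lets an agent recover the exact stored string before writing the final SQL.

```python
import sqlite3


def build_demo_db():
    """Create a toy in-memory table (hypothetical schema, not from the benchmark)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prescriptions (patient_id INTEGER, drug TEXT)")
    conn.executemany(
        "INSERT INTO prescriptions VALUES (?, ?)",
        [
            (1, "Acetaminophen 500mg Tab"),
            (2, "Acetaminophen 500mg Tab"),
            (2, "Aspirin 81mg Tab"),
            (3, "Heparin Sodium"),
        ],
    )
    return conn


def search_values(conn, term):
    """Hypothetical value-search tool: stored drug names containing the user's term."""
    rows = conn.execute(
        "SELECT DISTINCT drug FROM prescriptions "
        "WHERE LOWER(drug) LIKE ? ORDER BY drug",
        (f"%{term.lower()}%",),
    ).fetchall()
    return [r[0] for r in rows]


def count_patients(conn, drug):
    """Count distinct patients whose prescription matches the drug string exactly."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT patient_id) FROM prescriptions WHERE drug = ?",
        (drug,),
    ).fetchone()
    return row[0]


conn = build_demo_db()
user_term = "acetaminophen"  # the user's wording, not the stored value

# Using the user's wording directly matches no rows.
naive = count_patients(conn, user_term)

# Looking up the stored values first resolves the mismatch.
candidates = search_values(conn, user_term)
resolved = count_patients(conn, candidates[0])

print(naive, candidates, resolved)
```

When the lookup returns several candidates, an agent would typically need to ask the user which one is meant, which is where value mismatch meets the query-ambiguity challenge.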

Tags: Database Agents, LLM Agents, EHR-QA, DB-QA, medical_llm_agent

Paper details

English title: From Conversation to Query Execution: Benchmarking User and Tool Interactions for EHR Database Agents
Authors: Gyubok Lee, Woosog Chay, Heeyoung Kwak, Yeong Hwa Kim, Haanju Yoo, Oksoon Jeong, Meong Hi Son, Edward Choi
Journal/conference: ICLR 2026 Poster
Publication year: 2026
Research area: trustworthy medical AI