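Paper | ICLR 2026 Oral | 2026 | clinical prediction | CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmark of Large Language Models in Mental Health Question Answering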
Accepted as an Oral paper at ICLR 2026. Medical question answering (QA) benchmarks often focus on multiple-choice or fact-based tasks, leaving open-ended answers to real patient questions underexplored. This gap is particularly critical in mental health, where patient questions often mix symptoms, treatment concerns, and emotional needs, requiring answers that balance clinical caution with contextual sensitivity. We present CounselBench, a large-scale benchmark developed with 100 mental health professionals to evaluate and stress-test large language models (LLMs) in realistic help-seeking scenarios. The first component, CounselBench-EVAL, contains 2,000 expert evaluations of answers from GPT-4, LLaMA 3, Gemini, and online human therapists on patient questions from the public forum CounselChat.
Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI | Beyond Medical Exams: A Clinician-Annotated Fairness Dataset for Real-World Tasks and Ambiguity in Mental Health
Accepted as a Poster paper at ICLR 2026. Current medical language model (LM) benchmarks often over-simplify the complexities of day-to-day clinical practice tasks and instead rely on evaluating LMs on multiple-choice board exam questions. In psychiatry especially, these challenges are worsened by fairness and bias issues, since models can be swayed by patient demographics even when those factors should not influence clinical decisions. Thus, we present an expert-created and annotated dataset spanning five critical domains of decision-making in mental healthcare: treatment, diagnosis, documentation, monitoring, and triage. This U.S.-centric dataset, created without any LM assistance, is designed to capture the nuanced clinical reasoning and daily ambiguities mental health practitioners encounter, reflecting the inherent complexities of care delivery that are missing from existing datasets.
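Paper | ICLR 2026 Poster | 2026 | trustworthiness, safety, fairness, and privacy | Beyond Medical Exams: A Clinician-Annotated Fairness Dataset for Real-World Tasks and Ambiguity in Mental Health
This ICLR 2026 Poster paper introduces MENTAT, a fairness evaluation dataset created and annotated by clinical experts that targets real-world tasks and ambiguity in mental health. It is intended for assessing the performance and bias of language models on clinical decision-making tasks.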
Calls and Collaborations | npj Digital Medicine | Deadline: 2026-06-01 (Beijing time) | Journal special issue | npj Digital Medicine Collection: AI-Enabled Therapies in Mental Health
This Nature Portfolio / npj Digital Medicine collection is open for submissions until 2026-06-01. It focuses on AI tools that support or deliver therapeutic interventions in mental health, including generative AI therapy bots, reinforcement-learning agents, human-in-the-loop models, clinical validity, safety, ethics, equity, and regulation.