Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · Dyslexify: A Mechanistic Defense Against Typographic Attacks in CLIP
ICLR 2026 Poster accepted paper at ICLR 2026. Typographic attacks exploit multi-modal systems by injecting text into images, leading to targeted misclassifications, malicious content generation and even Vision-Language Model jailbreaks. In this work, we analyze how CLIP vision encoders behave under typographic attacks, locating specialized attention heads in the latter half of the model's layers that causally extract and transmit typographic information to the CLS token. Building on these insights, we introduce Dyslexify, a method to defend CLIP models against typographic attacks by selectively ablating a typographic circuit consisting of attention heads. Without requiring finetuning, Dyslexify improves performance by up to 22.06% on a typographic variant of ImageNet-100 while reducing standard ImageNet-100 accuracy by less than 1%, and we demonstrate its utility in a medical foundation model for skin lesion diagnosis.
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · ATPO: Adaptive Tree Policy Optimization for Multi-Turn Medical Dialogue
ICLR 2026 Poster accepted paper at ICLR 2026. Effective information seeking in multi-turn medical dialogues is critical for accurate diagnosis, especially when dealing with incomplete information. Aligning Large Language Models (LLMs) for these interactive scenarios is challenging due to the uncertainty inherent in user-agent interactions, which we formulate as a Hierarchical Markov Decision Process (H-MDP). While conventional Reinforcement Learning (RL) methods like Group Relative Policy Optimization (GRPO) struggle with long-horizon credit assignment and Proximal Policy Optimization (PPO) suffers from unstable value estimation in this context, we propose a novel uncertainty-aware Adaptive Tree Policy Optimization (ATPO) algorithm. Our method adaptively allocates the rollout budget to states with high uncertainty, quantified by a composite metric of Bellman error and action-value variance.
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · sleep2vec: Unified Cross-Modal Alignment for Heterogeneous Nocturnal Biosignals
ICLR 2026 Poster accepted paper at ICLR 2026. Tasks ranging from sleep staging to clinical diagnosis traditionally rely on standard polysomnography (PSG) devices, bedside monitors and wearable devices, which capture diverse nocturnal biosignals (e.g., EEG, EOG, ECG, SpO$_2$). However, heterogeneity across devices and frequent sensor dropout pose significant challenges for unified modelling of these multimodal signals. We present sleep2vec, a foundation model for diverse and incomplete nocturnal biosignals that learns a shared representation via cross-modal alignment. sleep2vec is contrastively pre-trained on 42,249 overnight recordings spanning nine modalities using a Demography, Age, Site & History-aware InfoNCE objective that incorporates physiological and acquisition metadata (e.g., age, gender, recording site) to dynamically weight negatives and mitigate cohort-specific shortcuts.
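Paper · ICLR 2026 Poster · 2026 · surgical/interventional AI · WavePolyp: Video Polyp Segmentation via Hierarchical Wavelet-Based Feature Aggregation and Inter-Frame Divergence Perception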
ICLR 2026 Poster accepted paper at ICLR 2026. Automatic polyp segmentation from colonoscopy videos is a crucial technique that assists clinicians in improving the accuracy and efficiency of diagnosis, preventing polyps from developing into cancer. However, video polyp segmentation (VPS) is a challenging task due to (1) the significant inter-frame divergence in videos, (2) the high camouflage of polyps in normal colon structures and (3) the clinical requirement of real-time performance. In this paper, we propose a novel segmentation network, WavePolyp, which consists of two innovative components: a hierarchical wavelet-based feature aggregation (HWFA) module and inter-frame divergence perception (IDP) blocks. Specifically, HWFA excavates and amplifies discriminative information from high-frequency and low-frequency features decomposed by wavelet transform, hierarchically aggregating them into refined spatial representations within each frame. Code/project link: https://github.com/FishballZhang/WavePolyp
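Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · Random Anchors with Low-Rank Decorrelated Learning: A Minimalist Pipeline for Class-Incremental Medical Image Classification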
ICLR 2026 Poster accepted paper at ICLR 2026. Class-incremental learning (CIL) in medical image-guided diagnosis requires models to preserve knowledge of historical disease classes while adapting to emerging categories. Pre-trained models (PTMs) with well-generalized features provide a strong foundation, yet most PTM-based CIL strategies, such as prompt tuning, task-specific adapters and model mixtures, rely on increasingly complex designs. While effective in general-domain benchmarks, these methods falter in medical imaging, where low intra-class variability and high inter-domain shifts (from scanners, protocols and institutions) make CIL particularly prone to representation collapse and domain misalignment. Under such conditions, we find that lightweight representation calibration strategies, often dismissed in general-domain CIL for their modest gains, can be remarkably effective for adapting PTMs in medical settings.
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · A Hypothesis-Driven Language Agent for Clinical Decision-Making Based on Reinforcement Learning
ICLR 2026 Poster accepted paper at ICLR 2026. Clinical decision-making is a dynamic, interactive, and cyclic process in which doctors repeatedly decide which clinical action to perform and consider newly uncovered information for diagnosis and treatment. Large Language Models (LLMs) have the potential to support clinicians in this process. However, most applications of LLMs in clinical decision support suffer from one of two limitations: either they assume the unrealistic scenario of immediate availability of all patient information and do not model the interactive and iterative investigation process, or they restrict themselves to the limited "out-of-the-box" capabilities of large pre-trained models without performing task-specific training. In contrast, we propose to model clinical decision-making for diagnosis with a hypothesis-driven, uncertainty-aware language agent, LA-CDM, that converges towards a diagnosis by repeatedly requesting and interpreting relevant tests. Using a hybrid training paradigm combining supervised and reinforcement learning, we train LA-CDM with three objectives targeting critical aspects of clinical decision-making: accurate hypothesis generation, hypothesis uncertainty estimation, and efficient decision-making. Code/project link: https://github.com/dharouni/LA-CDM
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · Joint Adaptation of Uni-Modal Foundation Models for Multi-Modal Alzheimer's Disease Diagnosis
ICLR 2026 Poster accepted paper at ICLR 2026. Alzheimer’s Disease (AD) is a progressive neurodegenerative disorder and a leading cause of dementia worldwide. Accurate diagnosis requires integrating diverse patient data modalities. With the rapid advancement of foundation models in neurobiology and medicine, integrating foundation models from various modalities has emerged as a promising yet underexplored direction for multi-modal AD diagnosis. A central challenge is enabling effective interaction among these models without disrupting the robust, modality-specific representations learned from large-scale pretraining. To address this, we propose a novel multi-modal framework for AD diagnosis that enables joint interaction among uni-modal foundation models through modality-anchored interaction.
Paper · ICLR 2026 Poster · 2026 · clinical prediction · Improving 2D Diffusion Models for 3D Medical Imaging with Inter-Slice Consistent Stochasticity
ICLR 2026 Poster accepted paper at ICLR 2026. 3D medical imaging is in high demand and essential for clinical diagnosis and scientific research. Currently, diffusion models have become an effective tool for medical imaging reconstruction thanks to their ability to learn rich, high‑quality data priors. However, learning the 3D data distribution with diffusion models in medical imaging is challenging, not only due to the difficulties in data collection but also because of the significant computational burden during model training. A common compromise is to train the diffusion model on 2D data priors and reconstruct stacked 2D slices to address 3D medical inverse problems. Code/project link: https://github.com/duchenhe/ISCS
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · Critic-Adviser-Reviser Cyclic Refinement: Towards High-Quality EMR Corpus Generation
ICLR 2026 Poster accepted paper at ICLR 2026. Electronic medical records (EMRs) are vital for healthcare research, but their use is limited by privacy concerns. Synthetic EMR generation offers a promising alternative, yet most existing methods merely imitate real records without adhering to rigorous clinical quality principles. To address this, we introduce LLM-CARe, a stage-wise cyclic refinement framework that progressively improves EMR quality through three stages, each targeting a specific granularity: corpus, section and document. At each stage, a Critic, an Adviser, and a Reviser collaborate iteratively to evaluate, provide feedback, and refine the drafts.
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · Nef-Net v2: Adapting Electrocardio Panorama in the Wild
ICLR 2026 Poster accepted paper at ICLR 2026. Conventional multi-lead electrocardiogram (ECG) systems capture cardiac signals from a fixed set of anatomical viewpoints defined by lead placement. However, certain cardiac conditions (e.g., Brugada syndrome) require additional, non-standard viewpoints to reveal diagnostically critical patterns that may be absent in standard leads. To systematically overcome this limitation, Nef-Net was recently introduced to reconstruct a continuous electrocardiac field, enabling virtual observation of ECG signals from arbitrary views (termed Electrocardio Panorama). Despite its promise, Nef-Net operates under idealized assumptions and faces in-the-wild challenges, such as long-duration ECG modeling, robustness to device-specific signal artifacts, and suboptimal lead placement calibration. Code/project link: https://github.com/HKUSTGZ-ML4Health-Lab/NEFNET-v2
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis
ICLR 2026 Poster accepted paper at ICLR 2026. Deep learning-based respiratory auscultation is currently hindered by two fundamental challenges: (i) inherent information loss, as converting signals into spectrograms discards transient acoustic events and clinical context; (ii) limited data availability, exacerbated by severe class imbalance. To bridge these gaps, we present **_Resp-Agent_**, an autonomous multimodal system orchestrated by a novel Active Adversarial Curriculum Agent (Thinker-A²CA). Unlike static pipelines, Thinker-A²CA serves as a central controller that actively identifies diagnostic weaknesses and schedules targeted synthesis in a closed loop. To address the representation gap, we introduce a modality-weaving Diagnoser that weaves clinical text with audio tokens via strategic global attention and sparse audio anchors, capturing both long-range clinical context and millisecond-level transients. Code/project link: https://github.com/zpforlove/Resp-Agent
Paper · ICLR 2026 Poster · 2026 · clinical prediction · M3CoTBench: A Benchmark for MLLM Chain-of-Thought in Medical Image Understanding
ICLR 2026 Poster accepted paper at ICLR 2026. Chain-of-Thought (CoT) reasoning has proven effective in enhancing large language models by encouraging step-by-step intermediate reasoning, and recent advances have extended this paradigm to Multimodal Large Language Models (MLLMs). In the medical domain, where diagnostic decisions depend on nuanced visual cues and sequential reasoning, CoT aligns naturally with clinical thinking processes. However, current benchmarks for medical image understanding generally focus on the final answer while ignoring the reasoning path. An opaque process lacks reliable bases for judgment, making it difficult to assist doctors in diagnosis.
Paper · ICLR 2026 Poster · 2026 · clinical prediction · Learning Patient-Specific Disease Dynamics with Latent Flow Matching for Longitudinal Imaging Generation
ICLR 2026 Poster accepted paper at ICLR 2026. Understanding disease progression is a central clinical challenge with direct implications for early diagnosis and personalized treatment. While recent generative approaches have attempted to model progression, key mismatches remain: disease dynamics are inherently continuous and monotonic, yet latent representations are often scattered, lacking semantic structure, and diffusion-based models disrupt continuity through the random denoising process. In this work, we propose treating disease dynamics as a velocity field and leveraging Flow Matching (FM) to align the temporal evolution of patient data. Unlike prior methods, our approach captures the intrinsic dynamics of disease, making progression more interpretable.
Paper · ICLR 2026 Poster · 2026 · clinical prediction · FETAL-GAUGE: A Benchmark for Evaluating Vision-Language Models in Fetal Ultrasound
ICLR 2026 Poster accepted paper at ICLR 2026. The growing demand for prenatal ultrasound imaging has intensified a global shortage of trained sonographers, creating barriers to essential fetal health monitoring. Deep learning has the potential to enhance sonographers' efficiency and support the training of new practitioners. Vision-Language Models (VLMs) are particularly promising for ultrasound interpretation, as they can jointly process images and text to perform multiple clinical tasks within a single framework. However, despite the expansion of VLMs, no standardized benchmark exists to evaluate their performance in fetal ultrasound imaging. Code/project link: https://github.com/BioMedIA-MBZUAI/FETAL-GAUGE
Paper · ICLR 2026 Poster · 2026 · clinical NLP · Medical Thinking with Multiple Images
ICLR 2026 Poster accepted paper at ICLR 2026. Large language models perform well on many medical QA benchmarks, but real clinical reasoning is harder because diagnosis often requires integrating evidence across multiple images rather than interpreting a single view. We introduce MedThinkVQA, an expert-annotated benchmark for thinking with multiple images, in which models must interpret each image, combine cross-view evidence, and solve diagnostic questions under intermediate supervision and step-level evaluation. The dataset contains 10,067 cases, including 720 test cases, with an average of 6.68 images per case, substantially denser than prior work (earlier maxima $\leq$ 1.43). On the test set, the best closed-source models, Claude-4.6-opus, Gemini-3-pro, and GPT-5.2-xhigh, achieve only 54.9%--57.2% accuracy, while smaller proprietary variants, GPT-5-mini/nano, drop to 39.7% and 30.8%.
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · AttTok: Combining Attribute Tokens with Generative Pre-trained Vision-Language Models for Medical Image Understanding
ICLR 2026 Poster accepted paper at ICLR 2026. Recent generative pre-trained vision–language (GPTv) models have achieved remarkable success in multi-modal understanding, inspiring their adaptation to medical imaging tasks such as disease diagnosis and visual question answering (VQA). However, current instruction-tuned GPTv models suffer from two key challenges: (1) medical attributes (e.g., disease names, severity grades) are encoded as plain text tokens, collapsing semantically distinct concepts into nearly identical textual sequences; and (2) inadequate textual supervision weakens visual representation learning, leading to severe inter-attribute confusion and misaligned vision–language embeddings. To address these limitations, we introduce attribute tokens (AttTok), a set of pre‑defined special tokens that uniquely encode clinical attributes (e.g., imaging modality, diagnosis, severity) within a structured token space. Complemented by attribute‑centric embedding books, AttTok serves as anchor points for aligning both visual and textual modalities into a shared, discriminative representation space.
Paper · ICLR 2026 Oral · 2026 · clinical prediction · Decentralized Attention Fails Centralized Signals: Rethinking Transformers for Medical Time Series
ICLR 2026 Oral accepted paper at ICLR 2026. Accurate analysis of Medical time series (MedTS) data, such as Electroencephalography (EEG) and Electrocardiography (ECG), plays a pivotal role in healthcare applications, including the diagnosis of brain and heart diseases. MedTS data typically exhibits two critical patterns: **temporal dependencies** within individual channels and **channel dependencies** across multiple channels. While recent advances in deep learning have leveraged Transformer-based models to effectively capture temporal dependencies, they often struggle to model channel dependencies. This limitation stems from a structural mismatch: ***MedTS signals are inherently centralized, whereas the Transformer's attention is decentralized***, making it less effective at capturing global synchronization and unified waveform patterns. Code/project link: https://github.com/Levi-Ackman/TeCh
Paper · ICLR 2026 Poster · 2026 · clinical prediction · Can LLMs Generate Transferable Representations for Clinical Time Series Data?
ICLR 2026 Poster accepted paper at ICLR 2026. Recent advances in vision-language models (VLMs) have achieved remarkable performance on standard medical benchmarks, yet their true clinical reasoning ability remains unclear. Existing datasets predominantly emphasize classification accuracy, creating an evaluation illusion in which models appear proficient while still failing at high-stakes diagnostic reasoning. We introduce Neural-MedBench, a compact yet reasoning-intensive benchmark specifically designed to probe the limits of multimodal clinical reasoning in neurology. Neural-MedBench integrates multi-sequence MRI scans, structured electronic health records, and clinical notes, and encompasses three core task families: differential diagnosis, lesion recognition, and rationale generation. Code/project link: https://neuromedbench.github.io/
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks
ICLR 2026 Poster accepted paper at ICLR 2026. Epilepsy affects over 50 million people worldwide, and one-third of patients suffer drug-resistant seizures where surgery offers the best chance of seizure freedom. Accurate localization of the epileptogenic zone (EZ) relies on intracranial EEG (iEEG). Clinical workflows, however, remain constrained by labor-intensive manual review. At the same time, existing data-driven approaches are typically developed on single-center datasets that are inconsistent in format and metadata, lack standardized benchmarks, and rarely release pathological event annotations, creating barriers to reproducibility, cross-center validation, and clinical relevance. Code/project link: https://omni-ieeg.github.io/omni-ieeg/; https://github.com/Omni-iEEG/Omni-iEEG
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · MedAgent-Pro: Towards Evidence-Based Multi-Modal Medical Diagnosis via a Reasoning Agentic Workflow
ICLR 2026 Poster accepted paper at ICLR 2026. Modern clinical diagnosis relies on the comprehensive analysis of multi-modal patient data, drawing on medical expertise to ensure systematic and rigorous reasoning. Recent advances in Vision–Language Models (VLMs) and agent-based methods are reshaping medical diagnosis by effectively integrating multi-modal information. However, they often output direct answers and empirical-driven conclusions without clinical evidence supported by quantitative analysis, which compromises their reliability and hinders clinical usability. Here we propose MedAgent-Pro, an agentic reasoning paradigm that mirrors modern diagnosis principles via a hierarchical diagnostic workflow, consisting of disease-level standardized plan generation and patient-level personalized step-by-step reasoning.
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · Beyond Medical Exams: A Clinician-Annotated Fairness Dataset of Real-World Tasks and Ambiguity in Mental Healthcare
ICLR 2026 Poster accepted paper at ICLR 2026. Current medical language model (LM) benchmarks often over-simplify the complexities of day-to-day clinical practice tasks and instead rely on evaluating LMs on multiple-choice board exam questions. In psychiatry especially, these challenges are worsened by fairness and bias issues, since models can be swayed by patient demographics even when those factors should not influence clinical decisions. Thus, we present an expert-created and annotated dataset spanning five critical domains of decision-making in mental healthcare: treatment, diagnosis, documentation, monitoring, and triage. This U.S. centric dataset — created without any LM assistance — is designed to capture the nuanced clinical reasoning and daily ambiguities mental health practitioners encounter, reflecting the inherent complexities of care delivery that are missing from existing datasets.
Paper · ICLR 2026 Poster · 2026 · clinical prediction · Learning Self-Critiquing Mechanisms for Region-Guided Chest X-Ray Report Generation
ICLR 2026 Poster accepted paper at ICLR 2026. Automatic radiology reporting assists radiologists in diagnosing abnormalities in radiology images, where grounding the automatic diagnosis with abnormality locations is important for the report interpretability. However, existing supervised-learning methods could lead to learning the superficial statistical correlations between images and reports, lacking multi-faceted reasoning to critique the relevant regions on which radiologists would focus. Recently, self-critical reasoning has been investigated in test-time scaling approaches to alleviate hallucinations of LLMs with increased time complexity. In this work, we focus on chest X-ray report generation with particular focus on clinical accuracy, where self-critical reasoning is alternatively introduced into the model architecture and their training objective, preferred by the real-time automatic reporting system.
Paper · ICLR 2026 Poster · 2026 · clinical prediction · Bridging Radiology and Pathology Foundation Models via Concept-Based Multimodal Co-Adaptation
ICLR 2026 Poster accepted paper at ICLR 2026. Pretrained medical foundation models (FMs) have shown strong generalization across diverse imaging tasks, such as disease classification in radiology and tumor grading in histopathology. While recent advances in parameter-efficient finetuning have enabled effective adaptation of FMs to downstream tasks, these approaches are typically designed for a single modality. In contrast, many clinical workflows rely on joint diagnosis from heterogeneous domains, such as radiology and pathology, where fully leveraging the representation capacity of multiple FMs remains an open challenge. To address this gap, we propose Concept Tuning and Fusing (CTF), a parameter-efficient framework that uses clinically grounded concepts as a shared semantic interface to enable cross-modal co-adaptation before fusion. Code/project link: https://github.com/HKU-MedAI/CTF; https://github.com/neuronflow/BraTS-Toolkit
Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI · AbdCTBench: Learning Clinical Biomarker Representations from Abdominal Surface Geometry
ICLR 2026 Poster accepted paper at ICLR 2026. Body composition analysis through CT and MRI imaging provides critical insights for cardio-metabolic health assessment but remains limited by accessibility barriers including radiation exposure, high costs, and infrastructure requirements. We present AbdCTBench, a large-scale dataset containing 23,506 CT-derived abdominal surface meshes from 18,719 patients, paired with 87 comorbidity labels, 31 specific diagnosis codes, and 16 CT-derived biomarkers. Our key insight is that external surface geometry is predictive of internal tissue composition, enabling accessible health screening through consumer devices. We establish comprehensive benchmarks across seven computer vision architectures (ResNet-18/34/50, DenseNet-121, EfficientNet-B0, ViT-Small, Swin Transformer-Base), demonstrating that models can learn robust surface-to-biomarker representations directly from 2D mesh projections. Code/project link: https://abdctbenchrepo.github.io/AbdCTBench/
Paper · ICLR 2026 Poster · 2026 · Medical multimodal AI · AttTok: Combining Attribute Tokens with Generative Pre-trained Vision-Language Models for Medical Image Understanding
ICLR 2026 poster introducing AttTok, a medical vision-language method that uses predefined attribute tokens and attribute-centric mechanisms to improve medical image understanding, including classification and visual question answering.
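Paper · ICLR 2026 Poster · 2026 · medical LLMs and agents · MedAgent-Pro: Towards Evidence-Based Multi-Modal Medical Diagnosis via a Reasoning Agentic Workflow
ICLR 2026 poster paper proposing MedAgent-Pro, a reasoning agentic workflow for evidence-based multi-modal medical diagnosis. The method centers on disease-level standardized plan generation and patient-level personalized step-by-step reasoning, combining specialized tools such as retrieval-augmented generation, medical guideline alignment, and vision models with an evidence-based reflection mechanism to support more reliable and interpretable diagnostic reasoning.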
Data resource · cine cardiac MRI with segmentation labels · cardiac MRI segmentation dataset · ACDC challenge dataset; see official database page · Access by application · ACDC Automated Cardiac Diagnosis Challenge Dataset
ACDC is a cardiac MRI dataset for automated cardiac diagnosis and segmentation. It supports left and right ventricular segmentation, myocardium segmentation, cardiac function quantification, and evaluation of robust cardiac image analysis methods.
Data resource · MRI, PET, biomarkers, clinical and cognitive assessments · longitudinal neuroimaging and clinical dataset · Longitudinal ADNI cohort data; access through ADNI/LONI · Access by application · ADNI Alzheimer's Disease Neuroimaging Initiative Dataset
ADNI provides longitudinal neuroimaging, biomarker, clinical, and cognitive data for Alzheimer disease research. It supports disease progression modeling, dementia diagnosis, multimodal prediction, biomarker discovery, and clinical translation studies.
Data resource · thoracic CT images with nodule annotations · lung CT nodule dataset · TCIA LIDC-IDRI collection · Open access · LIDC-IDRI Lung CT Nodule Dataset
LIDC-IDRI is a lung CT dataset with thoracic CT scans and expert nodule annotations. It is a classic benchmark for lung nodule detection, segmentation, malignancy characterization, radiomics, and computer-aided diagnosis research.
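Calls for papers and collaboration · npj Digital Medicine · Deadline (Beijing time): 2026-07-21 · Journal special issue · npj Digital Medicine Collection: Artificial Intelligence in Sports Medicine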
This Nature Portfolio / npj Digital Medicine collection is open for submissions until 2026-07-21. It invites research on AI in sports medicine, including multimodal injury and medical-condition prediction, individualized diagnosis, treatment and rehabilitation, transparent and diverse datasets, open-source explainable AI, and safe AI systems for athlete and exercise health.
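Calls for papers and collaboration · Frontiers in Artificial Intelligence / Frontiers Research Topic · Deadline (Beijing time): 2026-09-14 · Journal special issue · Frontiers Research Topic: Multi-Omics Integration in Clinical Decision-Making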
This Frontiers Research Topic calls for work on integrating multi-omics data with clinical information to improve diagnosis, prognosis, and personalized treatment. The page lists a manuscript deadline of 2026-09-14 and is currently accepting articles, making it a relevant journal CFP for clinical translation, multimodal medical AI, and precision medicine.
Calls for papers and collaboration · MICAD 2026 · Deadline (Beijing time): 2026-07-21 19:59 · Conference call for papers · MICAD 2026 Call for Papers
Medical Imaging and Computer-Aided Diagnosis 2026 call for full papers, posters, and oral presentations.
MIT OpenCourseWare: Machine Learning for Healthcare
MIT OCW 6.S897 Machine Learning for Healthcare introduces clinical data and machine learning for risk stratification, disease progression modeling, precision medicine, diagnosis, subtype discovery, medical imaging, public health, and clinical workflow improvement.
AI for Medicine Specialization
DeepLearning.AI specialization on diagnosis, prognosis, and treatment using medical AI workflows.