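AI4Meder Site Search

Search Medical AI Papers and Resources

Search community content by paper, data resource, technical competition, submission deadline, and course resource, and jump straight to the matching detail page.

126 results

Enter a keyword or click a tag to narrow results by paper, data resource, competition deadline, call for papers, or course. Tag: Dataset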

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Bridging Explainability and Embeddings: Letting BEE Identify Spurious Correlations

ICLR 2026 Poster accepted paper at ICLR 2026. Current methods for detecting spurious correlations rely on data splits or error patterns, leaving many harmful shortcuts invisible when counterexamples are absent. We introduce BEE (Bridging Explainability and Embeddings), a framework that shifts the focus from model predictions to the weight space and embedding geometry underlying decisions. By analyzing how fine-tuning perturbs pretrained representations, BEE uncovers spurious correlations that remain hidden from conventional evaluation pipelines. We use linear probing as a transparent diagnostic lens, revealing spurious features that not only persist after full fine-tuning but also transfer across diverse state-of-the-art models. Code/project link: https://github.com/bit-ml/bee

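Medical multimodal · Clinical language intelligence · Trustworthiness, safety, fairness, and privacy · Paper · spurious correlation · interpretability

Paper · ICLR 2026 Poster · 2026 · clinical NLP

VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning?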

ICLR 2026 Poster accepted paper at ICLR 2026. The ability to distinguish subtle differences between visually similar images is essential for diverse domains such as industrial anomaly detection, medical imaging, and aerial surveillance. While comparative reasoning benchmarks for vision-language models (VLMs) have recently emerged, they primarily focus on images with large, salient differences and fail to capture the nuanced reasoning required for real-world applications. In this work, we introduce VLM-SubtleBench, a benchmark designed to evaluate VLMs on subtle comparative reasoning. Our benchmark covers ten difference types (Attribute, State, Emotion, Temporal, Spatial, Existence, Quantity, Quality, Viewpoint, and Action) and curates paired question–image sets reflecting these fine-grained variations.

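Medical image computing · Medical multimodal · Clinical language intelligence · Paper · Vision-language Models · Multimodal Large Language Models

Paper · ICLR 2026 Poster · 2026 · medical imaging

CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis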

ICLR 2026 Poster accepted paper at ICLR 2026. Spectral imaging offers promising applications across diverse domains, including medicine and urban scene understanding, and is already established as a critical modality in remote sensing. However, variability in channel dimensionality and captured wavelengths among spectral cameras impedes the development of AI-driven methodologies, leading to camera-specific models with limited generalizability and inadequate cross-camera applicability. To address this bottleneck, we introduce CARL, a model for Camera-Agnostic Representation Learning across RGB, multispectral, and hyperspectral imaging modalities. To enable the conversion of a spectral image with any channel dimensionality to a camera-agnostic representation, we introduce a novel spectral encoder, featuring a self-attention-cross-attention mechanism, to distill salient spectral information into learned spectral representations. Code/project link: https://github.com/IMSY-DKFZ/CARL

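Medical image computing · Paper · Representation Learning · Self-Supervised Learning · Spectral Imaging · ICLR 2026

Paper · ICLR 2026 Poster · 2026 · medical imaging

A New Paradigm for Whole-Genome DNA Methylation Prediction Without Methylation Input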

ICLR 2026 Poster accepted paper at ICLR 2026. DNA methylation (DNAm) is a key epigenetic modification that regulates gene expression and is pivotal in development and disease. However, profiling DNAm at genome scale is challenging: of ~28 million CpG sites in the human genome, only about 1–3% are typically assayed in common datasets due to technological limitations and cost. Recent deep learning approaches, including masking-based generative Transformer models, have shown promise in capturing DNAm–gene expression relationships, but they rely on partially observed DNAm values for unmeasured CpGs and cannot be applied to completely unmeasured samples. To overcome this barrier, we introduce MethylProphet, a gene-guided, context-aware Transformer model for whole-genome DNAm inference without any measured DNAm input.

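Medical image computing · Paper · DNA Methylation · Deep Learning · Genome · ICLR 2026

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Privacy-Guaranteed Label Unlearning for Vertical Federated Learning: Few-Shot Forgetting Without Disclosure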

ICLR 2026 Poster accepted paper at ICLR 2026. This paper addresses the critical challenge of unlearning in Vertical Federated Learning (VFL), a setting that has received far less attention than its horizontal counterpart. Specifically, we propose the first method tailored to label unlearning in VFL, where labels play a dual role as both essential inputs and sensitive information. To this end, we employ a representation-level manifold mixup mechanism to generate synthetic embeddings for both unlearned and retained samples, which provide richer signals for the subsequent gradient-based label forgetting and recovery steps. These augmented embeddings are then subjected to gradient-based label forgetting, effectively removing the associated label information from the model. Code/project link: https://github.com/bryanhx/Towards-Privacy-Guaranteed-Label-Unlearning-in-Vertical-Federated-Learning

Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Federated Learning · Machine Unlearning · Privacy-Preserving
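To illustrate the representation-level mixup step mentioned in the abstract above, here is a minimal Python sketch that mixes two embedding vectors by convex combination. The function names, the Beta(alpha, alpha) draw for the mixing coefficient (borrowed from standard mixup), and the toy vectors are assumptions for illustration; the paper's own procedure may differ.

```python
import random


def mixup_embeddings(emb_a, emb_b, lam):
    """Return lam * emb_a + (1 - lam) * emb_b for two equal-length vectors."""
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same dimension")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * a + (1.0 - lam) * b for a, b in zip(emb_a, emb_b)]


def sample_lambda(alpha=1.0, rng=random):
    """Draw a mixing coefficient from Beta(alpha, alpha) (assumed choice)."""
    return rng.betavariate(alpha, alpha)


# Hypothetical toy embeddings, mixed with a fixed coefficient.
mixed = mixup_embeddings([1.0, 0.0], [0.0, 1.0], 0.25)
print(mixed)
```

With a coefficient of 0.25 the result sits three quarters of the way towards the second embedding, which is how a handful of samples can be expanded into many synthetic ones.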

Paper · ICLR 2026 Poster · 2026 · medical imaging

Reconstruct Anything Model: A Lightweight General-Purpose Model for Computational Imaging

ICLR 2026 Poster accepted paper at ICLR 2026. Most existing learning-based methods for solving imaging inverse problems can be roughly divided into two classes: iterative algorithms, such as plug-and-play and diffusion methods leveraging pretrained denoisers, and unrolled architectures that are trained end-to-end for specific imaging problems. Iterative methods in the first class are computationally costly and often yield suboptimal reconstruction performance, whereas unrolled architectures are generally problem-specific and require expensive training. In this work, we propose a novel non-iterative, lightweight architecture that incorporates knowledge about the forward operator (acquisition physics and noise parameters) without relying on unrolling. Our model is trained to solve a wide range of inverse problems, such as deblurring, magnetic resonance imaging, computed tomography, inpainting, and super-resolution, and handles arbitrary image sizes and channels, such as grayscale, complex, and color data. Code/project link: https://github.com/matthieutrs/ram

Medical image computing · Paper · computational imaging · deep learning · self-supervised learning · foundation models

Paper · ICLR 2026 Oral · 2026 · medical multimodal

A Scalable Distributed Framework for Multimodal GigaVoxel Image Registration

ICLR 2026 Oral accepted paper at ICLR 2026. In this work, we propose FFDP, a set of IO-aware non-GEMM fused kernels supplemented with a distributed framework for image registration at unprecedented scales. Image registration is an inverse problem fundamental to biomedical and life sciences, but algorithms have not scaled in tandem with image acquisition capabilities. Our framework complements existing model parallelism techniques proposed for large-scale transformer training by optimizing non-GEMM bottlenecks and enabling convolution-aware tensor sharding. We demonstrate unprecedented capabilities by performing multimodal registration of a 100μm ex-vivo human brain MRI volume at native resolution – an inverse problem more than 570× larger than a standard clinical datum in about a minute using only 8 A6000 GPUs.

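Medical image computing · Medical multimodal · Paper · image registration · distributed optimization · CUDA kernels

Paper · ICLR 2026 Poster · 2026 · medical imaging

Distributional Consistency Loss: Beyond Pointwise Data Terms in Inverse Problems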

ICLR 2026 Poster accepted paper at ICLR 2026. Recovering true signals from noisy measurements is a central challenge in inverse problems spanning medical imaging, geophysics, and signal processing. Current solutions nearly always balance prior assumptions regarding the true signal (regularization) with agreement to noisy measured data (data-fidelity). Conventional data-fidelity loss functions, such as mean-squared error (MSE) or negative log-likelihood, seek pointwise agreement with noisy measurements, often leading to overfitting to noise. In this work, we instead evaluate data-fidelity collectively by testing whether the observed measurements are statistically consistent with the noise distributions implied by the current estimate.

Medical image computing · Paper · Inverse problems · data fidelity · denoising · image reconstruction
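The abstract above describes judging data fidelity collectively, by asking whether the measurements are statistically consistent with the noise distribution implied by the current estimate. One simple way to make that idea concrete (not necessarily the authors' loss) is a probability integral transform check: under additive Gaussian noise with known standard deviation, the CDF values of the residuals should look uniform on [0, 1]. The function name `uniformity_gap`, the Gaussian noise model, and the quantile-matching score are all assumptions for illustration.

```python
from statistics import NormalDist


def uniformity_gap(measurements, estimate, sigma):
    """Mean squared gap between sorted residual CDF values and uniform quantiles.

    Assumes measurements = estimate + Gaussian noise with standard deviation sigma.
    A value near 0 means the residuals are spread like the assumed noise.
    """
    std_normal = NormalDist()
    cdf_values = sorted(
        std_normal.cdf((y - x) / sigma) for y, x in zip(measurements, estimate)
    )
    n = len(cdf_values)
    return sum((u - (i + 0.5) / n) ** 2 for i, u in enumerate(cdf_values)) / n


# Hypothetical data: residuals placed exactly at standard normal quantiles.
n = 8
true_signal = [0.0] * n
noisy = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

consistent_score = uniformity_gap(noisy, true_signal, sigma=1.0)
overfit_score = uniformity_gap(noisy, noisy, sigma=1.0)  # estimate copies the noise
print(consistent_score < overfit_score)
```

An estimate that reproduces the noisy measurements exactly sends every CDF value to 0.5, so it scores worse than an estimate whose residuals are spread like the noise. This is the overfitting failure the abstract attributes to pointwise losses.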

Paper · ICLR 2026 Poster · 2026 · medical imaging

MnemoDyn: Learning Resting-State Dynamics from 40K fMRI Sequences

ICLR 2026 Poster accepted paper at ICLR 2026. We present a dynamical-systems based model for resting-state functional magnetic resonance imaging (rs-fMRI), trained on a dataset of roughly 40K rs-fMRI sequences covering a wide variety of public and available-by-permission datasets. While most existing proposals use transformer backbones, we utilize multi-resolution temporal modeling of the dynamics across parcellated brain regions. We show that MnemoDyn is compute efficient and generalizes very well across diverse populations and scanning protocols. When benchmarked against current state-of-the-art transformer-based approaches, MnemoDyn consistently delivers superior reconstruction quality.

Medical image computing · Paper · Dynamical system · Brain Imaging · ICLR 2026 · ICLR 2026 Poster

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Prior-Aware and Context-Guided Grouping for Active Probabilistic Subsampling

ICLR 2026 Poster accepted paper at ICLR 2026. Subsampling significantly reduces the number of measurements, thereby streamlining data processing and transfer overhead, and shortening acquisition time across diverse real-world applications. The recently introduced Active Deep Probabilistic Subsampling (A-DPS) approach jointly optimizes both the subsampling pattern and the downstream task model, enabling instance- and subject-specific sampling trajectories and effective adaptation to new data at inference time. However, this approach does not fully leverage valuable dataset priors and relies on top-1 sampling, which can impede the optimization process. Herein, we enhance A-DPS by integrating a deterministic (fixed) prior-informed sampling pattern derived from the training dataset, along with group-based sampling via top-k sampling, to achieve more robust optimization. We call this method Prior-aware and context-guided Group-based Active DPS (PGA-DPS).

Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Subsampling · Active acquisition · Accelerated MRI
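The difference between top-1 and top-k selection mentioned in the abstract above can be sketched in a few lines of Python. This is only an illustration of picking one versus several not-yet-sampled locations per acquisition step from a score vector; the function name `select_group`, the tie-breaking rule, and the toy scores are hypothetical and are not taken from A-DPS or PGA-DPS.

```python
def select_group(scores, already_sampled, k):
    """Pick the k highest-scoring locations that have not been sampled yet.

    Ties are broken in favour of the lower index. With k == 1 this reduces
    to top-1 selection.
    """
    candidates = [i for i in range(len(scores)) if i not in already_sampled]
    candidates.sort(key=lambda i: (-scores[i], i))
    return candidates[:k]


# Hypothetical scores for four candidate measurement locations.
scores = [0.1, 0.9, 0.4, 0.7]

first = select_group(scores, set(), 1)         # top-1: a single location
group = select_group(scores, set(first), 2)    # top-k: a group of two
print(first, group)
```

Selecting a group per step reaches a given measurement budget in fewer acquisition steps than selecting one location at a time.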

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Fusing Pixels and Genes: Spatially-Aware Learning in Computational Pathology

ICLR 2026 Poster accepted paper at ICLR 2026. Recent years have witnessed remarkable progress in multimodal learning within computational pathology. Existing models primarily rely on vision and language modalities; however, language alone lacks molecular specificity and offers limited pathological supervision, leading to representational bottlenecks. In this paper, we propose STAMP, a Spatial Transcriptomics-Augmented Multimodal Pathology representation learning framework that integrates spatially-resolved gene expression profiles to enable molecule-guided joint embedding of pathology images and transcriptomic data. Our study shows that self-supervised, gene-guided training provides a robust and task-agnostic signal for learning pathology image representations. Code/project link: https://github.com/Hanminghao/STAMP

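Medical image computing · Medical multimodal · Trustworthiness, safety, fairness, and privacy · Paper · Computational pathology · Multimodal Learning

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Dual Distillation for Few-Shot Anomaly Detection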

ICLR 2026 Poster accepted paper at ICLR 2026. Anomaly detection is a critical task in computer vision with profound implications for medical imaging, where identifying pathologies early can directly impact patient outcomes. While recent unsupervised anomaly detection approaches show promise, they require substantial normal training data and struggle to generalize across anatomical contexts. We introduce D²4FAD, a novel dual distillation framework for few-shot anomaly detection that identifies anomalies in previously unseen tasks using only a small number of normal reference images. Our approach leverages a pre-trained encoder as a teacher network to extract multi-scale features from both support and query images, while a student decoder learns to distill knowledge from the teacher on query images and self-distill on support images. Code/project link: https://github.com/ttttqz/D24FAD

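Medical image computing · EHR and clinical prediction · Paper · anomaly detection · few-shot learning · knowledge distillation

Paper · ICLR 2026 Poster · 2026 · medical imaging

Disco: Densely Overlapping Cell Instance Segmentation via Adjacency-Aware Collaborative Coloring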

ICLR 2026 Poster accepted paper at ICLR 2026. Accurate cell instance segmentation is foundational for digital pathology analysis. Existing methods based on contour detection and distance mapping still face significant challenges in processing complex and dense cellular regions. Graph coloring-based methods provide a new paradigm for this task, yet the effectiveness of this paradigm in real-world scenarios with dense overlaps and complex topologies has not been verified. Addressing this issue, we release a large-scale dataset GBC-FS 2025, which contains highly complex and dense sub-cellular nuclear arrangements. We conduct the first systematic analysis of the chromatic properties of cell adjacency graphs across four diverse datasets and reveal an important discovery: most real-world cell graphs are non-bipartite, with a high prevalence of odd-length cycles (predominantly triangles).

Medical image computing · Paper · Cell Instance Segmentation · Digital Pathology · Graph Coloring · Topological Analysis
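The finding in the abstract above, that most real cell adjacency graphs are non-bipartite because of odd cycles such as triangles, matters because a graph can be properly coloured with two colours exactly when it is bipartite. The Python sketch below checks bipartiteness by breadth-first two-colouring and counts triangles. The function names and the tiny example graphs are hypothetical, and the adjacency dictionaries are assumed to be symmetric with integer node ids.

```python
from collections import deque


def is_bipartite(adjacency):
    """Return True if the undirected graph admits a proper 2-colouring."""
    color = {}
    for start in adjacency:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in color:
                    color[neighbour] = 1 - color[node]
                    queue.append(neighbour)
                elif color[neighbour] == color[node]:
                    return False  # an odd cycle forces a colour clash
    return True


def count_triangles(adjacency):
    """Count each triangle once by enumerating node triples a < b < c."""
    neighbours = {node: set(adj) for node, adj in adjacency.items()}
    total = 0
    for a in neighbours:
        for b in neighbours[a]:
            if b <= a:
                continue
            total += sum(1 for c in neighbours[a] & neighbours[b] if c > b)
    return total


# Three mutually touching cells form a triangle; a chain of three does not.
touching = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
chain = {0: [1], 1: [0, 2], 2: [1]}
print(is_bipartite(touching), count_triangles(touching))
print(is_bipartite(chain), count_triangles(chain))
```

Three mutually adjacent cells already need three colours, so any colouring-based scheme applied to dense tissue has to handle more than two classes.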

Paper · ICLR 2026 Poster · 2026 · clinical prediction

A Foundation Model for Generating Neuronal Activity Based on Multi-Variate Parallel Attention

ICLR 2026 Poster accepted paper at ICLR 2026. Learning from multi-variate time-series with heterogeneous channel configurations remains a fundamental challenge for deep neural networks, particularly in clinical domains such as intracranial electroencephalography (iEEG), where channel setups vary widely across subjects. In this work, we introduce multi-variate parallel attention (MVPA), a novel self-attention mechanism that disentangles content, temporal, and spatial attention, enabling flexible, generalizable, and efficient modeling of time-series data with varying channel counts and configurations. We use MVPA to build MVPFormer, a generative foundation model for human electrophysiology, trained to predict the evolution of iEEG signals across diverse subjects. To support this and future efforts by the community, we release the SWEC iEEG dataset, the largest publicly available iEEG dataset to date, comprising nearly 10,000 hours of recordings from heterogeneous clinical sources. Code/project link: https://github.com/IBM/multi-variate-parallel-transformer; https://huggingface.co/datasets/NeuroTec/SWEC_iEEG_Dataset

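EHR and clinical prediction · Paper · time-series · ieeg · neurology · Foundation model

Paper · ICLR 2026 Poster · 2026 · clinical NLP

A Theoretically Grounded Evaluation of Human-Like Fallacy Patterns in LLM Reasoning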

ICLR 2026 Poster accepted paper at ICLR 2026. We study logical reasoning in language models by asking whether their errors follow established human fallacy patterns. Using the Erotetic Theory of Reasoning (ETR) and its open‑source implementation, PyETR, we programmatically generate 383 formally specified reasoning problems and evaluate 38 models. For each response, we judge logical correctness and, when incorrect, whether it matches an ETR‑predicted fallacy. Two results stand out: (i) as a capability proxy (Chatbot Arena Elo) increases, a larger share of a model’s incorrect answers are ETR‑predicted fallacies (ρ = 0.360, p = 0.0265), while overall correctness on this dataset shows no correlation with capability; (ii) reversing premise order significantly reduces fallacy production for many models, mirroring human order effects.

Clinical language intelligence · Paper · LLMs · language models · reasoning · synthetic data
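The abstract above reports a correlation coefficient ρ between a capability proxy and the share of errors that are predicted fallacies. Assuming ρ denotes Spearman's rank correlation (the abstract does not say which coefficient is used), the Python sketch below shows how such a value is computed: rank both variables, then take the Pearson correlation of the ranks. The function names and the toy numbers are hypothetical and are not the paper's data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for position in range(i, j + 1):
            ranks[order[position]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical capability scores and fallacy shares for five models.
capability = [1000, 1100, 1200, 1300, 1400]
fallacy_share = [0.30, 0.25, 0.45, 0.40, 0.60]
print(round(spearman_rho(capability, fallacy_share), 3))
```

Because only ranks enter the calculation, the coefficient captures any monotone trend and is insensitive to the scale of the Elo scores.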

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Sequential Information Bottleneck Fusion: Towards Robust and Generalizable Multimodal Brain Tumor Segmentation

ICLR 2026 Poster accepted paper at ICLR 2026. Brain tumor segmentation in multi-modal MRIs poses significant challenges when one or more modalities are missing. Recent approaches commonly employ parallel fusion strategies; however, these methods often risk losing crucial shared information across modalities, which can degrade segmentation performance. In this paper, we advocate leveraging sequential information bottleneck fusion to effectively preserve shared information across modalities. From an information-theoretic perspective, sequential fusion not only produces more robust fused representations in missing-data scenarios but also achieves a tighter generalization upper bound compared to parallel fusion approaches.

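Medical image computing · Medical multimodal · Trustworthiness, safety, fairness, and privacy · Paper · Brain Tumor Segmentation · Missing Modality

Paper · ICLR 2026 Poster · 2026 · medical imaging

A Brain Graph Foundation Model: Pre-Training and Prompt-Tuning Across Multiple Atlases and Disorders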

ICLR 2026 Poster accepted paper at ICLR 2026. As large language models (LLMs) continue to revolutionize AI research, there is a growing interest in building large-scale brain foundation models to advance neuroscience. While most existing brain foundation models are pre-trained on time-series signals or connectome features, we propose a novel graph-based pre-training paradigm for constructing a brain graph foundation model. In this paper, we introduce the Brain Graph Foundation Model, termed BrainGFM, a unified framework that leverages graph contrastive learning and graph masked autoencoders for large-scale fMRI-based pre-training. BrainGFM is pre-trained on a diverse mixture of brain atlases with varying parcellations, significantly expanding the pre-training corpus and enhancing the model’s ability to generalize across heterogeneous fMRI-derived brain representations. Code/project link: https://github.com/weixinxu666/BrainGFM

Medical image computing · Paper · Brain Graph Foundation Model · Functional Magnetic Resonance Imaging (fMRI) · Neuroscience · Graph Pre-Training

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

UltraGauss: Ultrafast Gaussian Reconstruction of 3D Ultrasound Volumes

ICLR 2026 Poster accepted paper at ICLR 2026. Ultrasound imaging is widely used due to its safety, affordability, and real-time capabilities, but its 2D interpretation is highly operator-dependent, leading to variability and increased cognitive demand. We present UltraGauss: an ultrasound-specific Gaussian Splatting framework that serves as an efficient approximation to acoustic image formation. Unlike projection-based splatting, UltraGauss renders by probe-plane intersection with in-plane aggregation, aligning with plane-based echo sampling while remaining fast and memory-efficient. A stable parameterisation and compute-aware GPU rasterisation make this method practical at scale. Code/project link: https://www.robots.ox.ac.uk/~vgg/research/UltraGauss/

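Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Ultrasound · 3D Reconstruction · Gaussian

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Spike-Based Digital Brain: A Novel Foundation Model for Brain Activity Analysis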

ICLR 2026 Poster accepted paper at ICLR 2026. Modeling the temporal dynamics of the human brain remains a core challenge in computational neuroscience and artificial intelligence. Traditional methods often ignore the biological spike characteristics of brain activity and find it difficult to reveal the dynamic dependencies and causal interactions between brain regions, limiting their effectiveness in brain function research and clinical applications. To address this issue, we propose a Spike-based Digital Brain (Spike-DB), a novel fundamental model that introduces the spike computing paradigm into brain time series modeling. Spike-DB encodes fMRI signals as spike trains and learns the temporal driving relationships between anchor and target regions to achieve high-precision prediction of brain activity and reveal underlying causal dependencies and dynamic relationship characteristics. Code/project link: https://github.com/UAIBC-Brain/Spike-DB

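Medical image computing · EHR and clinical prediction · Paper · Brain activity · Fundamental model · Spike computing

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Dual-Kernel Adapter: Expanding the Spatial Horizon for Data-Constrained Medical Image Analysis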

ICLR 2026 Poster accepted paper at ICLR 2026. Adapters have become a widely adopted strategy for efficient fine-tuning of foundation models, particularly in resource-constrained settings. However, their performance under extreme data scarcity (common in medical imaging due to high annotation costs, privacy regulations, and fragmented datasets) remains underexplored. In this work, we present the first comprehensive study of adapter-based fine-tuning for vision foundation models in low-data medical imaging scenarios. We find that, contrary to their promise, conventional Adapters can degrade performance under severe data constraints, performing even worse than simple linear probing when trained on less than 1% of the corresponding training data.

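Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Adapter · Medical Image Analysis · Data-Limited Training

Paper · ICLR 2026 Poster · 2026 · medical imaging

Unified Brain Surface and Volume Registration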

ICLR 2026 Poster accepted paper at ICLR 2026. Accurate registration of brain MRI scans is fundamental for cross-subject analysis in neuroscientific studies. This involves aligning both the cortical surface of the brain and the interior volume. Traditional methods treat volumetric and surface-based registration separately, which often leads to inconsistencies that limit downstream analyses. We propose a deep learning framework, UCS, that registers 3D brain MRI images by jointly aligning both cortical and subcortical regions, through a unified volume-and-surface-based representation. Our approach leverages an intermediate spherical coordinate space to bridge anatomical surface topology with volumetric anatomy, enabling consistent and anatomically accurate alignment.

Medical image computing · Paper · neuroimaging · registration · sphere · cortex

Paper · ICLR 2026 Poster · 2026 · clinical prediction

SurvHTE-Bench: A Benchmark for Heterogeneous Treatment Effect Estimation in Survival Analysis

ICLR 2026 Poster accepted paper at ICLR 2026. Estimating heterogeneous treatment effects (HTEs) from right-censored survival data is critical in high-stakes applications such as precision medicine and individualized policy-making. Yet, the survival analysis setting poses unique challenges for HTE estimation due to censoring, unobserved counterfactuals, and complex identification assumptions. Despite recent advances, from causal survival forests to survival meta-learners and outcome imputation approaches, evaluation practices remain fragmented and inconsistent. We introduce SurvHTE‐Bench, the first comprehensive benchmark for HTE estimation with censored outcomes. The benchmark spans (i) a modular suite of synthetic datasets with known ground truth, systematically varying causal assumptions and survival dynamics, (ii) semi-synthetic datasets that pair real-world covariates with simulated treatments and outcomes, and (iii) real-world datasets from a twin study (with known ground truth) and from an HIV clinical trial.

EHR and clinical prediction · Paper · Causal Inference · Survival Analysis · Treatment Effect · Datasets and Benchmarks
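Right-censoring, which the abstract above names as a core difficulty, means that some subjects leave the study before the event is observed, so only a lower bound on their event time is known. The Kaplan-Meier estimator is the standard textbook way to estimate a survival curve from such data. The Python sketch below is a generic illustration and is not part of SurvHTE-Bench; the function name and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : observed follow-up time per subject
    events : 1 if the event was observed at that time, 0 if right-censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        observed_events = 0
        leaving = 0
        # Group all subjects sharing this follow-up time.
        while i < len(data) and data[i][0] == current_time:
            observed_events += data[i][1]
            leaving += 1
            i += 1
        if observed_events > 0:
            survival *= 1.0 - observed_events / at_risk
            curve.append((current_time, survival))
        # Censored subjects leave the risk set without changing the curve.
        at_risk -= leaving
    return curve


# Hypothetical cohort: the subject at time 2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

The censored subject still counts in the risk set at time 1 but is removed before time 3, which is why the second drop is larger than it would be with four fully observed subjects.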

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

SE-Diff: A Simulator- and Experience-Enhanced Diffusion Model for Comprehensive ECG Generation

ICLR 2026 Poster accepted paper at ICLR 2026. Cardiovascular disease (CVD) is a leading cause of mortality worldwide. Electrocardiograms (ECGs) are the most widely used non-invasive tool for cardiac assessment, yet large, well-annotated ECG corpora are scarce due to cost, privacy, and workflow constraints. Generating ECGs can aid mechanistic understanding of cardiac electrical activity, enable the construction of large, heterogeneous, and unbiased datasets, and facilitate privacy-preserving data sharing. Generating realistic ECG signals from clinical context is important yet underexplored. Recent work has leveraged diffusion models for text-to-ECG generation, but two challenges remain: (i) existing methods often overlook physiological simulator knowledge of cardiac activity; and (ii) they ignore broader, experience-based clinical knowledge grounded in real-world practice.

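Medical image computing · Clinical language intelligence · EHR and clinical prediction · Paper · Diffusion Model · ECG · Electrocardiography

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Reliable Evaluation of MRI Motion Correction: Dataset and Insights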

ICLR 2026 Poster accepted paper at ICLR 2026. Correcting motion artifacts in scientific and medical imaging is important, as they significantly impact image quality. However, evaluating deep learning-based and classical motion correction methods remains fundamentally difficult due to the lack of accessible ground-truth target data. To address this challenge, we study three evaluation approaches: real-world evaluation based on reference scans, simulated motion, and reference-free evaluation, each with its merits and shortcomings. To enable evaluation with real-world motion artifacts, we release PMoC3D, a dataset consisting of unprocessed Paired Motion-Corrupted 3D brain MRI data.

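Medical image computing · EHR and clinical prediction · Paper · 3D MRI · motion correction · Accelerated MRI · Dataset

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

ProstaTD: Bridging Surgical Triplets from Classification to Fully Supervised Detection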

ICLR 2026 Poster accepted paper at ICLR 2026. Surgical triplet detection is a critical task in surgical video analysis, with significant implications for performance assessment and training novice surgeons. However, existing datasets like CholecT50 lack precise spatial bounding box annotations, rendering triplet classification at the image level insufficient for practical applications. The inclusion of bounding box annotations is essential to make this task meaningful, as they provide the spatial context necessary for accurate analysis and improved model generalizability. To address these shortcomings, we introduce ProstaTD, a large-scale, multi-institutional dataset for surgical triplet detection, developed from the technically demanding domain of robot-assisted prostatectomy.

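Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Surgical Triplet · Endoscopy · Detection

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

NAB: Neural Adaptive Binning for Sparse-View CT Reconstruction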

ICLR 2026 Poster accepted paper at ICLR 2026. Computed Tomography (CT) plays a vital role in inspecting the internal structures of industrial objects. Furthermore, achieving high-quality CT reconstruction from sparse views is essential for reducing production costs. While classic implicit neural networks have shown promising results for sparse reconstruction, they are unable to leverage shape priors of objects. Motivated by the observation that numerous industrial objects exhibit rectangular structures, we propose a novel Neural Adaptive Binning (NAB) method that effectively integrates rectangular priors into the reconstruction process. Code/project link: https://github.com/Wangduo-Xie/NAB_CT_reconstruction

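Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Binning · Rotation · Reconstruction

Paper · ICLR 2026 Poster · 2026 · clinical prediction

MedAraBench: A Large-Scale Arabic Medical Question Answering Dataset and Benchmark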

ICLR 2026 Poster accepted paper at ICLR 2026. Arabic remains one of the most underrepresented languages in natural language processing research, particularly in medical applications, due to the limited availability of open-source data and benchmarks. The lack of resources hinders efforts to evaluate and advance the multilingual capabilities of Large Language Models (LLMs). In this paper, we introduce MedAraBench, a large-scale dataset consisting of Arabic multiple-choice question-answer pairs across various medical specialties. We constructed the dataset by manually digitizing a large repository of academic materials created by medical professionals in the Arabic-speaking region.

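Medical image computing · Clinical language intelligence · EHR and clinical prediction · Paper · Dataset · Benchmark · Large Language Models

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Healthcare Insurance Fraud Detection with a Continual Fiedler Vector Graph Model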

ICLR 2026 Poster accepted paper at ICLR 2026. Healthcare insurance fraud detection presents unique machine learning challenges: labeled data are scarce due to delayed verification processes, and fraudulent behaviors evolve rapidly, often manifesting in complex, graph-structured interactions. Existing methods struggle in such settings. Pretraining routines typically overlook structural anomalies under limited supervision, while online models often fail to adapt to changing fraud patterns without labeled updates. To address these issues, we propose the Continual Fiedler Vector Graph model (ConFVG), a fraud detection framework designed for label-scarce and non-stationary environments.

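Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · online learning · semi-supervised · fraud detection

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Enhancing Sparse Event Detection in Healthcare Time Series with an Adaptive Gate for Context-Detail Interaction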

ICLR 2026 Poster accepted paper at ICLR 2026. Accurate detection of clinically meaningful events in healthcare time-series data is crucial for reliable downstream analysis and decision support. However, most existing methods struggle to jointly localize event boundaries and classify event types; even detection transformer (DETR)-based approaches show limited performance when confronted with extremely sparse events typical of clinical recordings. To address these challenges, we propose a coarse-to-fine detection framework combining a global context explorer, a local detail inspector, and an adaptive gating module (AGM) that fuses multiple label perspectives. The AGM uses transformed labels—encoding event presence and temporal position—to improve learning on sparse events.

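Medical image computing · EHR and clinical prediction · Trustworthiness, safety, fairness, and privacy · Paper · Event detection · Time series analysis

Paper · ICLR 2026 Poster · 2026 · clinical prediction

DM4CT: A Benchmark of Diffusion Models for Computed Tomography Reconstruction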

ICLR 2026 Poster accepted paper at ICLR 2026. Diffusion models have recently emerged as powerful priors for solving inverse problems. While Computed Tomography (CT) is theoretically a linear inverse problem, it poses many practical challenges. These include correlated noise, artifact structures, reliance on system geometry, and misaligned value ranges, which make the direct application of diffusion models more difficult than in domains like natural image generation. To systematically evaluate how diffusion models perform in this context and compare them with established reconstruction methods, we introduce DM4CT, a comprehensive benchmark for CT reconstruction. Code/project link: https://github.com/DM4CT/DM4CT

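Medical image computing · EHR and clinical prediction · Paper · benchmark · dataset · inverse problem

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Piecing Together the Mosaic of the Mind: Towards EEG Semantic Intent Decoding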

ICLR 2026 Poster accepted paper at ICLR 2026. Enabling natural communication through brain–computer interfaces (BCIs) remains one of the most profound challenges in neuroscience and neurotechnology. While existing frameworks offer partial solutions, they are constrained by oversimplified semantic representations and a lack of interpretability. To overcome these limitations, we introduce Semantic Intent Decoding (SID), a novel framework that translates neural activity into natural language by modeling meaning as a flexible set of compositional semantic units. SID is built on three core principles: semantic compositionality, continuity and expandability of semantic space, and fidelity in reconstruction.

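Medical image computing · EHR and clinical prediction · Paper · Electroencephalography (EEG) · Brain-computer interface (BCI) · Semantic Intent

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Rethinking Model Calibration in Medical Image Segmentation with Spectral Entropy Regularization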

ICLR 2026 Poster accepted paper at ICLR 2026. Deep neural networks for medical image segmentation often produce overconfident predictions, posing clinical risks due to miscalibrated uncertainty estimates. In this work, we rethink model calibration from a frequency-domain perspective and identify two critical factors causing miscalibration: spectral bias, where models overemphasize low-frequency components, and confidence saturation, which suppresses overall power spectral density in confidence maps. To address these challenges, we propose a novel frequency-aware calibration framework integrating spectral entropy regularization and power spectral smoothing. The spectral entropy term promotes a balanced frequency spectrum and enhances overall spectral power, enabling better modeling of high-frequency boundary and low-frequency structural uncertainty.

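Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · medical image segmentation · model calibration · spectral entropy

Paper · ICLR 2026 Poster · 2026 · clinical prediction

How Do Medical MLLMs Fail? A Study of Visual Grounding in Medical Images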

ICLR 2026 Poster accepted paper at ICLR 2026. Generalist multimodal large language models (MLLMs) have achieved impressive performance across a wide range of vision-language tasks. However, their performance on medical tasks, particularly in zero-shot settings where generalization is critical, remains suboptimal. A key research gap is the limited understanding of why medical MLLMs underperform in medical image interpretation. In this work, we present a pioneering systematic investigation into the visual grounding capabilities of state-of-the-art medical MLLMs. To disentangle visual grounding from semantic grounding, we design VGMED, a novel evaluation dataset developed with expert clinical guidance, explicitly assessing the visual grounding capability of medical MLLMs. Code/project link: https://guimeng-leo-liu.github.io/Medical-MLLMs-Fail/

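Medical image computing · Medical multimodal · Clinical language intelligence · Paper · Medical MLLM · Visual Grounding · View paper details

Paper · ICLR 2026 Poster · 2026 · surgical/interventional AI

WavePolyp: Video Polyp Segmentation via Hierarchical Wavelet-Based Feature Aggregation and Inter-Frame Divergence Perception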

ICLR 2026 Poster accepted paper at ICLR 2026. Automatic polyp segmentation from colonoscopy videos is a crucial technique that assists clinicians in improving the accuracy and efficiency of diagnosis, preventing polyps from developing into cancer. However, video polyp segmentation (VPS) is a challenging task due to (1) the significant inter-frame divergence in videos, (2) the high camouflage of polyps in normal colon structures and (3) the clinical requirement of real-time performance. In this paper, we propose a novel segmentation network, WavePolyp, which consists of two innovative components: a hierarchical wavelet-based feature aggregation (HWFA) module and inter-frame divergence perception (IDP) blocks. Specifically, HWFA excavates and amplifies discriminative information from high-frequency and low-frequency features decomposed by wavelet transform, hierarchically aggregating them into refined spatial representations within each frame. Code/project link: https://github.com/FishballZhang/WavePolyp

Medical image computing · Paper · Video Polyp Segmentation · ICLR 2026 · ICLR 2026 Poster · surgical_intervention · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical NLP

Toward Text-Mask Consistency in Medical Image Segmentation

ICLR 2026 Poster accepted paper at ICLR 2026. Vision-language models for medical image segmentation often produce masks that conflict with the accompanying text, especially under multi-site/multi-lesion descriptions. We trace this failure to two factors: (i) highly templated and repetitive clinical language causes one-to-one hard contrastive learning to yield numerous false negatives, weakening cross-modal alignment; and (ii) predominantly vision-driven, one-way cross-attention lacks a language-dominant, spatially aware pathway, hindering effective injection of textual semantics into the spatial visual domain. To this end, we propose Consistency-enhanced Two-stage Segmentation (C2Seg). In the pretraining stage, Cluster-aware Contrastive Learning uses a frozen strong baseline to construct an intra-batch text similarity matrix as soft labels, thereby alleviating false negative conflicts and producing more discriminative visual representations.

Medical image computing · Medical multimodal · Clinical language intelligence · Paper · Medical image segmentation · Vision language models · View paper details
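
The abstract above describes replacing one-to-one hard contrastive targets with soft labels taken from an intra-batch text similarity matrix. The sketch below shows that general idea in NumPy. It is not the C2Seg implementation: the function names, the temperatures `tau` and `tau_soft`, and the random embeddings standing in for encoder outputs are all assumptions for illustration.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def log_softmax(x):
    """Numerically stable row-wise log-softmax."""
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def soft_contrastive_loss(img_emb, txt_emb, ref_txt_emb, tau=0.07, tau_soft=0.1):
    """Image-to-text cross-entropy against soft targets from a reference text encoder."""
    img, txt, ref = map(l2_normalize, (img_emb, txt_emb, ref_txt_emb))
    logits = img @ txt.T / tau                          # image-to-text similarities
    targets = np.exp(log_softmax(ref @ ref.T / tau_soft))  # soft labels, rows sum to 1
    loss = -(targets * log_softmax(logits)).sum(axis=1).mean()
    return float(loss), targets

# Hypothetical batch of 4 image/report pairs; reports 0 and 1 share identical text,
# as happens with templated clinical language.
rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))
txt = rng.normal(size=(4, 8))
ref = rng.normal(size=(4, 8))
ref[1] = ref[0]
loss, targets = soft_contrastive_loss(img, txt, ref)
```

With hard one-hot targets, report 1 would be treated as a negative for image 0 even though its text is identical. Here both reports receive the same target weight, which is how soft labels of this kind reduce false negatives.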

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Tokenizing Single-Channel EEG with Time-Frequency Motif Learning

ICLR 2026 Poster accepted paper at ICLR 2026. Foundation models are reshaping EEG analysis, yet EEG tokenization remains an open challenge. This paper presents TFM-Tokenizer, a novel tokenization framework that learns a vocabulary of time-frequency motifs from single-channel EEG signals and encodes them into discrete tokens. We propose a dual-path architecture with time-frequency masking to capture robust motif representations; the tokenizer is model-agnostic, supporting both lightweight transformers and existing foundation models for downstream tasks. Our study demonstrates three key benefits: Accuracy: Experiments on four diverse EEG benchmarks demonstrate consistent performance gains across both single- and multi-dataset pretraining settings, achieving up to 11% improvement in Cohen's Kappa over strong baselines. Code/project link: https://github.com/Jathurshan0330/TFM-Tokenizer

Medical image computing · EHR and clinical prediction · Trustworthiness, safety, fairness, and privacy · Paper · EEG Tokenization · View paper details
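
The abstract above reports gains in Cohen's Kappa. For readers unfamiliar with the metric, it is kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The sketch below is a plain-Python illustration of that textbook formula; the function name and toy labels are assumptions and are unrelated to the paper's data.

```python
from collections import Counter

def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = len(y_true)
    # Observed agreement: fraction of positions where the labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed over classes.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    # Undefined (division by zero) when p_e == 1, i.e. both labelings use one class only.
    return (p_o - p_e) / (1 - p_e)

# Toy example (hypothetical labels): 3 of 4 predictions agree.
y_true = [0, 0, 1, 1]
y_pred = [0, 0, 1, 0]
kappa = cohens_kappa(y_true, y_pred)   # p_o = 0.75, p_e = 0.5, so kappa = 0.5
```

Unlike raw accuracy, kappa discounts agreement that would occur by chance, which matters for the imbalanced label distributions common in EEG benchmarks.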

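Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Random Anchors with Low-Rank Decorrelated Learning: A Minimalist Pipeline for Class-Incremental Medical Image Classification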

ICLR 2026 Poster accepted paper at ICLR 2026. Class-incremental learning (CIL) in medical image-guided diagnosis requires models to preserve knowledge of historical disease classes while adapting to emerging categories. Pre-trained models (PTMs) with well-generalized features provide a strong foundation, yet most PTM-based CIL strategies, such as prompt tuning, task-specific adapters and model mixtures, rely on increasingly complex designs. While effective in general-domain benchmarks, these methods falter in medical imaging, where low intra-class variability and high inter-domain shifts (from scanners, protocols and institutions) make CIL particularly prone to representation collapse and domain misalignment. Under such conditions, we find that lightweight representation calibration strategies, often dismissed in general-domain CIL for their modest gains, can be remarkably effective for adapting PTMs in medical settings.

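Medical image computing · EHR and clinical prediction · Trustworthiness, safety, fairness, and privacy · Paper · Medical Image Classification · Feature Calibration · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

PathChat-SegR1: Reasoning Segmentation in Pathology via SO-GRPO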

ICLR 2026 Poster accepted paper at ICLR 2026. Segmentation of pathology images requires handling out-of-domain tissue morphologies and new pathologies beyond training distributions, where traditional closed-set segmentation approaches fail to generalize. Reasoning segmentation enables zero-shot generalization via prompting with text queries. However, existing reasoning segmentation models face three barriers when applied to pathology: (1) the vision encoder lacks pathology-specific knowledge and robustness to staining variations, (2) the large language model (LLM) backbone for reasoning fails to identify whether it has gathered sufficient semantic context to trigger the segmentation output, and (3) no reasoning segmentation benchmarks and datasets exist for pathology analysis. Consequently, we introduce PathChat-SegR1, a reasoning segmentation model built upon pathology-specific vision encoders trained with a novel stain-invariant self-distillation for robust pathology image representations.

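Medical image computing · Clinical language intelligence · Paper · Clinical Reasoning · Reinforcement Learning · Reasoning Segmentation · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

A Hypothesis-Driven Language Agent for Clinical Decision-Making Based on Reinforcement Learning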

ICLR 2026 Poster accepted paper at ICLR 2026. Clinical decision-making is a dynamic, interactive, and cyclic process where doctors have to repeatedly decide on which clinical action to perform and consider newly uncovered information for diagnosis and treatment. Large Language Models (LLMs) have the potential to support clinicians in this process; however, most applications of LLMs in clinical decision support suffer from one of two limitations: either they assume the unrealistic scenario of immediate availability of all patient information and do not model the interactive and iterative investigation process, or they restrict themselves to the limited "out-of-the-box" capabilities of large pre-trained models without performing task-specific training. In contrast, we propose to model clinical decision-making for diagnosis with a hypothesis-driven, uncertainty-aware language agent, LA-CDM, that converges towards a diagnosis by repeatedly requesting and interpreting relevant tests. Using a hybrid training paradigm combining supervised and reinforcement learning, we train LA-CDM with three objectives targeting critical aspects of clinical decision-making: accurate hypothesis generation, hypothesis uncertainty estimation, and efficient decision-making. Code/project link: https://github.com/dharouni/LA-CDM

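Medical image computing · Clinical language intelligence · Trustworthiness, safety, fairness, and privacy · Paper · Clinical Decision Making · Large Language Models · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Joint Adaptation of Uni-Modal Foundation Models for Multi-Modal Alzheimer's Disease Diagnosis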

ICLR 2026 Poster accepted paper at ICLR 2026. Alzheimer’s Disease (AD) is a progressive neurodegenerative disorder and a leading cause of dementia worldwide. Accurate diagnosis requires integrating diverse patient data modalities. With the rapid advancement of foundation models in neurobiology and medicine, integrating foundation models from various modalities has emerged as a promising yet underexplored direction for multi-modal AD diagnosis. A central challenge is enabling effective interaction among these models without disrupting the robust, modality-specific representations learned from large-scale pretraining. To address this, we propose a novel multi-modal framework for AD diagnosis that enables joint interaction among uni-modal foundation models through modality-anchored interaction.

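Medical image computing · Medical multimodal · Trustworthiness, safety, fairness, and privacy · Paper · Artificial Intelligence for sciences · Alzheimer's disease · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

GARLIC: Graph Attention-Based Relational Learning for ICU Multivariate Time Series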

ICLR 2026 Poster accepted paper at ICLR 2026. Healthcare data, such as Intensive Care Unit (ICU) records, comprise heterogeneous multivariate time series sampled at irregular intervals with pervasive missingness. However, clinical applications demand predictive models that are both accurate and interpretable. We present our Graph Attention-based Relational Learning for Intensive Care (GARLIC) model, a novel neural network architecture that imputes missing data through a learnable exponential-decay encoder, captures inter-sensor dependencies via time-lagged summary graphs, and fuses global patterns with cross-dimensional sequential attention. All attention weights and graph edges are learned end-to-end to serve as built-in observation-, signal-, and edge-level explanations.

Medical image computing · EHR and clinical prediction · Trustworthiness, safety, fairness, and privacy · Paper · irregular multivariate time series · graph neural network · View paper details
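
The abstract above mentions imputing missing values with a learnable exponential-decay encoder but gives no details. The sketch below illustrates one common form of decay-based imputation, in the style of GRU-D: a missing value is filled with the last observation, decayed toward a reference mean as the time gap grows. This is an assumption about the general idea, not GARLIC's encoder. The decay parameters `w` and `b` are fixed here, whereas a learnable encoder would fit them, and the heart-rate values are made up.

```python
import numpy as np

def decay_impute(values, times, mean, w=0.5, b=0.0):
    """Fill NaNs with the last observation decayed toward `mean` over elapsed time."""
    values = np.asarray(values, dtype=float)
    out = np.empty_like(values)
    last_val, last_t = None, None
    for i, (x, t) in enumerate(zip(values, times)):
        if not np.isnan(x):
            out[i] = x                       # observed: keep as-is
            last_val, last_t = x, t
        elif last_t is None:
            out[i] = mean                    # nothing observed yet: fall back to the mean
        else:
            delta = t - last_t               # time since the last observation
            gamma = np.exp(-max(0.0, w * delta + b))   # decay weight in (0, 1]
            out[i] = gamma * last_val + (1.0 - gamma) * mean
    return out

# Hypothetical irregularly sampled heart-rate series (hours, beats per minute).
times = [0.0, 1.0, 10.0, 11.0, 11.5]
values = [80.0, np.nan, np.nan, 90.0, np.nan]
filled = decay_impute(values, times, mean=70.0)
```

Shortly after an observation the imputed value stays close to it, and after a long gap it reverts to the mean, which suits irregular sampling better than plain forward-filling.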

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Critic-Adviser-Reviser Cyclic Refinement: Toward High-Quality EMR Corpus Generation

ICLR 2026 Poster accepted paper at ICLR 2026. Electronic medical records (EMRs) are vital for healthcare research, but their use is limited by privacy concerns. Synthetic EMR generation offers a promising alternative, yet most existing methods merely imitate real records without adhering to rigorous clinical quality principles. To address this, we introduce LLM-CARe, a stage-wise cyclic refinement framework that progressively improves EMR quality through three stages, each targeting a specific granularity: corpus, section and document. At each stage, a Critic, an Adviser, and a Reviser collaborate iteratively to evaluate, provide feedback, and refine the drafts.

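Medical image computing · Clinical language intelligence · EHR and clinical prediction · Paper · Large Language Model · Synthetic Data Generation · View paper details

Paper · ICLR 2026 Poster · 2026 · surgical/interventional AI

Controllable Sequence Editing for Biological and Clinical Trajectories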

ICLR 2026 Poster accepted paper at ICLR 2026. Conditional generation models for longitudinal sequences can produce new or modified trajectories given a conditioning input. However, they often lack control over when the condition should take effect (timing) and which variables it should influence (scope). Most methods either operate only on univariate sequences or assume that the condition alters all variables and time steps. In scientific and clinical settings, interventions instead begin at a specific moment, such as the time of drug administration or surgery, and influence only a subset of measurements while the rest of the trajectory remains unchanged.

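Medical image computing · EHR and clinical prediction · Paper · conditional generation · sequence editing · time series forecasting · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical NLP

Enhancing Medical Visual Understanding through Multi-Granular Language Learning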

ICLR 2026 Poster accepted paper at ICLR 2026. Recent advances in image-text pretraining have significantly enhanced visual understanding by aligning visual and textual representations. Contrastive Language-Image Pretraining (CLIP) has played a pivotal role in multimodal learning. However, its focus on single-label, single-granularity alignment limits its effectiveness in complex domains such as medical imaging, where images often correspond to multiple labels across different levels of granularity. To address this, we propose Multi-Granular Language Learning (MGLL), a contrastive learning framework designed to improve both multi-label and cross-granularity alignment. Code/project link: https://github.com/HUANGLIZI/MGLL

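Medical image computing · Medical multimodal · Clinical language intelligence · Paper · Multi-Granular Language Learning · Medical Image Analysis · View paper details

Paper · ICLR 2026 Oral · 2026 · clinical prediction

BioX-Bridge: Model Bridging for Unsupervised Cross-Modal Knowledge Transfer across Biosignals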

ICLR 2026 Oral accepted paper at ICLR 2026. Biosignals offer valuable insights into the physiological states of the human body. Although biosignal modalities differ in functionality, signal fidelity, sensor comfort, and cost, they are often intercorrelated, reflecting the holistic and interconnected nature of human physiology. This opens up the possibility of performing the same tasks using alternative biosignal modalities, thereby improving the accessibility, usability, and adaptability of health monitoring systems. However, the limited availability of large labeled datasets presents challenges for training models tailored to specific tasks and modalities of interest.

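Medical image computing · EHR and clinical prediction · Paper · biosignal · ai for healthcare · humans and ai · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Repurposing Foundation Models for Generalizable Medical Time Series Classification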

ICLR 2026 Poster accepted paper at ICLR 2026. Medical time series (MedTS) classification suffers from poor generalizability in real-world deployment due to inter- and intra-dataset heterogeneity, such as varying numbers of channels, signal lengths, task definitions, and patient characteristics. To address this, we propose FORMED, a novel framework for repurposing a backbone foundation model, pre-trained on generic time series, to enable highly generalizable MedTS classification on unseen datasets. FORMED combines the backbone with a novel classifier comprising two components: (1) task-specific channel embeddings and label queries, dynamically sized to match any number of channels and target classes, and (2) a shared decoding attention layer, jointly trained across datasets to capture medical domain knowledge through task-agnostic feature-query interactions.

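Medical image computing · EHR and clinical prediction · Paper · Medical Time Series Classification · Time Series Foundation Model · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Cross-Timestep: A 3D Diffusion Model for Medical Segmentation with Trans-Temporal Memory LSTM and Adaptive Priori Decoding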

ICLR 2026 Poster accepted paper at ICLR 2026. Diffusion models have recently demonstrated significant robustness in medical image segmentation, effectively accommodating variations across different imaging styles. However, their applications remain limited due to: (i) current successes being primarily confined to 2D segmentation tasks (we observe that diffusion models tend to collapse at the early stage when applied to 3D medical tasks); and (ii) the inherently isolated iteration along timesteps during training and inference. To tackle these limitations, we propose a novel framework named Cross-Timestep, which incorporates two key innovations: an Adaptive Priori Decoding Strategy (APDS) and a trans-temporal memory LSTM (tLSTM) mechanism. (i) The APDS provides prior guidance during the diffusion process by employing a Priori Decoder (PD) that focuses solely on the conditional branch, successfully stabilizing the reverse diffusion process.

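Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Diffusion Models · Medical Image Segmentation · LSTM · View paper details

Paper · ICLR 2026 Oral · 2026 · clinical prediction

CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmark of Large Language Models in Mental Health Question Answering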

ICLR 2026 Oral accepted paper at ICLR 2026. Medical question answering (QA) benchmarks often focus on multiple-choice or fact-based tasks, leaving open-ended answers to real patient questions underexplored. This gap is particularly critical in mental health, where patient questions often mix symptoms, treatment concerns, and emotional needs, requiring answers that balance clinical caution with contextual sensitivity. We present CounselBench, a large-scale benchmark developed with 100 mental health professionals to evaluate and stress-test large language models (LLMs) in realistic help-seeking scenarios. The first component, CounselBench-EVAL, contains 2,000 expert evaluations of answers from GPT-4, LLaMA 3, Gemini, and online human therapists on patient questions from the public forum CounselChat.

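Medical image computing · Clinical language intelligence · EHR and clinical prediction · Paper · large language models · mental health · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

CerebraGloss: Instruction-Tuning a Large Vision-Language Model for Fine-Grained Clinical EEG Interpretation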

ICLR 2026 Poster accepted paper at ICLR 2026. Interpreting clinical electroencephalography (EEG) is a laborious, subjective process, and existing computational models are limited to narrow classification tasks rather than holistic interpretation. A key bottleneck for applying powerful Large Vision-Language Models (LVLMs) to this domain is the scarcity of datasets pairing EEG visualizations with fine-grained, expert-level annotations. We address this by introducing CerebraGloss, an instruction-tuned LVLM for nuanced EEG interpretation. We first introduce a novel, automated data generation pipeline, featuring a bespoke YOLO-based waveform detector, to programmatically create a large-scale corpus of EEG-text instruction data. Code/project link: https://github.com/iewug/CerebraGloss

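Medical image computing · Medical multimodal · Clinical language intelligence · Paper · large vision-language model · instruction-tuning · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

CRONOS: Continuous-Time Reconstruction for 4D Medical Longitudinal Series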

ICLR 2026 Poster accepted paper at ICLR 2026. Forecasting how 3D medical scans evolve along time is important for disease progression, treatment planning, and developmental assessment. Yet existing models either rely on a single prior scan, fixed grid times, or target global labels, which limits voxel-level forecasting under irregular sampling. We present CRONOS, a unified framework for many-to-one prediction from multiple past scans that supports both discrete (grid-based) and continuous (real-valued) timestamps in one model, to the best of our knowledge the first to achieve continuous sequence-to-image forecasting for 3D medical data. CRONOS learns a spatio-temporal velocity field that transports context volumes toward a target volume at an arbitrary time, while operating directly in 3D voxel space.

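Medical image computing · EHR and clinical prediction · Paper · Medical Imaging · Flow Matching · Longitudinal · Spatio-Temporal · View paper details

Paper · ICLR 2026 Poster · 2026 · medical LLM agent

AnesSuite: A Comprehensive Benchmark and Dataset Suite for Anesthesiology Reasoning in LLMs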

ICLR 2026 Poster accepted paper at ICLR 2026. The application of large language models (LLMs) in the medical field has garnered significant attention, yet their reasoning capabilities in more specialized domains like anesthesiology remain underexplored. To bridge this gap, we introduce AnesSuite, the first comprehensive dataset suite specifically designed for anesthesiology reasoning in LLMs. The suite features AnesBench, an evaluation benchmark tailored to assess anesthesiology-related reasoning across three levels: factual retrieval (System 1), hybrid reasoning (System 1.x), and complex decision-making (System 2). Alongside this benchmark, the suite includes three training datasets that provide an infrastructure for continued pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning with verifiable rewards (RLVR). Code/project link: https://github.com/MiliLab/AnesSuite

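Medical image computing · Clinical language intelligence · Paper · Large language model · Reasoning · Anesthesiology · View paper details

Paper · ICLR 2026 Poster · 2026 · medical imaging

MedGMAE: Gaussian Masked Autoencoders for Medical Volumetric Representation Learning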

ICLR 2026 Poster accepted paper at ICLR 2026. Self-supervised pre-training has emerged as a critical paradigm for learning transferable representations from unlabeled medical volumetric data. Masked-autoencoder-based methods have garnered significant attention, yet their application to volumetric medical images faces fundamental limitations from the discrete voxel-level reconstruction objective, which neglects comprehensive anatomical structure continuity. To address this challenge, we propose MedGMAE, a novel framework that replaces traditional voxel reconstruction with reconstruction of 3D Gaussian primitives, offering a new perspective on representation learning. Our approach learns to predict complete sets of 3D Gaussian parameters as semantic abstractions to represent the entire 3D volume from sparse visible image patches. Code/project link: https://github.com/windrise/MedGMAE; https://anonymous.4open.science/r/MedGMAE-EC8F/

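Medical image computing · Paper · 3D Gaussian Representation · Medical Imaging analysis · Volumetric Representation Learning · ICLR 2026 · View paper details

Paper · ICLR 2026 Poster · 2026 · surgical/interventional AI

HFSTI-Net: Hierarchical Frequency-Spatial-Temporal Interaction for Video Polyp Segmentation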

ICLR 2026 Poster accepted paper at ICLR 2026. Automatic video polyp segmentation (VPS) is crucial for preventing and treating colorectal cancer by ensuring accurate identification of polyps in colonoscopy examinations. However, its clinical application is hampered by two key challenges: shape collapse, which compromises structural integrity, and episodic amnesia, which causes instability in challenging video sequences. To address these challenges, we present a novel video segmentation network, HFSTI-Net, which integrates global perception with spatiotemporal consistency in spatial, temporal, and frequency domains. Specifically, to address shape collapse under low contrast or visual ambiguity, we design a Hierarchical Frequency-spatial Interaction (HFSI) module that fuses spatial and frequency cues for fine-grained boundary localization. Code/project link: https://github.com/Yuanqin-He/HFSTI-Net

Medical image computing · Paper · Frequency Learning · Video Segmentation · Medical Segmentation · Video Polyp Segmentation · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Glance-and-Focus Reinforcement for Pan-Cancer Screening

ICLR 2026 Poster accepted paper at ICLR 2026. Pan-cancer screening in large-scale CT scans remains challenging for existing AI methods, primarily due to the difficulty of localizing diverse types of tiny lesions in large CT volumes. The extreme foreground-background imbalance significantly hinders models from focusing on diseased regions, while redundant focus on healthy regions not only decreases the efficiency but also increases false positives. Inspired by radiologists' glance and focus diagnostic strategy, we introduce GF-Screen, a Glance and Focus reinforcement learning framework for pan-cancer screening. GF-Screen employs a Glance model to localize the diseased regions and a Focus model to precisely segment the lesions, where segmentation results of the Focus model are leveraged to reward the Glance model via Reinforcement Learning (RL). Code/project link: https://github.com/Luffy03/GF-Screen

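Medical image computing · EHR and clinical prediction · Paper · Pan-cancer screening · AI for healthcare · ICLR 2026 · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Beyond Aggregation: Guiding Clients in Heterogeneous Federated Learning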

ICLR 2026 Poster accepted paper at ICLR 2026. Federated learning (FL) is increasingly adopted in domains like healthcare, where data privacy is paramount. A fundamental challenge in these systems is statistical heterogeneity—the fact that data distributions vary significantly across clients (e.g., different hospitals may treat distinct patient demographics). While current FL algorithms focus on aggregating model updates from these heterogeneous clients, the potential of the central server remains under-explored. This paper is motivated by a healthcare scenario: could a central server not only coordinate model training but also guide a new patient to the hospital best equipped for their specific condition?

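Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · client allocation · density ratio model · empirical likelihood · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Benchmarking ECG Foundation Models: A Reality Check across Clinical Tasks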

ICLR 2026 Poster accepted paper at ICLR 2026. The 12-lead electrocardiogram (ECG) is a long-standing diagnostic tool. Yet machine learning for ECG interpretation remains fragmented, often limited to narrow tasks or datasets. Foundation models (FMs) promise broader adaptability, but fundamental questions remain: Which architectures generalize best? How do models scale with limited labels? What explains performance differences across model families? We benchmarked eight ECG FMs on 26 clinically relevant tasks using 12 public datasets comprising 1,650 regression and classification targets. Models were evaluated under fine-tuning and frozen settings, with scaling analyses across dataset sizes.

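Medical image computing · EHR and clinical prediction · Trustworthiness, safety, fairness, and privacy · Paper · ECG · ECG foundation model · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical NLP

A Structured, Tagged, and Localized VQA Dataset for Chest X-Ray Images with Full-Sentence Answers and Scene Graphs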

ICLR 2026 Poster accepted paper at ICLR 2026. Visual Question Answering (VQA) enables targeted and context-dependent analysis of medical images, such as chest X-rays (CXRs). However, existing VQA datasets for CXRs are typically constrained by simplistic and brief answer formats, lacking localization annotations (e.g., bounding boxes) and structured tags (e.g., region or radiological finding/disease tags). To address these limitations, we introduce MIMIC-Ext-CXR-QBA (abbr. CXR-QBA), a large-scale CXR VQA dataset derived from MIMIC-CXR, comprising 42 million QA-pairs with multi-granular, multi-part answers, detailed bounding boxes, and structured tags. Code/project link: https://github.com/philip-mueller/mimic-ext-cxr-qba/

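Medical image computing · Medical multimodal · Clinical language intelligence · Paper · VQA · Localization · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical NLP

Rethinking Radiology Report Generation: From Narrative Flow to Topic-Guided Findings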

ICLR 2026 Poster accepted paper at ICLR 2026. Vision-Language Models (VLMs) for radiology report generation are typically trained to mimic the narrative flow of human experts. However, we identify a potential limitation in this conventional paradigm. We hypothesize that optimizing for narrative coherence encourages models to rely on linguistic priors and inter-sentence correlations, which can weaken their grounding in direct visual evidence and lead to factual inaccuracies. To investigate this, we design a controlled experiment demonstrating that as textual context increases, a model's reliance on the input image systematically decays. We propose LLaVA-TA (Topic-guided and Anatomy-aware), a new fine-tuning framework that directly addresses this challenge by re-engineering the generation process.

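Medical image computing · Medical multimodal · Clinical language intelligence · Paper · Radiology report generation · large-language models · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

M3CoTBench: A Benchmark for Chain-of-Thought of MLLMs in Medical Image Understanding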

ICLR 2026 Poster accepted paper at ICLR 2026. Chain-of-Thought (CoT) reasoning has proven effective in enhancing large language models by encouraging step-by-step intermediate reasoning, and recent advances have extended this paradigm to Multimodal Large Language Models (MLLMs). In the medical domain, where diagnostic decisions depend on nuanced visual cues and sequential reasoning, CoT aligns naturally with clinical thinking processes. However, current benchmarks for medical image understanding generally focus on the final answer while ignoring the reasoning path. An opaque process lacks reliable bases for judgment, making it difficult to assist doctors in diagnosis.

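Medical image computing · Medical multimodal · Clinical language intelligence · Paper · Chain-of-Thought · Multimodal Large Language Models · View paper details

Paper · ICLR 2026 Poster · 2026 · medical LLM agent

K-Prism: A Knowledge-Guided and Prompt-Integrated Universal Medical Image Segmentation Model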

ICLR 2026 Poster accepted paper at ICLR 2026. Medical image segmentation is fundamental to clinical decision-making, yet existing models remain fragmented. They are usually trained on single knowledge sources and specific to individual tasks, modalities, or organs. This fragmentation contrasts sharply with clinical practice, where experts seamlessly integrate diverse knowledge: anatomical priors from training, exemplar-based reasoning from reference cases, and iterative refinement through real-time interaction. We present K-Prism, a unified segmentation framework that mirrors this clinical flexibility by systematically integrating three knowledge paradigms: (i) semantic priors learned from annotated datasets, (ii) in-context knowledge from few-shot reference examples, and (iii) interactive feedback from user inputs like clicks or scribbles. Code/project link: https://github.com/bangwayne/K-Prism

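Medical image computing · Paper · Medical Image · Image Segmentation · Universal Model · Prompt Integration · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Pathology-Genomics Multi-Modal Structural Representation Learning for Data-Efficient Precision Oncology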

ICLR 2026 Poster accepted paper at ICLR 2026. Fusing histopathology images and genomics data with deep learning has significantly advanced precision oncology. However, genomics data is often missing in real-world clinical scenarios due to its high acquisition cost and complexity. Existing solutions aim to reconstruct genomics data from histopathology images. Nevertheless, these methods typically rely only on individual cases and overlook the potential relationships among cases. Additionally, they fail to take advantage of the authentic genomics data of diagnostically related cases, which are accessible from the training set at inference time. In this work, we propose a novel Multi-modal Structural Representation Learning (MSRL) framework for data-efficient precision oncology. Code/project link: https://github.com/WkEEn/MSRL

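Medical image computing · Medical multimodal · EHR and clinical prediction · Paper · multi-modal learning · histopathology image · representation learning · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Pixel-Level Residual Diffusion Transformer: Scalable 3D CT Volume Generation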

ICLR 2026 Poster accepted paper at ICLR 2026. Generating high-resolution 3D CT volumes with fine details remains challenging due to substantial computational demands and optimization difficulties inherent to existing generative models. In this paper, we propose the Pixel-Level Residual Diffusion Transformer (PRDiT), a scalable generative framework that synthesizes high-quality 3D medical volumes directly at voxel-level. PRDiT introduces a two-stage training architecture comprising 1) a local denoiser in the form of an MLP-based blind estimator operating on overlapping 3D patches to separate low-frequency structures efficiently, and 2) a global residual diffusion transformer employing memory-efficient attention to model and refine high-frequency residuals across entire volumes. This coarse-to-fine modeling strategy simplifies optimization, enhances training stability, and effectively preserves subtle structures without the limitations of an autoencoder bottleneck.

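Medical image computing · EHR and clinical prediction · Paper · Medical Imaging · 3D Diffusion Model · Diffusion Transformer · View paper details

Paper · ICLR 2026 Poster · 2026 · medical imaging

Modeling the Density of Pixel-Level Self-Supervised Embeddings for Unsupervised Pathology Segmentation in Medical CT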

ICLR 2026 Poster accepted paper at ICLR 2026. Accurate detection of all pathological findings in 3D medical images remains a significant challenge, as supervised models are limited to detecting only the few pathology classes annotated in existing datasets. To address this, we frame pathology detection as an unsupervised visual anomaly segmentation (UVAS) problem, leveraging the inherent rarity of pathological patterns compared to healthy ones. We enhance the existing density-based UVAS framework with two key innovations: (1) dense self-supervised learning for feature extraction, eliminating the need for supervised pretraining, and (2) learned, masking-invariant dense features as conditioning variables, replacing hand-crafted positional encodings. Trained on over 30,000 unlabeled 3D CT volumes, our fully self-supervised model, Screener, outperforms existing UVAS methods on four large-scale test datasets comprising 1,820 scans with diverse pathologies. Code/project link: https://github.com/mishgon/screener; https://anonymous.4open.science/r/screener-35EE/

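Medical image computing · Paper · Unsupervised Visual Anomaly Segmentation · Self-supervised learning · Density estimation · Computed Tomography · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context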

ICLR 2026 Poster accepted paper at ICLR 2026. While LLMs have demonstrated medical knowledge and conversational ability, their deployment in clinical practice raises new risks: patients may place greater trust in LLM-generated responses than in nurses' professional judgments, potentially intensifying nurse–patient conflicts. Such risks highlight the urgent need to evaluate whether LLMs align with the core nursing values upheld by human nurses. This work introduces the first benchmark for nursing value alignment, consisting of five core value dimensions distilled from international nursing codes: Altruism, Human Dignity, Integrity, Justice, and Professionalism. We define two-level tasks on the benchmark, considering the two characteristics of emerging nurse–patient conflicts.

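Medical image computing · Clinical language intelligence · EHR and clinical prediction · Paper · Large language models · value alignment · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

FETAL-GAUGE: A Benchmark for Assessing Vision-Language Models in Fetal Ultrasound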

ICLR 2026 Poster accepted paper at ICLR 2026. The growing demand for prenatal ultrasound imaging has intensified a global shortage of trained sonographers, creating barriers to essential fetal health monitoring. Deep learning has the potential to enhance sonographers' efficiency and support the training of new practitioners. Vision-Language Models (VLMs) are particularly promising for ultrasound interpretation, as they can jointly process images and text to perform multiple clinical tasks within a single framework. However, despite the expansion of VLMs, no standardized benchmark exists to evaluate their performance in fetal ultrasound imaging. Code/project link: https://github.com/BioMedIA-MBZUAI/FETAL-GAUGE

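Medical image computing | Medical multimodal | Clinical language intelligence | Paper | Vision-Language Models | Fetal Ultrasound | View paper details

Paper | ICLR 2026 Poster | 2026 | clinical NLP

Medical Thinking with Multiple Images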

ICLR 2026 Poster accepted paper at ICLR 2026. Large language models perform well on many medical QA benchmarks, but real clinical reasoning is harder because diagnosis often requires integrating evidence across multiple images rather than interpreting a single view. We introduce MedThinkVQA, an expert-annotated benchmark for thinking with multiple images, in which models must interpret each image, combine cross-view evidence, and solve diagnostic questions under intermediate supervision and step-level evaluation. The dataset contains 10,067 cases, including 720 test cases, with an average of 6.68 images per case, substantially denser than prior work (earlier maxima ≤ 1.43). On the test set, the best closed-source models, Claude-4.6-opus, Gemini-3-pro, and GPT-5.2-xhigh, achieve only 54.9%–57.2% accuracy, while smaller proprietary variants, GPT-5-mini/nano, drop to 39.7% and 30.8%.

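Medical image computing | Medical multimodal | Clinical language intelligence | Paper | Multimodal diagnostic reasoning | Vision language models (VLMs) | View paper details

Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI

AttTok: Combining Attribute Tokens with Generative Pre-trained Vision-Language Models for Medical Image Understanding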

ICLR 2026 Poster accepted paper at ICLR 2026. Recent generative pre-trained vision–language (GPTv) models have achieved remarkable success in multi-modal understanding, inspiring their adaptation to medical imaging tasks such as disease diagnosis and visual question answering (VQA). However, current instruction-tuned GPTv models suffer from two key challenges: (1) medical attributes (e.g., disease names, severity grades) are encoded as plain text tokens, collapsing semantically distinct concepts into nearly identical textual sequences; and (2) inadequate textual supervision weakens visual representation learning, leading to severe inter-attribute confusion and misaligned vision–language embeddings. To address these limitations, we introduce attribute tokens (AttTok), a set of pre‑defined special tokens that uniquely encode clinical attributes (e.g., imaging modality, diagnosis, severity) within a structured token space. Complemented by attribute‑centric embedding books, AttTok serves as anchor points for aligning both visual and textual modalities into a shared, discriminative representation space.

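Medical image computing | Medical multimodal | Clinical language intelligence | Paper | Medical generative pre-trained models | medical Multi-Modal alignment | View paper details

Paper | ICLR 2026 Oral | 2026 | clinical prediction

Decentralized Attention Fails Centralized Signals: Rethinking Transformers for Medical Time Series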

ICLR 2026 Oral accepted paper at ICLR 2026. Accurate analysis of Medical time series (MedTS) data, such as Electroencephalography (EEG) and Electrocardiography (ECG), plays a pivotal role in healthcare applications, including the diagnosis of brain and heart diseases. MedTS data typically exhibits two critical patterns: **temporal dependencies** within individual channels and **channel dependencies** across multiple channels. While recent advances in deep learning have leveraged Transformer-based models to effectively capture temporal dependencies, they often struggle to model channel dependencies. This limitation stems from a structural mismatch: ***MedTS signals are inherently centralized, whereas the Transformer's attention is decentralized***, making it less effective at capturing global synchronization and unified waveform patterns. Code/project link: https://github.com/Levi-Ackman/TeCh

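Medical image computing | EHR and clinical prediction | Paper | EEG | ECG | Deep learning | View paper details

Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI

Cancer-Myth: Evaluating Large Language Models on Patient Questions with False Presuppositions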

ICLR 2026 Poster accepted paper at ICLR 2026. Cancer patients are increasingly turning to large language models (LLMs) for medical information, making it critical to assess how well these models handle complex, personalized questions. However, current medical benchmarks focus on medical exams or consumer-searched questions and do not evaluate LLMs on real patient questions with patient details. In this paper, we first have three hematology-oncology physicians evaluate cancer-related questions drawn from real patients. While LLM responses are generally accurate, the models frequently fail to recognize or address false presuppositions in the questions, posing risks to safe medical decision-making.

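Medical image computing | Clinical language intelligence | Trust, safety, fairness, and privacy | Paper | Medical benchmark | LLM evaluation | View paper details

Paper | ICLR 2026 Poster | 2026 | clinical prediction

Can LLMs Be Used to Generate Transferable Representations for Clinical Time Series Data?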

ICLR 2026 Poster accepted paper at ICLR 2026. Recent advances in vision-language models (VLMs) have achieved remarkable performance on standard medical benchmarks, yet their true clinical reasoning ability remains unclear. Existing datasets predominantly emphasize classification accuracy, creating an evaluation illusion in which models appear proficient while still failing at high-stakes diagnostic reasoning. We introduce Neural-MedBench, a compact yet reasoning-intensive benchmark specifically designed to probe the limits of multimodal clinical reasoning in neurology. Neural-MedBench integrates multi-sequence MRI scans, structured electronic health records, and clinical notes, and encompasses three core task families: differential diagnosis, lesion recognition, and rationale generation. Code/project link: https://neuromedbench.github.io/

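Medical image computing | Medical multimodal | Clinical language intelligence | Paper | vision-language models | benchmark dataset | View paper details

Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI

Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks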

ICLR 2026 Poster accepted paper at ICLR 2026. Epilepsy affects over 50 million people worldwide, and one-third of patients suffer drug-resistant seizures where surgery offers the best chance of seizure freedom. Accurate localization of the epileptogenic zone (EZ) relies on intracranial EEG (iEEG). Clinical workflows, however, remain constrained by labor-intensive manual review. At the same time, existing data-driven approaches are typically developed on single-center datasets that are inconsistent in format and metadata, lack standardized benchmarks, and rarely release pathological event annotations, creating barriers to reproducibility, cross-center validation, and clinical relevance. Code/project link: https://omni-ieeg.github.io/omni-ieeg/; https://github.com/Omni-iEEG/Omni-iEEG

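Medical image computing | EHR and clinical prediction | Trust, safety, fairness, and privacy | Paper | Computational neuroscience | iEEG | View paper details

Paper | ICLR 2026 Poster | 2026 | medical imaging

Anatomy-Aware Representation Learning for Medical Ultrasound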

ICLR 2026 Poster accepted paper at ICLR 2026. Diagnostic accuracy of ultrasound imaging is limited by qualitative variability and its reliance on the expertise of medical professionals. Such challenges increase demand for computer-aided diagnostic systems that enhance diagnostic accuracy and efficiency. However, the unique texture and structural attributes of ultrasound images, and the scarcity of large-scale ultrasound datasets hinder the effective application of conventional machine learning methodologies. To address the challenges, we propose Anatomy-aware Representation Learning (ARL), a novel self-supervised representation learning framework specifically designed for medical ultrasound imaging.

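Medical image computing | Paper | Foundation model | medical ultrasound | representation learning | ICLR 2026 | View paper details

Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI

Beyond Medical Exams: A Clinician-Annotated Fairness Dataset of Real-World Tasks and Ambiguity in Mental Healthcare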

ICLR 2026 Poster accepted paper at ICLR 2026. Current medical language model (LM) benchmarks often over-simplify the complexities of day-to-day clinical practice tasks and instead rely on evaluating LMs on multiple-choice board exam questions. In psychiatry especially, these challenges are worsened by fairness and bias issues, since models can be swayed by patient demographics even when those factors should not influence clinical decisions. Thus, we present an expert-created and annotated dataset spanning five critical domains of decision-making in mental healthcare: treatment, diagnosis, documentation, monitoring, and triage. This U.S. centric dataset — created without any LM assistance — is designed to capture the nuanced clinical reasoning and daily ambiguities mental health practitioners encounter, reflecting the inherent complexities of care delivery that are missing from existing datasets.

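Medical image computing | Clinical language intelligence | Trust, safety, fairness, and privacy | Paper | AI for Healthcare | mental health | View paper details

Paper | ICLR 2026 Poster | 2026 | clinical prediction

From Medical Records to Diagnostic Dialogues: A Clinically Grounded Approach and Dataset for Psychiatric Comorbidity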

ICLR 2026 Poster accepted paper at ICLR 2026. Psychiatric comorbidity is clinically significant yet challenging due to the complexity of multiple co-occurring disorders. To address this, we develop a novel approach integrating synthetic patient electronic medical record (EMR) construction and multi-agent diagnostic dialogue generation. We create 502 synthetic EMRs for common comorbid conditions using a pipeline that ensures clinical relevance and diversity. Our multi-agent framework transfers the clinical interview protocol into a hierarchical state machine and context tree, supporting over 130 diagnostic states while maintaining clinical standards.

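Medical image computing | Clinical language intelligence | EHR and clinical prediction | Paper | Psychiatric Comorbidity | Diagnostic Dialogue | View paper details

Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI

AbdCTBench: Learning Clinical Biomarker Representations from Abdominal Surface Geometry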

ICLR 2026 Poster accepted paper at ICLR 2026. Body composition analysis through CT and MRI imaging provides critical insights for cardio-metabolic health assessment but remains limited by accessibility barriers including radiation exposure, high costs, and infrastructure requirements. We present AbdCTBench, a large-scale dataset containing 23,506 CT-derived abdominal surface meshes from 18,719 patients, paired with 87 comorbidity labels, 31 specific diagnosis codes, and 16 CT-derived biomarkers. Our key insight is that external surface geometry is predictive of internal tissue composition, enabling accessible health screening through consumer devices. We establish comprehensive benchmarks across seven computer vision architectures (ResNet-18/34/50, DenseNet-121, EfficientNet-B0, ViT-Small, Swin Transformer-Base), demonstrating that models can learn robust surface-to-biomarker representations directly from 2D mesh projections. Code/project link: https://abdctbenchrepo.github.io/AbdCTBench/

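Medical image computing | Trust, safety, fairness, and privacy | Paper | computer vision for healthcare | radiology imaging | Computed Tomography (CT) | View paper details

Paper | ICLR 2026 Poster | 2026 | Trust, safety, fairness, and privacy

Beyond Medical Exams: A Clinician-Annotated Fairness Dataset of Real-World Tasks and Ambiguity in Mental Healthcare

This ICLR 2026 poster paper introduces MENTAT, a fairness evaluation dataset created and annotated by clinical experts that targets real-world tasks and ambiguity in mental healthcare. It is used to evaluate the performance and bias of language models on clinical decision-making tasks.

Trust, safety, fairness, and privacy | Medical AI paper | Conference paper | View paper details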

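Paper | Medical Image Analysis | 2026 | Medical image computing

Ark+: Training a Single High-Performance AI Foundation Model Supervised by Multiple Heterogeneously Annotated Datasets, Without Label Consolidation

A supervised training method for multi-source, heterogeneously annotated data that trains a single high-performance AI foundation model while avoiding manual label consolidation.

Medical AI paper | Journal paper | View paper details

Data resource | critical care time-series variables and outcomes | ICU time-series benchmark dataset | PhysioNet Challenge 2012 dataset; version 1.0.0 | Open access

PhysioNet/CinC 2012 ICU Time-Series Dataset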

The PhysioNet/CinC Challenge 2012 dataset contains ICU time-series records used for mortality prediction and patient-specific outcome modeling. It remains a useful benchmark for clinical time-series modeling, missingness-aware learning, and early warning model development.

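EHR and clinical prediction | Dataset | critical care | time series | mortality prediction | PhysioNet Challenge | View data resource

Data resource | Chinese community medical questions and answers | Chinese medical QA dataset | Updated cMedQA dataset; see official repository | Open access

cMedQA2: Chinese Community Medical Question Answering Dataset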

cMedQA2 is an updated Chinese community medical question answering dataset for question-answer matching and medical QA research. It is useful for training and evaluating Chinese medical retrieval, ranking, and answer selection models.

Clinical language intelligence | Dataset | Chinese medical QA | answer selection | community QA | medical_llm_agent | View data resource

Data resource | abdominal CT and MRI with multi-organ annotations | abdominal multi-organ segmentation benchmark | AMOS 2022 challenge benchmark; see official Grand Challenge page | Access by application

AMOS Abdominal Multi-Organ Segmentation Benchmark

AMOS is an abdominal multi-organ segmentation benchmark with CT and MRI cases for evaluating versatile medical image segmentation models. It supports abdominal organ segmentation, modality-general segmentation, and benchmarking of robust 3D segmentation methods.

Medical image computing | Dataset | abdominal imaging | multi-organ segmentation | CT | MRI | View data resource

Data resource | retinal fundus photographs with glaucoma and structure annotations | ophthalmology fundus image challenge dataset | REFUGE challenge dataset; official splits described on Grand Challenge | Access by application

REFUGE Retinal Fundus Glaucoma Challenge Dataset

REFUGE is a retinal fundus imaging challenge dataset for glaucoma assessment. It supports glaucoma classification, optic disc and cup segmentation, fovea localization, and fair comparison of ophthalmology AI methods on color fundus photographs.

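Medical image computing | Dataset | ophthalmology | fundus | glaucoma | segmentation | View data resource

Data resource | chest radiographs with pneumonia/lung opacity annotations | chest X-ray pneumonia detection challenge dataset | RSNA 2018 AI image challenge dataset | Open access

RSNA Pneumonia Detection Challenge Dataset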

The RSNA Pneumonia Detection Challenge dataset is a chest radiograph benchmark for detecting pneumonia-related lung opacities. It supports object detection, chest X-ray classification, localization, and radiology AI evaluation under a competition framework.

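Medical image computing | Dataset | CXR | pneumonia | object detection | RSNA | View data resource

Data resource | upper extremity radiographs with abnormality labels | musculoskeletal X-ray dataset | Large Stanford musculoskeletal radiograph dataset | Access by application

MURA Musculoskeletal X-ray Dataset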

MURA is a musculoskeletal radiograph dataset from Stanford for abnormality detection in upper extremity X-rays. It is used for radiology classification, fracture-related screening, musculoskeletal imaging AI, and human-AI comparison studies.

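Medical image computing | Dataset | X-ray | musculoskeletal | abnormality detection | Stanford AIMI | View data resource

Data resource | cine cardiac MRI with segmentation labels | cardiac MRI segmentation dataset | ACDC challenge dataset; see official database page | Access by application

ACDC Automated Cardiac Diagnosis Challenge Dataset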

ACDC is a cardiac MRI dataset for automated cardiac diagnosis and segmentation. It supports left and right ventricular segmentation, myocardium segmentation, cardiac function quantification, and evaluation of robust cardiac image analysis methods.

Medical image computing | Dataset | cardiac MRI | segmentation | ventricle | challenge | View data resource

Data resource | MRI, DXA, ultrasound, retinal imaging, genetics, and health records | population-scale multimodal imaging cohort | Population-scale UK Biobank imaging cohort; application required | Access by application

UK Biobank Imaging Data

UK Biobank Imaging provides large-scale imaging phenotypes linked to genetic, lifestyle, and health outcome data. It is used for population-scale medical imaging AI, disease risk prediction, representation learning, multimodal biomedical modeling, and epidemiological AI studies.

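Medical image computing | EHR and clinical prediction | Medical multimodal | Dataset | population cohort | imaging | View data resource

Data resource | genomics, transcriptomics, clinical metadata, and pathology-related data | cancer genomics and clinical dataset | Large multi-cancer TCGA program dataset | Open access

TCGA Cancer Genomics Dataset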

The Cancer Genome Atlas is a large cancer genomics resource with molecular, clinical, and pathology-related data across many cancer types. It is a foundation dataset for oncology AI, survival prediction, subtype discovery, multimodal cancer modeling, and translational biomarker research.

EHR and clinical prediction | Medical multimodal | Dataset | cancer genomics | oncology | multi-omics | View data resource

Data resource | brain MRI with demographic and clinical variables | brain MRI and neuroimaging dataset collection | OASIS cross-sectional and longitudinal releases; see official site | Open access

OASIS Brain MRI and Neuroimaging Datasets

OASIS provides open-access neuroimaging datasets for studying normal aging, dementia, and brain structure. It is useful for brain MRI segmentation, age prediction, dementia classification, longitudinal modeling, and neuroimaging method benchmarking.

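Medical image computing | EHR and clinical prediction | Dataset | brain MRI | dementia | aging | View data resource

Data resource | MRI, PET, biomarkers, clinical and cognitive assessments | longitudinal neuroimaging and clinical dataset | Longitudinal ADNI cohort data; access through ADNI/LONI | Access by application

ADNI Alzheimer's Disease Neuroimaging Initiative Dataset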

ADNI provides longitudinal neuroimaging, biomarker, clinical, and cognitive data for Alzheimer disease research. It supports disease progression modeling, dementia diagnosis, multimodal prediction, biomarker discovery, and clinical translation studies.

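Medical image computing | EHR and clinical prediction | Medical multimodal | Dataset | Alzheimer disease | neuroimaging | View data resource

Data resource | cardiac ultrasound videos with functional annotations | echocardiography video dataset | Large echocardiography video dataset; see official site | Access by application

EchoNet-Dynamic Cardiac Ultrasound Video Dataset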

EchoNet-Dynamic is a cardiac ultrasound video dataset with expert annotations for left ventricular function. It is used for echocardiography video understanding, ejection fraction estimation, cardiac segmentation, and clinical video AI research.

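Medical image computing | Dataset | echocardiography | ultrasound video | cardiology | View data resource

Data resource | histopathology whole-slide images | digital pathology whole-slide image dataset | CAMELYON17 challenge dataset; see Grand Challenge page | Access by application

CAMELYON17 Histopathology Lymph Node Metastasis Dataset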

CAMELYON17 is a digital pathology dataset for detecting breast cancer metastases in lymph node whole-slide images across multiple centers. It supports pathology classification, metastasis detection, weakly supervised learning, and domain generalization in histopathology AI.

Medical image computing | Dataset | pathology | whole-slide imaging | breast cancer | domain generalization | View data resource

Data resource | dermoscopic and clinical skin lesion images | dermatology image archive | Large public ISIC dermatology image archive | Open access

ISIC Archive Dermatology Image Dataset

The ISIC Archive is a large public dermatology image repository for skin lesion analysis. It is widely used for melanoma classification, lesion segmentation, dermoscopic image retrieval, bias and domain shift analysis, and clinical imaging benchmark development.

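Medical image computing | Dataset | dermatology | skin lesion | melanoma | ISIC | View data resource

Data resource | raw MRI k-space and reconstructed MRI data | MRI reconstruction dataset | Large raw MRI reconstruction dataset; see official site | Access by application

fastMRI Raw MRI Reconstruction Dataset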

fastMRI is a raw MRI dataset for accelerated magnetic resonance image reconstruction, originally released by NYU Langone Health and Meta AI. It is used for MRI reconstruction, compressed sensing replacement, generative reconstruction, and robustness evaluation.

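Medical image computing | Dataset | MRI reconstruction | k-space | accelerated imaging | View data resource

Data resource | 2D and 3D biomedical images | standardized biomedical image benchmark | 12 2D datasets and 6 3D datasets in MedMNIST v2 | Open access

MedMNIST v2 Biomedical Image Benchmark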

MedMNIST v2 is a standardized collection of lightweight biomedical image classification datasets, including 2D and 3D tasks. It is useful for quick benchmarking, AutoML, foundation model sanity checks, and reproducible evaluation across multiple medical imaging domains.

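Medical image computing | Dataset | MedMNIST | classification | benchmark | 2D 3D imaging | View data resource

Data resource | multimodal brain MRI with tumor annotations | brain tumor MRI segmentation challenge dataset | BraTS 2024 challenge dataset; see Synapse project | Access by application

BraTS 2024 Brain Tumor Segmentation Challenge Dataset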

BraTS 2024 provides multimodal brain MRI data and expert annotations for brain tumor segmentation and related tumor subregion analysis. It is a major benchmark for glioma segmentation, radiology AI, and robust multimodal MRI segmentation methods.

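Medical image computing | Dataset | brain MRI | glioma | segmentation | BraTS | View data resource

Data resource | abdominal CT with kidney and tumor annotations | kidney tumor CT segmentation dataset | TCIA C4KC-KiTS collection; see collection page | Open access

C4KC-KiTS Kidney Tumor Segmentation Collection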

C4KC-KiTS is a TCIA imaging collection associated with kidney and kidney tumor segmentation benchmarks. It supports kidney segmentation, renal tumor segmentation, surgical planning research, and evaluation of abdominal CT segmentation models.

Medical image computing | Dataset | kidney tumor | CT segmentation | TCIA | KiTS | View data resource

Data resource | thoracic CT images with nodule annotations | lung CT nodule dataset | TCIA LIDC-IDRI collection | Open access

LIDC-IDRI Lung CT Nodule Dataset

LIDC-IDRI is a lung CT dataset with thoracic CT scans and expert nodule annotations. It is a classic benchmark for lung nodule detection, segmentation, malignancy characterization, radiomics, and computer-aided diagnosis research.

Medical image computing | Dataset | lung CT | nodules | segmentation | TCIA | View data resource

Data resource | chest radiographs with radiologist annotations | chest X-ray detection and classification dataset | VinDr-CXR release on PhysioNet; version 1.0.0 | Open access

VinDr-CXR: Vietnamese Chest X-ray Dataset

VinDr-CXR is a chest X-ray dataset with radiologist annotations from Vietnamese hospitals. It supports abnormality classification, lesion localization, radiology object detection, and robustness studies across clinical sites and populations.

Medical image computing | Dataset | CXR | radiologist labels | object detection | Vietnam | View data resource

Data resource | frontal chest radiographs with image-level labels | chest X-ray classification dataset | NIH public ChestX-ray14 release | Open access

NIH ChestX-ray14 Dataset

NIH ChestX-ray14 is a public chest radiograph dataset with image-level labels for thoracic disease findings mined from reports. It is commonly used for chest X-ray classification, weak supervision, thoracic disease detection, and radiology benchmark comparisons.

Medical image computing | Dataset | CXR | thoracic disease | weak labels | NIH | View data resource

Data resource | chest radiographs with multi-label findings | chest X-ray classification dataset | Large-scale Stanford chest X-ray dataset | Access by application

CheXpert Chest X-ray Dataset

CheXpert is a large chest radiograph dataset from Stanford with uncertainty-aware labels for common chest X-ray findings. It is widely used for radiology classification, label uncertainty modeling, chest X-ray representation learning, and clinical imaging benchmarks.

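Medical image computing | Dataset | CXR | radiology imaging | classification | Stanford AIMI | View data resource

Data resource | EEG and polysomnography biosignals | sleep physiology signal dataset | Expanded Sleep-EDF PhysioNet dataset; version 1.0.0 | Open access

Sleep-EDF Expanded Polysomnography Dataset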

Sleep-EDF Expanded contains polysomnographic sleep recordings with EEG and related physiological signals. It is used for sleep stage classification, biosignal time-series modeling, self-supervised learning on physiological signals, and clinical sleep research benchmarks.

EHR and clinical prediction | Dataset | sleep staging | EEG | biosignal | PhysioNet | View data resource

Data resource | 12-lead ECG waveforms with diagnostic labels | ECG waveform benchmark | Large public ECG dataset; version 1.0.3 | Open access

PTB-XL: A Large Open 12-Lead ECG Dataset

PTB-XL is a large public 12-lead electrocardiography dataset with diagnostic statements and waveform records. It is a standard benchmark for ECG classification, cardiac abnormality detection, clinical signal representation learning, and robust evaluation of biosignal models.

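EHR and clinical prediction | Dataset | ECG | cardiology | biosignal | classification | View data resource

Data resource | structured critical care EHR tables | multicenter ICU EHR dataset | Multicenter ICU database; version 2.0 | Access by application

eICU Collaborative Research Database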

The eICU Collaborative Research Database is a multicenter critical care database containing deidentified ICU data from many hospitals. It is commonly used for external validation, ICU outcome prediction, temporal modeling, and cross-site generalization studies in clinical AI.

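EHR and clinical prediction | Dataset | critical care | electronic health records | multicenter | external validation | View data resource

Data resource | 12-lead ECG waveforms and diagnostic metadata | ECG waveform dataset | Large-scale diagnostic ECG dataset; version 1.0 | Access by application

MIMIC-IV-ECG Diagnostic Electrocardiogram Dataset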

MIMIC-IV-ECG is a large deidentified electrocardiogram dataset linked to the MIMIC-IV clinical data ecosystem. It supports ECG classification, arrhythmia detection, representation learning, and multimodal modeling with structured EHR context.

EHR and clinical prediction | Dataset | ECG | biosignal | clinical prediction | PhysioNet | View data resource

Data resource | chest radiographs with radiology reports | chest X-ray image-report dataset | Large-scale CXR image-report dataset; version 2.1.0 | Access by application

MIMIC-CXR v2.1.0 Chest X-ray Dataset

MIMIC-CXR is a large deidentified chest radiograph dataset with associated free-text radiology reports. It is widely used for chest X-ray classification, report generation, image-text representation learning, radiology retrieval, and medical multimodal foundation model evaluation.

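Medical image computing | Medical multimodal | Clinical language intelligence | Dataset | CXR | radiology reports | View data resource

Data resource | deidentified clinical free text | clinical notes dataset | Clinical note extension for MIMIC-IV; version 2.2 | Access by application

MIMIC-IV-Note v2.2 Clinical Notes Dataset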

MIMIC-IV-Note provides deidentified clinical notes linked to MIMIC-IV hospital data. It supports clinical NLP tasks such as note representation learning, discharge summary modeling, information extraction, summarization, and multimodal EHR-text modeling.

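Clinical language intelligence | EHR and clinical prediction | Dataset | clinical NLP | notes | summarization | View data resource

Data resource | deidentified structured EHR tables | critical care EHR dataset | Large-scale hospital and ICU EHR dataset; version 3.1 | Access by application

MIMIC-IV v3.1 Critical Care and Hospital EHR Dataset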

MIMIC-IV is a large deidentified electronic health record dataset from Beth Israel Deaconess Medical Center, covering hospital and ICU data for critical care research. It is a core benchmark source for clinical prediction, temporal EHR modeling, phenotyping, and healthcare AI method development.

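EHR and clinical prediction | Dataset | electronic health records | critical care | clinical prediction | PhysioNet | View data resource

Data resource | medical images with bilingual visual questions and answers | medical visual question answering dataset | Bilingual medical VQA dataset; see official project page | Open access

SLAKE: A Semantically Labeled, Knowledge-Enhanced Medical VQA Dataset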

SLAKE is a semantically labeled medical visual question answering dataset with bilingual English-Chinese questions, medical images, and knowledge-enhanced annotations. It is useful for medical multimodal learning, image-grounded QA, and radiology VQA evaluation.

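Medical image computing | Medical multimodal | Dataset | medical VQA | bilingual dataset | medical multimodal | View data resource

Data resource | Chinese conversational medical QA text | Chinese medical conversational QA dataset | Large-scale Chinese medical CQA dataset; see official repository | Open access

CMCQA: Chinese Medical Conversational Question Answering Dataset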

CMCQA is a large Chinese medical conversational question-answering dataset released with knowledge-grounded medical dialogue research. It supports medical conversation QA, knowledge-grounded response generation, and evaluation of Chinese medical dialogue systems.

Clinical language intelligence | Dataset | Chinese medical CQA | knowledge-grounded dialogue | medical chatbot | medical_llm_agent | View data resource

Data resource | Chinese medical instruction and dialogue text | Chinese medical instruction-tuning dataset | About 140K medical SFT examples; see Hugging Face card | Open access

HuatuoGPT2-SFT-GPT4-140K Medical Instruction Dataset

HuatuoGPT2-SFT-GPT4-140K is a Chinese medical supervised fine-tuning dataset containing medical instruction-style conversations and GPT-4-assisted responses. It is useful for Chinese medical assistant alignment and medical LLM instruction tuning.

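Clinical language intelligence | Dataset | medical SFT | Chinese medical LLM | instruction tuning | medical_llm_agent | View data resource

Data resource | Chinese medical question-answer text | Chinese medical QA corpus | About 26 million medical QA pairs | Open access

Huatuo-26M: A Large-Scale Chinese Medical Question Answering Dataset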

Huatuo-26M is a large-scale Chinese medical question-answering dataset with about 26 million QA pairs collected for medical language modeling and medical dialogue research. It is suitable for Chinese medical LLM pretraining, fine-tuning, and QA system development.

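Clinical language intelligence | Dataset | Chinese medical QA | large-scale corpus | LLM training | medical_llm_agent | View data resource

Data resource | medical exam question-answer text | medical exam QA benchmark | USMLE, Mainland China, and Taiwan exam-style QA splits; see repository | Open access

MedQA: Medical Exam Question Answering Dataset with US, Mainland China, and Taiwan Splits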

MedQA is a medical examination question answering benchmark with English and Chinese medical licensing-style question sets, including mainland China and Taiwan variants. It is widely used for medical QA and medical reasoning evaluation.

Clinical language intelligence | Dataset | medical QA | Chinese exam QA | USMLE | LLM evaluation | View data resource

Data resource | Chinese consultation dialogue text with medical entity annotations | Chinese medical dialogue generation dataset | Entity-annotated dialogue dataset; see official repository | Open access

MedDG: An Entity-Centric Chinese Medical Dialogue Generation Dataset

MedDG is an entity-centric Chinese medical consultation dataset with domain entity annotations for medical dialogue generation. It supports entity-aware response generation, medical consultation modeling, and dialogue systems that ground responses in clinical concepts.

Clinical language intelligence | Dataset | Chinese medical dialogue | entity annotation | generation | medical_llm_agent | View data resource

Data resource | Chinese medical exam and QA text | Chinese medical LLM evaluation benchmark | Multiple Chinese medical exam and benchmark splits; see Hugging Face card | Open access

CMB: Chinese Medical Benchmark

CMB is a comprehensive Chinese medical benchmark for evaluating medical large language models on medical exams, reasoning, and clinical knowledge questions. It is suited for Chinese medical QA, LLM evaluation, and instruction-following assessment.

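Clinical language intelligence | Dataset | Chinese medical benchmark | medical LLM | exam QA | medical_llm_agent | View data resource

Data resource | Chinese biomedical and clinical text | Chinese biomedical NLP benchmark | 8 biomedical NLU tasks; see official repository | Open access

CBLUE: Chinese Biomedical Language Understanding Evaluation Benchmark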

CBLUE is a Chinese biomedical language understanding benchmark covering real-world biomedical NLP tasks such as named entity recognition, relation extraction, term normalization, clinical trial classification, sentence similarity, and medical question answering. It is useful for evaluating Chinese clinical NLP models and medical language models.

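Clinical language intelligence | Dataset | Chinese medical NLP | benchmark | information extraction | QA | View data resource

Data resource | CT | cancer imaging | TCIA collection | Access by application

National Lung Screening Trial Data Collection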

Low-dose CT imaging collection from the National Lung Screening Trial, distributed by The Cancer Imaging Archive.

CT | lung cancer | TCIA | View data resource

Data resource | CT/MRI | segmentation benchmark | 10 segmentation tasks | Open access

Medical Segmentation Decathlon

Legacy multi-task biomedical image segmentation benchmark retained as a reference; newer segmentation benchmarks are listed above it.

segmentation | CT | MRI | legacy reference | View data resource

Data resource | chest X-ray | radiology imaging | 112,120 frontal-view X-ray images | Open access

NIH ChestX-ray14 Dataset

NIH Clinical Center chest X-ray dataset released for computer-aided detection and radiology machine learning research.

radiology imaging | NIH | chest X-ray | View data resource

Data resource | chest X-ray | radiology imaging | 224,316 chest radiographs | Access by application

CheXpert

Stanford chest radiograph dataset for automated chest X-ray interpretation and uncertainty-aware label evaluation.

radiology imaging | chest X-ray | Stanford | View data resource

Data resource | ECG | physiological signals | 21,837 clinical 12-lead ECG records | Open access

PTB-XL ECG Database v1.0.3

Large publicly available 12-lead ECG waveform dataset with diagnostic labels, hosted on PhysioNet.

ECG | physiological signals | PhysioNet | View data resource

Data resource | chest X-ray | radiology imaging | PhysioNet v2.1.0 | Restricted access

MIMIC-CXR-JPG v2.1.0

JPG-formatted chest radiographs with labels derived from free-text reports, hosted by PhysioNet.

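radiology imaging | chest X-ray | PhysioNet | View data resource

Data resource | electronic health records | critical care and hospital records | PhysioNet v3.1 | Restricted access

MIMIC-IV Clinical Database v3.1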

Deidentified EHR data for ICU and hospital patients at Beth Israel Deaconess Medical Center, distributed through PhysioNet with credentialed access.

electronic health records | critical care | PhysioNet | View data resource

Data resource | Text | LLM benchmark | Benchmark and leaderboard | Open access

MedHELM Medical LLM Evaluation Benchmark

Medical LLM benchmark and leaderboard intended to broaden coverage beyond single medical QA datasets.

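benchmark | leaderboard | medical LLM | View data resource

Data resource | medical imaging | segmentation benchmark | IMed-361M / IMIS-Bench | Open access

IMed-361M / IMIS-Bench Interactive Medical Image Segmentation Benchmark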

Interactive medical image segmentation benchmark and baseline from CVPR 2025, covering multiple modalities, organs, and target structures.

interactive segmentation | benchmark | medical imaging | View data resource

Data resource | Multimodal clinical data | Benchmark | ICML 2025 benchmark | Open access

CLIMB Clinical Foundation Model Benchmark

Multimodal clinical data foundation and benchmark introduced at ICML 2025 for clinical foundation model research.

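benchmark | multimodal | clinical foundation model | View data resource

Technical competition | Open soon | peripelvic fracture segmentation and reduction planning | pelvic fracture CT imaging | Deadline (Beijing time): 2026-08-19

Peripelvic Fracture Segmentation and Reduction Planning Challenge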

Grand Challenge official API lists this medical AI challenge with status OPEN_SOON. Peripelvic fractures are severe injuries with high disability and mortality rates. The PENGWIN 2026 Challenge aims to advance state-of-the-art techniques for intelligent surgical planning in 3D CT scans. It consists of three tasks: fully automated peripelvic fracture segmentation (Task 1), interactive segmentation (Task 2), and fracture reduction planning (Task 3). The dataset features 500 clinical cases with expert annotations and 16,000 simulated fracture cases to support the training of data-driven reduction models. Start date: 2026-05-10. End/deadline date: 2026-08-19.

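Medical image computing | Competition | Grand Challenge platform | PENGWIN2026 | OPEN_SOON | peripelvic fracture segmentation and reduction planning | View competition details

Calls and collaboration | npj Digital Medicine | Deadline (Beijing time): 2026-07-21 | Journal special issue

npj Digital Medicine Collection: Artificial Intelligence in Sports Medicine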

This Nature Portfolio / npj Digital Medicine collection is open for submissions until 2026-07-21. It invites research on AI in sports medicine, including multimodal injury and medical-condition prediction, individualized diagnosis, treatment and rehabilitation, transparent and diverse datasets, open-source explainable AI, and safe AI systems for athlete and exercise health.

Medical multimodal | Trust, safety, fairness, and privacy | Call for papers | Nature Portfolio | npj Digital Medicine | sports medicine | View call details