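AI4Meder Site Search

Search Medical AI Papers and Resources

Search community content by paper, data resource, technical competition, submission deadline, and course resource, and jump straight to the corresponding detail page.

52 results

Enter a keyword or click a tag to narrow results by paper, data resource, competition deadline, call for papers, or course. Tags: Category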

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Dyslexify: A Mechanistic Defense Against Typographic Attacks in CLIP

ICLR 2026 Poster accepted paper at ICLR 2026. Typographic attacks exploit multi-modal systems by injecting text into images, leading to targeted misclassifications, malicious content generation and even Vision-Language Model jailbreaks. In this work, we analyze how CLIP vision encoders behave under typographic attacks, locating specialized attention heads in the latter half of the model's layers that causally extract and transmit typographic information to the CLS token. Building on these insights, we introduce Dyslexify, a method to defend CLIP models against typographic attacks by selectively ablating a typographic circuit consisting of attention heads. Without requiring finetuning, Dyslexify improves performance by up to 22.06% on a typographic variant of ImageNet-100 while reducing standard ImageNet-100 accuracy by less than 1%, and we demonstrate its utility in a medical foundation model for skin lesion diagnosis.

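Medical image computing · Medical multimodal AI · Clinical language intelligence · Paper · Multimodality · Circuit analysis · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

The Human Brain in Video Understanding: A Dynamic Mixture-of-Experts Model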

ICLR 2026 Poster accepted paper at ICLR 2026. The human brain is the most efficient and versatile system for processing dynamic visual input. By comparing representations from deep video models to brain activity, we can gain insights into mechanistic solutions for effective video processing, important to better understand the brain and to build better models. Current works in model-brain alignment primarily focus on fMRI measurements, leaving open questions about fine-grained dynamic processing. Here, we introduce the first large-scale model benchmarking on alignment to dynamic electroencephalography (EEG) recordings of short natural videos. We analyze 100+ models across the axes of temporal integration, classification task, architecture, and pretraining, using our proposed Cross-Temporal Representational Similarity Analysis (CT-RSA) which matches the best time-unfolded model features to dynamically evolving brain responses, distilling $10^7$ alignment scores.

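Medical image computing · EHR and clinical prediction · Paper · representational alignment · Representational Similarity Analysis · RSA · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Prior-Aware and Context-Guided Group-Based Active Probabilistic Subsampling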

ICLR 2026 Poster accepted paper at ICLR 2026. Subsampling significantly reduces the number of measurements, thereby streamlining data processing and transfer overhead, and shortening acquisition time across diverse real-world applications. The recently introduced Active Deep Probabilistic Subsampling (A-DPS) approach jointly optimizes both the subsampling pattern and the downstream task model, enabling instance- and subject-specific sampling trajectories and effective adaptation to new data at inference time. However, this approach does not fully leverage valuable dataset priors and relies on top-1 sampling, which can impede the optimization process. Herein, we enhance A-DPS by integrating a deterministic (fixed) prior-informed sampling pattern derived from the training dataset, along with group-based sampling via top-k sampling, to achieve more robust optimization. We call this method Prior-aware and context-guided Group-based Active DPS (PGA-DPS).

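Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Subsampling · Active acquisition · Accelerated MRI · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Exploiting Low-Dimensional Feature Manifolds for Few-Shot Whole Slide Image Classification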

ICLR 2026 Poster accepted paper at ICLR 2026. Few-shot Whole Slide Image (WSI) classification is severely hampered by overfitting. We argue that this is not merely a data-scarcity issue but a fundamentally geometric problem. Grounded in the manifold hypothesis, our analysis shows that features from pathology foundation models exhibit a low-dimensional manifold geometry that is easily perturbed by downstream models. This insight reveals a key potential issue in downstream multiple instance learning models: linear layers are geometry-agnostic and, as we show empirically, can distort the manifold geometry of the features. To address this, we propose the Manifold Residual (MR) block, a plug-and-play module that is explicitly geometry-aware. Code/project link: https://github.com/BearCleverProud/MR-Block

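Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Computational Pathology · Whole Slide Image Classification · Few-shot Learning · View paper details

Paper · ICLR 2026 Poster · 2026 · Medical imaging

Disco: Densely Overlapping Cell Instance Segmentation via Adjacency-Aware Collaborative Coloring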

ICLR 2026 Poster accepted paper at ICLR 2026. Accurate cell instance segmentation is foundational for digital pathology analysis. Existing methods based on contour detection and distance mapping still face significant challenges in processing complex and dense cellular regions. Graph coloring-based methods provide a new paradigm for this task, yet the effectiveness of this paradigm in real-world scenarios with dense overlaps and complex topologies has not been verified. Addressing this issue, we release a large-scale dataset GBC-FS 2025, which contains highly complex and dense sub-cellular nuclear arrangements. We conduct the first systematic analysis of the chromatic properties of cell adjacency graphs across four diverse datasets and reveal an important discovery: most real-world cell graphs are non-bipartite, with a high prevalence of odd-length cycles (predominantly triangles).

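Medical image computing · Paper · Cell Instance Segmentation · Digital Pathology · Graph Coloring · Topological Analysis · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

A Foundation Model for Generating Neuronal Activity Based on Multi-Variate Parallel Attention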

ICLR 2026 Poster accepted paper at ICLR 2026. Learning from multi-variate time-series with heterogeneous channel configurations remains a fundamental challenge for deep neural networks, particularly in clinical domains such as intracranial electroencephalography (iEEG), where channel setups vary widely across subjects. In this work, we introduce multi-variate parallel attention (MVPA), a novel self-attention mechanism that disentangles content, temporal, and spatial attention, enabling flexible, generalizable, and efficient modeling of time-series data with varying channel counts and configurations. We use MVPA to build MVPFormer, a generative foundation model for human electrophysiology, trained to predict the evolution of iEEG signals across diverse subjects. To support this and future efforts by the community, we release the SWEC iEEG dataset, the largest publicly available iEEG dataset to date, comprising nearly 10,000 hours of recordings from heterogeneous clinical sources. Code/project link: https://github.com/IBM/multi-variate-parallel-transformer; https://huggingface.co/datasets/NeuroTec/SWEC_iEEG_Dataset

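EHR and clinical prediction · Paper · time-series · iEEG · neurology · Foundation model · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Spike-Based Digital Brain: A Novel Foundation Model for Brain Activity Analysis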

ICLR 2026 Poster accepted paper at ICLR 2026. Modeling the temporal dynamics of the human brain remains a core challenge in computational neuroscience and artificial intelligence. Traditional methods often ignore the biological spike characteristics of brain activity and find it difficult to reveal the dynamic dependencies and causal interactions between brain regions, limiting their effectiveness in brain function research and clinical applications. To address this issue, we propose a Spike-based Digital Brain (Spike-DB), a novel fundamental model that introduces the spike computing paradigm into brain time series modeling. Spike-DB encodes fMRI signals as spike trains and learns the temporal driving relationships between anchor and target regions to achieve high-precision prediction of brain activity and reveal underlying causal dependencies and dynamic relationship characteristics. Code/project link: https://github.com/UAIBC-Brain/Spike-DB

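Medical image computing · EHR and clinical prediction · Paper · Brain activity · Fundamental model · Spike computing · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Dual-Kernel Adapter: Expanding the Spatial Horizons of Data-Constrained Medical Image Analysis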

ICLR 2026 Poster accepted paper at ICLR 2026. Adapters have become a widely adopted strategy for efficient fine-tuning of foundation models, particularly in resource-constrained settings. However, their performance under extreme data scarcity, which is common in medical imaging due to high annotation costs, privacy regulations, and fragmented datasets, remains underexplored. In this work, we present the first comprehensive study of adapter-based fine-tuning for vision foundation models in low-data medical imaging scenarios. We find that, contrary to their promise, conventional Adapters can degrade performance under severe data constraints, performing even worse than simple linear probing when trained on less than 1% of the corresponding training data.

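Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Adapter · Medical Image Analysis · Data-Limited Training · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

SE-Diff: A Simulator- and Experience-Enhanced Diffusion Model for Comprehensive ECG Generation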

ICLR 2026 Poster accepted paper at ICLR 2026. Cardiovascular disease (CVD) is a leading cause of mortality worldwide. Electrocardiograms (ECGs) are the most widely used non-invasive tool for cardiac assessment, yet large, well-annotated ECG corpora are scarce due to cost, privacy, and workflow constraints. Generating ECGs can aid mechanistic understanding of cardiac electrical activity, enable the construction of large, heterogeneous, and unbiased datasets, and facilitate privacy-preserving data sharing. Generating realistic ECG signals from clinical context is important yet underexplored. Recent work has leveraged diffusion models for text-to-ECG generation, but two challenges remain: (i) existing methods often overlook physiological simulator knowledge of cardiac activity; and (ii) they ignore broader, experience-based clinical knowledge grounded in real-world practice.

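Medical image computing · Clinical language intelligence · EHR and clinical prediction · Paper · Diffusion Model · ECG · Electrocardiography · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

ProstaTD: Bridging Surgical Triplets from Classification to Fully Supervised Detection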

ICLR 2026 Poster accepted paper at ICLR 2026. Surgical triplet detection is a critical task in surgical video analysis, with significant implications for performance assessment and training novice surgeons. However, existing datasets like CholecT50 lack precise spatial bounding box annotations, rendering triplet classification at the image level insufficient for practical applications. The inclusion of bounding box annotations is essential to make this task meaningful, as they provide the spatial context necessary for accurate analysis and improved model generalizability. To address these shortcomings, we introduce ProstaTD, a large-scale, multi-institutional dataset for surgical triplet detection, developed from the technically demanding domain of robot-assisted prostatectomy.

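Medical image computing · Trustworthiness, safety, fairness, and privacy · Paper · Surgical Triplet · Endoscopy · Detection · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Assembling the Mind's Mosaic: Towards EEG Semantic Intent Decoding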

ICLR 2026 Poster accepted paper at ICLR 2026. Enabling natural communication through brain–computer interfaces (BCIs) remains one of the most profound challenges in neuroscience and neurotechnology. While existing frameworks offer partial solutions, they are constrained by oversimplified semantic representations and a lack of interpretability. To overcome these limitations, we introduce Semantic Intent Decoding (SID), a novel framework that translates neural activity into natural language by modeling meaning as a flexible set of compositional semantic units. SID is built on three core principles: semantic compositionality, continuity and expandability of semantic space, and fidelity in reconstruction.

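Medical image computing · EHR and clinical prediction · Paper · Electroencephalography (EEG) · Brain-computer interface (BCI) · Semantic Intent · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Lightweight Transformer for EEG Classification via Balanced Signed Graph Algorithm Unrolling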

ICLR 2026 Poster accepted paper at ICLR 2026. Samples of brain signals collected by EEG sensors have inherent anti-correlations that are well modeled by negative edges in a finite graph. To differentiate epilepsy patients from healthy subjects using collected EEG signals, we build lightweight and interpretable transformer-like neural nets by unrolling a spectral denoising algorithm for signals on a balanced signed graph, that is, a graph with no cycles containing an odd number of negative edges. A balanced signed graph has well-defined frequencies that map to a corresponding positive graph via similarity transform of the graph Laplacian matrices. We implement an ideal low-pass filter efficiently on the mapped positive graph via Lanczos approximation, where the optimal cutoff frequency is learned from data.

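Medical image computing · EHR and clinical prediction · Paper · balanced signed graph · spectral denoising · graph classification · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Functional MRI Time Series Generation via Wavelet Image Transform and Spectral Flow Matching for Brain Disorder Identification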

ICLR 2026 Poster accepted paper at ICLR 2026. Functional Magnetic Resonance Imaging (fMRI) provides non-invasive access to dynamic brain activity by measuring blood oxygen level-dependent (BOLD) signals over time. However, the resource-intensive nature of fMRI acquisition limits the availability of high-fidelity samples required for data-driven brain analysis models. While modern generative models can synthesize fMRI data, they often struggle to replicate the inherent non-stationarity, intricate spatiotemporal dynamics, and physiological variations of raw BOLD signals. To address these challenges, we propose Dual-Spectral Flow Matching (DSFM), a novel fMRI generative framework that cascades a dual frequency representation of BOLD signals with spectral flow matching. Code/project link: https://anonymous.4open.science/r/DSFM-123C; https://anonymous.4open.science/r/DSFM-

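Medical image computing · EHR and clinical prediction · Paper · Generative Models · Time Series · Flow Matching · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Random Anchors and Low-Rank Decorrelated Learning: A Minimalist Pipeline for Class-Incremental Medical Image Classification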

ICLR 2026 Poster accepted paper at ICLR 2026. Class-incremental learning (CIL) in medical image-guided diagnosis requires models to preserve knowledge of historical disease classes while adapting to emerging categories. Pre-trained models (PTMs) with well-generalized features provide a strong foundation, yet most PTM-based CIL strategies, such as prompt tuning, task-specific adapters and model mixtures, rely on increasingly complex designs. While effective in general-domain benchmarks, these methods falter in medical imaging, where low intra-class variability and high inter-domain shifts (from scanners, protocols and institutions) make CIL particularly prone to representation collapse and domain misalignment. Under such conditions, we find that lightweight representation calibration strategies, often dismissed in general-domain CIL for their modest gains, can be remarkably effective for adapting PTMs in medical settings.

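Medical image computing · EHR and clinical prediction · Trustworthiness, safety, fairness, and privacy · Paper · Medical Image Classification · Feature Calibration · View paper details

Paper · ICLR 2026 Poster · 2026 · Medical imaging

Mixture of Mini Experts: Overcoming the Linear Layer Bottleneck in Multiple Instance Learning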

ICLR 2026 Poster accepted paper at ICLR 2026. Multiple Instance Learning (MIL) is the predominant framework for classifying gigapixel whole-slide images in computational pathology. MIL follows a sequence of 1) extracting patch features, 2) applying a linear layer to obtain task-specific patch features, and 3) aggregating the patches into a slide feature for classification. While substantial efforts have been devoted to optimizing patch feature extraction and aggregation, none have yet addressed the second point, the critical layer which transforms general-purpose features into task-specific features. We hypothesize that this layer constitutes an overlooked performance bottleneck and that stronger representations can be achieved with a low-rank transformation tailored to each patch's phenotype, yielding synergistic effects with any of the existing MIL approaches.

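Medical image computing · Paper · Mixture of Experts · Multiple Instance Learning · Computational Pathology · Computer Vision · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

GARLIC: Graph Attention-Based Relational Learning for ICU Multivariate Time Series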

ICLR 2026 Poster accepted paper at ICLR 2026. Healthcare data, such as Intensive Care Unit (ICU) records, comprise heterogeneous multivariate time series sampled at irregular intervals with pervasive missingness. However, clinical applications demand predictive models that are both accurate and interpretable. We present our Graph Attention-based Relational Learning for Intensive Care (GARLIC) model, a novel neural network architecture that imputes missing data through a learnable exponential-decay encoder, captures inter-sensor dependencies via time-lagged summary graphs, and fuses global patterns with cross-dimensional sequential attention. All attention weights and graph edges are learned end-to-end to serve as built-in observation-, signal-, and edge-level explanations.

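Medical image computing · EHR and clinical prediction · Trustworthiness, safety, fairness, and privacy · Paper · irregular multivariate time series · graph neural network · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Repurposing Foundation Models for Generalizable Medical Time Series Classification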

ICLR 2026 Poster accepted paper at ICLR 2026. Medical time series (MedTS) classification suffers from poor generalizability in real-world deployment due to inter- and intra-dataset heterogeneity, such as varying numbers of channels, signal lengths, task definitions, and patient characteristics. To address this, we propose FORMED, a novel framework for repurposing a backbone foundation model, pre-trained on generic time series, to enable highly generalizable MedTS classification on unseen datasets. FORMED combines the backbone with a novel classifier comprising two components: (1) task-specific channel embeddings and label queries, dynamically sized to match any number of channels and target classes, and (2) a shared decoding attention layer, jointly trained across datasets to capture medical domain knowledge through task-agnostic feature-query interactions.

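Medical image computing · EHR and clinical prediction · Paper · Medical Time Series · Classification · Time Series Foundation Model · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

CerebraGloss: Instruction-Tuning a Large Vision-Language Model for Fine-Grained Clinical EEG Interpretation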

ICLR 2026 Poster accepted paper at ICLR 2026. Interpreting clinical electroencephalography (EEG) is a laborious, subjective process, and existing computational models are limited to narrow classification tasks rather than holistic interpretation. A key bottleneck for applying powerful Large Vision-Language Models (LVLMs) to this domain is the scarcity of datasets pairing EEG visualizations with fine-grained, expert-level annotations. We address this by introducing CerebraGloss, an instruction-tuned LVLM for nuanced EEG interpretation. We first introduce a novel, automated data generation pipeline, featuring a bespoke YOLO-based waveform detector, to programmatically create a large-scale corpus of EEG-text instruction data. Code/project link: https://github.com/iewug/CerebraGloss

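Medical image computing · Medical multimodal AI · Clinical language intelligence · Paper · large vision-language model · instruction-tuning · View paper details

Paper · ICLR 2026 Poster · 2026 · Medical imaging

MedGMAE: Gaussian Masked Autoencoders for Medical Volumetric Representation Learning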

ICLR 2026 Poster accepted paper at ICLR 2026. Self-supervised pre-training has emerged as a critical paradigm for learning transferable representations from unlabeled medical volumetric data. Masked autoencoder based methods have garnered significant attention, yet their application to volumetric medical images faces fundamental limitations from the discrete voxel-level reconstruction objective, which neglects comprehensive anatomical structure continuity. To address this challenge, we propose MedGMAE, a novel framework that replaces traditional voxel reconstruction with reconstruction of 3D Gaussian primitives, offering a new perspective on representation learning. Our approach learns to predict complete sets of 3D Gaussian parameters as semantic abstractions to represent the entire 3D volume from sparse visible image patches. Code/project link: https://github.com/windrise/MedGMAE; https://anonymous.4open.science/r/MedGMAE-EC8F/

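Medical image computing · Paper · 3D Gaussian Representation · Medical Imaging analysis · Volumetric Representation Learning · ICLR 2026 · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Benchmarking ECG Foundation Models: A Reality Check Across Clinical Tasks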

ICLR 2026 Poster accepted paper at ICLR 2026. The 12-lead electrocardiogram (ECG) is a long-standing diagnostic tool. Yet machine learning for ECG interpretation remains fragmented, often limited to narrow tasks or datasets. Foundation models (FMs) promise broader adaptability, but fundamental questions remain: Which architectures generalize best? How do models scale with limited labels? What explains performance differences across model families? We benchmarked eight ECG FMs on 26 clinically relevant tasks using 12 public datasets comprising 1,650 regression and classification targets. Models were evaluated under fine-tuning and frozen settings, with scaling analyses across dataset sizes.

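Medical image computing · EHR and clinical prediction · Trustworthiness, safety, fairness, and privacy · Paper · ECG · Electrocardiography · Foundation model · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Adaptive Test-Time Training for Predicting the Need for Invasive Mechanical Ventilation in Multi-Center Cohorts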

ICLR 2026 Poster accepted paper at ICLR 2026. Accurate prediction of the need for invasive mechanical ventilation (IMV) in intensive care unit (ICU) patients is crucial for timely interventions and resource allocation. However, variability in patient populations, clinical practices, and electronic health record (EHR) systems across institutions introduces domain shifts that degrade the generalization performance of predictive models during deployment. Test-Time Training (TTT) has emerged as a promising approach to mitigate such shifts by adapting models dynamically during inference without requiring labeled target-domain data. In this work, we introduce Adaptive Test-Time Training (AdaTTT), an enhanced TTT framework tailored for EHR-based IMV prediction in ICU settings.

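Medical image computing · Clinical language intelligence · EHR and clinical prediction · Paper · Test-Time Training · Domain Adaptation · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

AttTok: Combining Attribute Tokens with Generative Pre-trained Vision-Language Models for Medical Image Understanding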

ICLR 2026 Poster accepted paper at ICLR 2026. Recent generative pre-trained vision–language (GPTv) models have achieved remarkable success in multi-modal understanding, inspiring their adaptation to medical imaging tasks such as disease diagnosis and visual question answering (VQA). However, current instruction-tuned GPTv models suffer from two key challenges: (1) medical attributes (e.g., disease names, severity grades) are encoded as plain text tokens, collapsing semantically distinct concepts into nearly identical textual sequences; and (2) inadequate textual supervision weakens visual representation learning, leading to severe inter-attribute confusion and misaligned vision–language embeddings. To address these limitations, we introduce attribute tokens (AttTok), a set of pre‑defined special tokens that uniquely encode clinical attributes (e.g., imaging modality, diagnosis, severity) within a structured token space. Complemented by attribute‑centric embedding books, AttTok serves as anchor points for aligning both visual and textual modalities into a shared, discriminative representation space.

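Medical image computing · Medical multimodal AI · Clinical language intelligence · Paper · Medical generative pre-trained models · medical Multi-Modal alignment · View paper details

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

Can LLMs Generate Portable Representations for Clinical Time Series Data?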

ICLR 2026 Poster accepted paper at ICLR 2026. Deploying clinical ML is slow and brittle: models that work at one hospital often degrade under distribution shifts at the next. In this work, we study a simple question: can large language models (LLMs) create portable patient embeddings, i.e., representations of patients that enable a downstream predictor built at one hospital to be used elsewhere with minimal-to-no retraining and fine-tuning? To do so, we map irregular ICU time series onto concise natural language summaries using a frozen LLM, then embed each summary with a frozen text embedding model to obtain a fixed-length vector capable of serving as input to a variety of downstream predictors.

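Medical image computing · Clinical language intelligence · EHR and clinical prediction · Paper · Machine Learning for Healthcare · ICU · Time-series · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Can LLMs Generate Portable Representations for Clinical Time Series Data?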

ICLR 2026 Poster accepted paper at ICLR 2026. Recent advances in vision-language models (VLMs) have achieved remarkable performance on standard medical benchmarks, yet their true clinical reasoning ability remains unclear. Existing datasets predominantly emphasize classification accuracy, creating an evaluation illusion in which models appear proficient while still failing at high-stakes diagnostic reasoning. We introduce Neural-MedBench, a compact yet reasoning-intensive benchmark specifically designed to probe the limits of multimodal clinical reasoning in neurology. Neural-MedBench integrates multi-sequence MRI scans, structured electronic health records, and clinical notes, and encompasses three core task families: differential diagnosis, lesion recognition, and rationale generation. Code/project link: https://neuromedbench.github.io/

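Medical image computing · Medical multimodal AI · Clinical language intelligence · Paper · vision-language models · benchmark dataset · View paper details

Paper · ICLR 2026 Poster · 2026 · Medical imaging

Anatomy-Aware Representation Learning for Medical Ultrasound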

ICLR 2026 Poster accepted paper at ICLR 2026. Diagnostic accuracy of ultrasound imaging is limited by qualitative variability and its reliance on the expertise of medical professionals. Such challenges increase demand for computer-aided diagnostic systems that enhance diagnostic accuracy and efficiency. However, the unique texture and structural attributes of ultrasound images, and the scarcity of large-scale ultrasound datasets hinder the effective application of conventional machine learning methodologies. To address the challenges, we propose Anatomy-aware Representation Learning (ARL), a novel self-supervised representation learning framework specifically designed for medical ultrasound imaging.

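Medical image computing · Paper · Foundation model · medical ultrasound · representation learning · ICLR 2026 · View paper details

Paper · ICLR 2026 Poster · 2026 · clinical prediction

Bridging Radiology and Pathology Foundation Models via Concept-Based Multimodal Co-Adaptation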

ICLR 2026 Poster accepted paper at ICLR 2026. Pretrained medical foundation models (FMs) have shown strong generalization across diverse imaging tasks, such as disease classification in radiology and tumor grading in histopathology. While recent advances in parameter-efficient finetuning have enabled effective adaptation of FMs to downstream tasks, these approaches are typically designed for a single modality. In contrast, many clinical workflows rely on joint diagnosis from heterogeneous domains, such as radiology and pathology, where fully leveraging the representation capacity of multiple FMs remains an open challenge. To address this gap, we propose Concept Tuning and Fusing (CTF), a parameter-efficient framework that uses clinically grounded concepts as a shared semantic interface to enable cross-modal co-adaptation before fusion. Code/project link: https://github.com/HKU-MedAI/CTF; https://github.com/neuronflow/BraTS-Toolkit

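Medical image computing · Medical multimodal AI · EHR and clinical prediction · Paper · multimodal learning · concept-based learning · View paper details

Paper · ICLR 2026 Poster · 2026 · Medical multimodal AI

AttTok: Combining Attribute Tokens with Generative Pre-trained Vision-Language Models for Medical Image Understanding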

ICLR 2026 poster introducing AttTok, a medical vision-language method that uses predefined attribute tokens and attribute-centric mechanisms to improve medical image understanding, including classification and visual question answering.

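Medical image computing · Medical multimodal AI · Clinical language intelligence · Paper · ICLR 2026 · medical generative pre-trained models · View paper details

Paper · ICLR 2026 Poster · 2026 · EHR and clinical prediction

Repurposing Foundation Models for Generalizable Medical Time Series Classification

FORMED repurposes a general-purpose time series foundation model for medical time series classification, adapting it lightly across different medical time series datasets through task-specific channel embeddings, label queries, and a shared decoding attention layer.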

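EHR and clinical prediction · Medical AI · Paper · Conference paper · View paper details

Data resource · retinal fundus photographs with glaucoma and structure annotations · ophthalmology fundus image challenge dataset · REFUGE challenge dataset; official splits described on Grand Challenge · Request access

REFUGE Retinal Fundus Glaucoma Challenge Dataset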

REFUGE is a retinal fundus imaging challenge dataset for glaucoma assessment. It supports glaucoma classification, optic disc and cup segmentation, fovea localization, and fair comparison of ophthalmology AI methods on color fundus photographs.

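Medical image computing · Dataset · ophthalmology · fundus · glaucoma · Segmentation · View data resource

Data resource · chest radiographs with pneumonia/lung opacity annotations · chest X-ray pneumonia detection challenge dataset · RSNA 2018 AI image challenge dataset · Open access

RSNA Pneumonia Detection Challenge Dataset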

The RSNA Pneumonia Detection Challenge dataset is a chest radiograph benchmark for detecting pneumonia-related lung opacities. It supports object detection, chest X-ray classification, localization, and radiology AI evaluation under a competition framework.

Medical image computing · Dataset · CXR · pneumonia · object detection · RSNA · View data resource

Data resource · upper extremity radiographs with abnormality labels · musculoskeletal X-ray dataset · Large Stanford musculoskeletal radiograph dataset · Request access

MURA Musculoskeletal Radiograph Dataset

MURA is a musculoskeletal radiograph dataset from Stanford for abnormality detection in upper extremity X-rays. It is used for radiology classification, fracture-related screening, musculoskeletal imaging AI, and human-AI comparison studies.

Medical image computing · Dataset · X-ray · musculoskeletal · abnormality detection · Stanford AIMI · View data resource

Data resource · brain MRI with demographic and clinical variables · brain MRI and neuroimaging dataset collection · OASIS cross-sectional and longitudinal releases; see official site · Open access

OASIS Brain MRI and Neuroimaging Datasets

OASIS provides open-access neuroimaging datasets for studying normal aging, dementia, and brain structure. It is useful for brain MRI segmentation, age prediction, dementia classification, longitudinal modeling, and neuroimaging method benchmarking.

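Medical image computing · EHR and clinical prediction · Dataset · brain MRI · dementia · aging · View data resource

Data resource · histopathology whole-slide images · digital pathology whole-slide image dataset · CAMELYON17 challenge dataset; see Grand Challenge page · Request access

CAMELYON17 Histopathology Lymph Node Metastasis Dataset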

CAMELYON17 is a digital pathology dataset for detecting breast cancer metastases in lymph node whole-slide images across multiple centers. It supports pathology classification, metastasis detection, weakly supervised learning, and domain generalization in histopathology AI.

Medical image computing · Dataset · pathology · whole-slide imaging · breast cancer · domain generalization · View data resource

Data resource · dermoscopic and clinical skin lesion images · dermatology image archive · Large public ISIC dermatology image archive · Open access

ISIC Archive Dermatology Image Dataset

The ISIC Archive is a large public dermatology image repository for skin lesion analysis. It is widely used for melanoma classification, lesion segmentation, dermoscopic image retrieval, bias and domain shift analysis, and clinical imaging benchmark development.

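Medical image computing · Dataset · Dermatology · skin lesion · melanoma · ISIC · View data resource

Data resource · 2D and 3D biomedical images · standardized biomedical image benchmark · 12 2D datasets and 6 3D datasets in MedMNIST v2 · Open access

MedMNIST v2 Biomedical Image Benchmark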

MedMNIST v2 is a standardized collection of lightweight biomedical image classification datasets, including 2D and 3D tasks. It is useful for quick benchmarking, AutoML, foundation model sanity checks, and reproducible evaluation across multiple medical imaging domains.

Medical image computing · Dataset · MedMNIST · Classification · benchmark · 2D 3D imaging · View data resource

Data resource · chest radiographs with radiologist annotations · chest X-ray detection and classification dataset · VinDr-CXR release on PhysioNet; version 1.0.0 · Open access

VinDr-CXR: Vietnamese Chest X-ray Dataset

VinDr-CXR is a chest X-ray dataset with radiologist annotations from Vietnamese hospitals. It supports abnormality classification, lesion localization, radiology object detection, and robustness studies across clinical sites and populations.

Medical image computing · Dataset · CXR · radiologist labels · object detection · Vietnam · View data resource

Data resource · frontal chest radiographs with image-level labels · chest X-ray classification dataset · NIH public ChestX-ray14 release · Open access

NIH ChestX-ray14 Dataset

NIH ChestX-ray14 is a public chest radiograph dataset with image-level labels for thoracic disease findings mined from reports. It is commonly used for chest X-ray classification, weak supervision, thoracic disease detection, and radiology benchmark comparisons.

Medical image computing · Dataset · CXR · thoracic disease · weak labels · NIH · View data resource

Data resource · chest radiographs with multi-label findings · chest X-ray classification dataset · Large-scale Stanford chest X-ray dataset · Request access

CheXpert Chest X-ray Dataset

CheXpert is a large chest radiograph dataset from Stanford with uncertainty-aware labels for common chest X-ray findings. It is widely used for radiology classification, label uncertainty modeling, chest X-ray representation learning, and clinical imaging benchmarks.

Medical image computing | Dataset | CXR | Radiology imaging | classification | Stanford AIMI | View data resource

Data resource | EEG and polysomnography biosignals | sleep physiology signal dataset | Expanded Sleep-EDF PhysioNet dataset; version 1.0.0 | Open access

Sleep-EDF Expanded Polysomnography Dataset

Sleep-EDF Expanded contains polysomnographic sleep recordings with EEG and related physiological signals. It is used for sleep stage classification, biosignal time-series modeling, self-supervised learning on physiological signals, and clinical sleep research benchmarks.

EHR and clinical prediction | Dataset | sleep staging | EEG | biosignal | PhysioNet | View data resource

Data resource | 12-lead ECG waveforms with diagnostic labels | ECG waveform benchmark | Large public ECG dataset; version 1.0.3 | Open access

PTB-XL: A Large Open 12-Lead ECG Dataset

PTB-XL is a large public 12-lead electrocardiography dataset with diagnostic statements and waveform records. It is a standard benchmark for ECG classification, cardiac abnormality detection, clinical signal representation learning, and robust evaluation of biosignal models.

EHR and clinical prediction | Dataset | ECG | cardiology | biosignal | classification | View data resource

Data resource | 12-lead ECG waveforms and diagnostic metadata | ECG waveform dataset | Large-scale diagnostic ECG dataset; version 1.0 | Access by application

MIMIC-IV-ECG Diagnostic Electrocardiogram Dataset

MIMIC-IV-ECG is a large deidentified electrocardiogram dataset linked to the MIMIC-IV clinical data ecosystem. It supports ECG classification, arrhythmia detection, representation learning, and multimodal modeling with structured EHR context.

EHR and clinical prediction | Dataset | ECG | biosignal | clinical prediction | PhysioNet | View data resource

Data resource | chest radiographs with radiology reports | chest X-ray image-report dataset | Large-scale CXR image-report dataset; version 2.1.0 | Access by application

MIMIC-CXR v2.1.0 Chest X-ray Dataset

MIMIC-CXR is a large deidentified chest radiograph dataset with associated free-text radiology reports. It is widely used for chest X-ray classification, report generation, image-text representation learning, radiology retrieval, and medical multimodal foundation model evaluation.

Medical image computing | Medical multimodal | Clinical language intelligence | Dataset | CXR | radiology reports | View data resource

Data resource | Chinese biomedical and clinical text | Chinese biomedical NLP benchmark | 8 biomedical NLU tasks; see official repository | Open access

CBLUE: Chinese Biomedical Language Understanding Evaluation Benchmark

CBLUE is a Chinese biomedical language understanding benchmark covering real-world biomedical NLP tasks such as named entity recognition, relation extraction, term normalization, clinical trial classification, sentence similarity, and medical question answering. It is useful for evaluating Chinese clinical NLP models and medical language models.

Clinical language intelligence | Dataset | Chinese medical NLP | benchmark | information extraction | QA | View data resource

Data resource | Chest X-ray | Radiology imaging | 112,120 frontal-view X-ray images | Open access

NIH ChestX-ray14 Dataset

NIH Clinical Center chest X-ray dataset released for computer-aided detection and radiology machine learning research.

Radiology imaging | NIH | Chest X-ray | View data resource

Data resource | Chest X-ray | Radiology imaging | 224,316 chest radiographs | Access by application

CheXpert

Stanford chest radiograph dataset for automated chest X-ray interpretation and uncertainty-aware label evaluation.

Radiology imaging | Chest X-ray | Stanford | View data resource

Data resource | ECG | Physiological signals | 21,837 clinical 12-lead ECG records | Open access

PTB-XL ECG Database v1.0.3

Large publicly available 12-lead ECG waveform dataset with diagnostic labels, hosted on PhysioNet.

ECG | Physiological signals | PhysioNet | View data resource

Data resource | Chest X-ray | Radiology imaging | PhysioNet v2.1.0 | Restricted access

MIMIC-CXR-JPG v2.1.0

JPG-formatted chest radiographs with labels derived from free-text reports, hosted by PhysioNet.

Radiology imaging | Chest X-ray | PhysioNet | View data resource

Data resource | Biomedical images | Tool/model | Foundation model and code | Open access

BiomedParse Biomedical Image Parsing Foundation Model

Foundation model and toolkit for all-in-one biomedical image parsing across recognition, detection, and segmentation tasks.

biomedical image parsing | segmentation | foundation model | View data resource

Data resource | Text and medical images | Model | MedGemma / MedSigLIP model family | Open access

MedGemma / MedSigLIP Medical AI Models

Google Health AI Developer Foundations open model resources for medical text and medical image understanding, including MedGemma 1.5 resources.

medical LLM | medical VLM | open model | View data resource

Technical competition | Open soon | aneurysm image analysis | vascular/neurovascular medical imaging | Starts 2026-08-14 (Beijing time)

TopAneu 2026

The Grand Challenge official API lists this medical AI challenge with status OPEN_SOON. Full title: Multimodal Vessel-Specific Intracranial Aneurysm Classification and Segmentation Challenge. Start date: 2026-08-14.

Medical image computing | Competition | Grand Challenge platform | TopAneu-26 | OPEN_SOON | aneurysm image analysis | View competition details

Technical competition | Submission deadline 2026-08-01 19:59 Beijing | Education challenge | Medical image computing education | Deadline 2026-08-01 19:59 (Beijing time)

MICCAI 2026 Medical Image Computing Education Challenge

MICCAI Student Board educational challenge for tutorial-style submissions around medical image computing education.

MICCAI | education challenge | View competition details

Technical competition | Official phase starts 2026-05-11 | Prediction and signal analysis | Polysomnography / clinical signals | Starts 2026-05-11 (Beijing time)

George B. Moody PhysioNet Challenge 2026

PhysioNet Challenge 2026 on detecting cognitive impairment from polysomnography and related clinical signals.

PhysioNet | PSG | cognitive impairment | View competition details