Paper · ICLR 2026 Poster · 2026 · clinical NLP · A Structured, Tagged, and Localized VQA Dataset for Chest X-ray Images: With Full-Sentence Answers and Scene Graphs
Accepted as a poster at ICLR 2026. Visual Question Answering (VQA) enables targeted and context-dependent analysis of medical images, such as chest X-rays (CXRs). However, existing VQA datasets for CXRs are typically constrained by simplistic and brief answer formats, and lack localization annotations (e.g., bounding boxes) and structured tags (e.g., region or radiological finding/disease tags). To address these limitations, we introduce MIMIC-Ext-CXR-QBA (abbr. CXR-QBA), a large-scale CXR VQA dataset derived from MIMIC-CXR, comprising 42 million QA pairs with multi-granular, multi-part answers, detailed bounding boxes, and structured tags. Code/project link: https://github.com/philip-mueller/mimic-ext-cxr-qba/
Paper · ICLR 2026 Poster · 2026 · clinical NLP · Rethinking Radiology Report Generation: From Narrative Flow to Topic-Guided Findings
Accepted as a poster at ICLR 2026. Vision-Language Models (VLMs) for radiology report generation are typically trained to mimic the narrative flow of human experts. However, we identify a potential limitation in this conventional paradigm. We hypothesize that optimizing for narrative coherence encourages models to rely on linguistic priors and inter-sentence correlations, which can weaken their grounding in direct visual evidence and lead to factual inaccuracies. To investigate this, we design a controlled experiment demonstrating that as textual context increases, a model's reliance on the input image systematically decays. We propose LLaVA-TA (Topic-guided and Anatomy-aware), a new fine-tuning framework that directly addresses this challenge by re-engineering the generation process.
Paper · ICLR 2026 Poster · 2026 · medical LLM agent · K-Prism: A Knowledge-Guided and Prompt-Integrated Universal Medical Image Segmentation Model
Accepted as a poster at ICLR 2026. Medical image segmentation is fundamental to clinical decision-making, yet existing models remain fragmented. They are usually trained on single knowledge sources and specific to individual tasks, modalities, or organs. This fragmentation contrasts sharply with clinical practice, where experts seamlessly integrate diverse knowledge: anatomical priors from training, exemplar-based reasoning from reference cases, and iterative refinement through real-time interaction. We present K-Prism, a unified segmentation framework that mirrors this clinical flexibility by systematically integrating three knowledge paradigms: (i) semantic priors learned from annotated datasets, (ii) in-context knowledge from few-shot reference examples, and (iii) interactive feedback from user inputs like clicks or scribbles. Code/project link: https://github.com/bangwayne/K-Prism
Paper · ICLR 2026 Poster · 2026 · clinical prediction · Learning Self-Critiquing Mechanisms for Region-Guided Chest X-ray Report Generation
Accepted as a poster at ICLR 2026. Automatic radiology reporting assists radiologists in diagnosing abnormalities in radiology images, and grounding the automatic diagnosis in abnormality locations is important for report interpretability. However, existing supervised-learning methods can end up learning superficial statistical correlations between images and reports, and lack the multi-faceted reasoning needed to critique the relevant regions on which radiologists would focus. Recently, self-critical reasoning has been investigated in test-time scaling approaches to alleviate LLM hallucinations, at the cost of increased time complexity. In this work, we focus on chest X-ray report generation with particular attention to clinical accuracy, where self-critical reasoning is instead introduced into the model architecture and its training objective, an approach better suited to real-time automatic reporting systems.
Data resource · chest radiographs with pneumonia/lung opacity annotations · chest X-ray pneumonia detection challenge dataset · RSNA 2018 AI image challenge dataset · Open access · RSNA Pneumonia Detection Challenge Dataset
The RSNA Pneumonia Detection Challenge dataset is a chest radiograph benchmark for detecting pneumonia-related lung opacities. It supports object detection, chest X-ray classification, localization, and radiology AI evaluation under a competition framework.
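Detection benchmarks of this kind are usually scored by matching predicted boxes to ground-truth boxes through intersection over union (IoU). The following is a minimal sketch of that overlap measure, assuming boxes are given as `(x, y, width, height)` in pixels; the function name `iou_xywh` and the box format are illustrative assumptions, not the challenge's official evaluation code.

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes.

    Hypothetical helper for illustration; the box format is an assumption.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

A prediction is then commonly counted as a hit when its IoU with a ground-truth box exceeds a chosen threshold; the threshold itself depends on the evaluation protocol.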
Data resource · upper extremity radiographs with abnormality labels · musculoskeletal X-ray dataset · Large Stanford musculoskeletal radiograph dataset · Access on request · MURA Musculoskeletal X-ray Dataset
MURA is a musculoskeletal radiograph dataset from Stanford for abnormality detection in upper extremity X-rays. It is used for radiology classification, fracture-related screening, musculoskeletal imaging AI, and human-AI comparison studies.
Data resource · chest radiographs with radiologist annotations · chest X-ray detection and classification dataset · VinDr-CXR release on PhysioNet; version 1.0.0 · Open access · VinDr-CXR: Vietnamese Chest X-ray Dataset
VinDr-CXR is a chest X-ray dataset with radiologist annotations from Vietnamese hospitals. It supports abnormality classification, lesion localization, radiology object detection, and robustness studies across clinical sites and populations.
Data resource · frontal chest radiographs with image-level labels · chest X-ray classification dataset · NIH public ChestX-ray14 release · Open access · NIH ChestX-ray14 Dataset
NIH ChestX-ray14 is a public chest radiograph dataset with image-level labels for thoracic disease findings mined from reports. It is commonly used for chest X-ray classification, weak supervision, thoracic disease detection, and radiology benchmark comparisons.
Data resource · chest radiographs with multi-label findings · chest X-ray classification dataset · Large-scale Stanford chest X-ray dataset · Access on request · CheXpert Chest X-ray Dataset
CheXpert is a large chest radiograph dataset from Stanford with uncertainty-aware labels for common chest X-ray findings. It is widely used for radiology classification, label uncertainty modeling, chest X-ray representation learning, and clinical imaging benchmarks.
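One common way to work with uncertainty-aware labels is to map the uncertain value to a fixed class before training. The sketch below assumes the widely described encoding (1 for positive, 0 for negative, -1 for uncertain, `None` for not mentioned); the function name `resolve_uncertain`, the policy names, and the choice to treat unmentioned findings as negative are illustrative assumptions, not part of any official loader.

```python
def resolve_uncertain(labels, policy="ones"):
    """Map uncertain (-1) labels to a fixed class.

    Assumed encoding: 1 positive, 0 negative, -1 uncertain, None unmentioned.
    policy="ones" maps uncertain to 1; policy="zeros" maps uncertain to 0.
    Unmentioned findings are treated as negative (an assumption of this sketch).
    """
    if policy not in ("ones", "zeros"):
        raise ValueError(f"unknown policy: {policy}")
    fill = 1 if policy == "ones" else 0
    resolved = []
    for value in labels:
        if value is None:
            resolved.append(0)
        elif value == -1:
            resolved.append(fill)
        else:
            resolved.append(int(value))
    return resolved
```

Which policy works better can differ per finding, so it is worth comparing both rather than fixing one up front.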
Data resource · chest radiographs with radiology reports · chest X-ray image-report dataset · Large-scale CXR image-report dataset; version 2.1.0 · Access on request · MIMIC-CXR v2.1.0 Chest X-ray Dataset
MIMIC-CXR is a large deidentified chest radiograph dataset with associated free-text radiology reports. It is widely used for chest X-ray classification, report generation, image-text representation learning, radiology retrieval, and medical multimodal foundation model evaluation.
Data resource · Chest X-ray · Radiology imaging · 112,120 frontal-view X-ray images · Open access · NIH ChestX-ray14 Dataset
NIH Clinical Center chest X-ray dataset released for computer-aided detection and radiology machine learning research.
Data resource · Chest X-ray · Radiology imaging · 224,316 chest radiographs · Access on request · CheXpert
Stanford chest radiograph dataset for automated chest X-ray interpretation and uncertainty-aware label evaluation.
Data resource · Chest X-ray · Radiology imaging · PhysioNet v2.1.0 · Restricted access · MIMIC-CXR-JPG v2.1.0
JPG-formatted chest radiographs with labels derived from free-text reports, hosted by PhysioNet.
Data resource · Biomedical images · Tool/model · Foundation model and code · Open access · BiomedParse: Foundation Model for Biomedical Image Parsing
Foundation model and toolkit for all-in-one biomedical image parsing across recognition, detection, and segmentation tasks.
Data resource · Text and medical images · Model · MedGemma / MedSigLIP model family · Open access · MedGemma / MedSigLIP Medical AI Models
Google Health AI Developer Foundations open model resources for medical text and medical image understanding, including MedGemma 1.5 resources.
Data resource · Medical imaging · Segmentation benchmark · IMed-361M / IMIS-Bench · Open access · IMed-361M / IMIS-Bench Interactive Medical Image Segmentation Benchmark
Interactive medical image segmentation benchmark and baseline from CVPR 2025, covering multiple modalities, organs, and target structures.