AI4Meder

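AI4Meder Site Search

Search medical AI papers and resources

Search community content by paper, data resource, technical competition, submission deadline, and course resource, and jump straight to the matching detail page.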

7 results

Tag: video

Paper · ICLR 2026 Poster · 2026 · clinical prediction

The Human Brain in Video Understanding: A Dynamic Mixture-of-Experts Model

Accepted as a poster at ICLR 2026. The human brain is the most efficient and versatile system for processing dynamic visual input. By comparing representations from deep video models to brain activity, we can gain insights into mechanistic solutions for effective video processing, important to better understand the brain and to build better models. Current works in model-brain alignment primarily focus on fMRI measurements, leaving open questions about fine-grained dynamic processing. Here, we introduce the first large-scale model benchmarking on alignment to dynamic electroencephalography (EEG) recordings of short natural videos. We analyze 100+ models across the axes of temporal integration, classification task, architecture, and pretraining, using our proposed Cross-Temporal Representational Similarity Analysis (CT-RSA) which matches the best time-unfolded model features to dynamically evolving brain responses, distilling $10^7$ alignment scores.
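The abstract above describes CT-RSA only at a high level, so the following is a minimal sketch of generic representational similarity analysis with a best-time-step match, not the authors' method. The function names, the use of `1 - Pearson` as the dissimilarity, and the use of Pearson (rather than the commonly used Spearman) correlation to compare dissimilarity matrices are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def rdm_upper(patterns):
    """Upper triangle of the dissimilarity matrix (1 - Pearson) over stimuli."""
    n = len(patterns)
    return [1.0 - pearson(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]

def cross_temporal_rsa(model_feats, brain_resps):
    """For each brain time point, return (best model step, its RSA score).

    model_feats[s][k] is the (hypothetical) feature vector for stimulus k at
    model step s; brain_resps[t][k] is the response pattern at time point t.
    """
    model_rdms = [rdm_upper(step) for step in model_feats]
    results = []
    for resp in brain_resps:
        brain_rdm = rdm_upper(resp)
        scores = [pearson(m, brain_rdm) for m in model_rdms]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((best, scores[best]))
    return results

# Toy data (made up): 4 stimuli, 3 features, 2 model steps, 2 brain time points.
step_a = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
step_b = [[1, 2, 3], [1, 2, 4], [3, 1, 0], [0, 5, 1]]
brain_t0 = [[2 * v for v in p] for p in step_b]  # same geometry as step_b
brain_t1 = step_a                                # same geometry as step_a
matches = cross_temporal_rsa([step_a, step_b], [brain_t0, brain_t1])
print(matches)
```

In this toy setup the first brain time point matches model step 1 and the second matches model step 0, because scaling a pattern does not change its correlation-based dissimilarities.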

Paper · ICLR 2026 Poster · 2026 · medical imaging

A Cognitive Process-Inspired Architecture for Subject-Agnostic Brain Visual Decoding

Accepted as a poster at ICLR 2026. Subject-agnostic brain decoding, which aims to reconstruct continuous visual experiences from fMRI without subject-specific training, holds great potential for clinical applications. However, this direction remains underexplored due to challenges in cross-subject generalization and the complex nature of brain signals. In this work, we propose Visual Cortex Flow Architecture (VCFlow), a novel hierarchical decoding framework that explicitly models the ventral-dorsal architecture of the human visual system to learn multi-dimensional representations. By disentangling and leveraging features from early visual cortex, ventral, and dorsal streams, VCFlow captures diverse and complementary cognitive information essential for visual reconstruction.

Paper · ICLR 2026 Poster · 2026 · trustworthy medical AI

ProstaTD: Bridging Surgical Triplet from Classification to Fully Supervised Detection

Accepted as a poster at ICLR 2026. Surgical triplet detection is a critical task in surgical video analysis, with significant implications for performance assessment and training novice surgeons. However, existing datasets like CholecT50 lack precise spatial bounding box annotations, rendering triplet classification at the image level insufficient for practical applications. The inclusion of bounding box annotations is essential to make this task meaningful, as they provide the spatial context necessary for accurate analysis and improved model generalizability. To address these shortcomings, we introduce ProstaTD, a large-scale, multi-institutional dataset for surgical triplet detection, developed from the technically demanding domain of robot-assisted prostatectomy.

Paper · ICLR 2026 Poster · 2026 · surgical/interventional AI

WavePolyp: Video Polyp Segmentation via Hierarchical Wavelet-Based Feature Aggregation and Inter-Frame Divergence Perception

Accepted as a poster at ICLR 2026. Automatic polyp segmentation from colonoscopy videos is a crucial technique that assists clinicians in improving the accuracy and efficiency of diagnosis, preventing polyps from developing into cancer. However, video polyp segmentation (VPS) is a challenging task due to (1) the significant inter-frame divergence in videos, (2) the high camouflage of polyps in normal colon structures, and (3) the clinical requirement of real-time performance. In this paper, we propose a novel segmentation network, WavePolyp, which consists of two innovative components: a hierarchical wavelet-based feature aggregation (HWFA) module and inter-frame divergence perception (IDP) blocks. Specifically, HWFA excavates and amplifies discriminative information from high-frequency and low-frequency features decomposed by wavelet transform, hierarchically aggregating them into refined spatial representations within each frame. Code/project link: https://github.com/FishballZhang/WavePolyp
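To make the "high-frequency and low-frequency features decomposed by wavelet transform" concrete, here is a minimal sketch of one level of a 2D Haar decomposition in plain Python. The abstract does not say which wavelet WavePolyp uses, so the Haar choice, the function name `haar_dwt2`, and the toy frame are assumptions for illustration; this is not the HWFA module itself.

```python
def haar_dwt2(image):
    """One level of an orthonormal 2D Haar transform on a list-of-lists image.

    Returns four half-resolution bands: the low-frequency band plus three
    detail bands (top-minus-bottom, left-minus-right, diagonal differences).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must both be even")
    low, top_bottom, left_right, diagonal = [], [], [], []
    for r in range(0, rows, 2):
        low_row, tb_row, lr_row, dg_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block [[a, b], [c2, d]] maps to one coefficient per band.
            a, b = image[r][c], image[r][c + 1]
            c2, d = image[r + 1][c], image[r + 1][c + 1]
            low_row.append((a + b + c2 + d) / 2)  # local average (low frequency)
            tb_row.append((a + b - c2 - d) / 2)   # change between rows
            lr_row.append((a - b + c2 - d) / 2)   # change between columns
            dg_row.append((a - b - c2 + d) / 2)   # diagonal change
        low.append(low_row)
        top_bottom.append(tb_row)
        left_right.append(lr_row)
        diagonal.append(dg_row)
    return low, top_bottom, left_right, diagonal

# Toy 4x4 "frame" (made-up values): flat regions, one column edge, one checker block.
frame = [
    [1.0, 0.0, 5.0, 5.0],
    [1.0, 0.0, 5.0, 5.0],
    [2.0, 2.0, 0.0, 3.0],
    [2.0, 2.0, 3.0, 0.0],
]
low, top_bottom, left_right, diagonal = haar_dwt2(frame)
print(low)         # smooth content
print(left_right)  # non-zero only where columns differ inside a block
print(diagonal)    # non-zero only for the checker-like block
```

Flat regions land entirely in the low band, while edges and fine texture show up in the detail bands, which is the separation a wavelet-based aggregation module can then exploit.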

Paper · ICLR 2026 Poster · 2026 · surgical/interventional AI

HFSTI-Net: Hierarchical Frequency-Spatial-Temporal Interaction for Video Polyp Segmentation

Accepted as a poster at ICLR 2026. Automatic video polyp segmentation (VPS) is crucial for preventing and treating colorectal cancer by ensuring accurate identification of polyps in colonoscopy examinations. However, its clinical application is hampered by two key challenges: shape collapse, which compromises structural integrity, and episodic amnesia, which causes instability in challenging video sequences. To address these challenges, we present a novel video segmentation network, HFSTI-Net, which integrates global perception with spatiotemporal consistency in spatial, temporal, and frequency domains. Specifically, to address shape collapse under low contrast or visual ambiguity, we design a Hierarchical Frequency-spatial Interaction (HFSI) module that fuses spatial and frequency cues for fine-grained boundary localization. Code/project link: https://github.com/Yuanqin-He/HFSTI-Net

Paper · npj Digital Medicine · 2025 · medical image computing

Accurate Assessment of Gait Impairment in Parkinson's Disease Using Deep Learning on Smartphone Videos

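Gait impairment is among the most common and disabling symptoms of Parkinson's disease (PD), and its presentation is complex and highly heterogeneous. Here, we propose a deep learning-based framework that assesses gait impairment from smartphone-recorded videos. The framework performs well in predicting PD severity, with a micro-average area under the receiver operating characteristic curve (AUC) of 0.87 and an F1 score of 0.806, comparable to the average performance of three clinical experts. It also distinguishes the overall effect of medication on gait impairment with 73.68% accuracy. Notably, it can discriminate drug-induced fine-grained gait changes beyond the resolution of the Unified Parkinson's Disease Rating Scale (UPDRS). Furthermore, our explainable framework extracts motor indicators traditionally used in clinical practice and identifies new digital biomarkers sensitive to disease progression and drug response. These findings highlight its strong potential for efficient assessment of disease progression in clinical and home settings, and for evaluating disease-modifying effects in clinical trials, in support of personalized treatment.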

Data resource · cardiac ultrasound videos with functional annotations · echocardiography video dataset · Large echocardiography video dataset; see official site · access by application

EchoNet-Dynamic Cardiac Ultrasound Video Dataset

EchoNet-Dynamic is a cardiac ultrasound video dataset with expert annotations for left ventricular function. It is used for echocardiography video understanding, ejection fraction estimation, cardiac segmentation, and clinical video AI research.
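For readers unfamiliar with the ejection fraction estimation task mentioned above, the target quantity follows the standard definition EF = (EDV - ESV) / EDV × 100, where EDV and ESV are the left ventricular end-diastolic and end-systolic volumes. The sketch below only illustrates that formula; the function name and the example volumes are made up and are not taken from the dataset.

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Left ventricular ejection fraction in percent from EDV and ESV (same units)."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    return (edv_ml - esv_ml) / edv_ml * 100.0

# Illustrative volumes only (hypothetical values, in millilitres).
ef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)
print(f"EF = {ef:.1f}%")  # prints "EF = 58.3%"
```

A video model for this task typically has to locate the end-diastolic and end-systolic frames, or regress EF directly from the clip, before a number like this can be reported.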