Paper | ICLR 2026 Poster | 2026 | clinical NLP | LaVCa: LLM-Assisted Visual Cortex Captioning
ICLR 2026 Poster accepted paper at ICLR 2026. Understanding the properties of neural populations (or voxels) in the human brain can advance our comprehension of human perceptual and cognitive processing capabilities and contribute to developing brain-inspired computer models. Recent encoding models using deep neural networks (DNNs) have successfully predicted voxel-wise activity. However, interpreting the properties that explain voxel responses remains challenging because of the black-box nature of DNNs. As a solution, we propose LLM-assisted Visual Cortex Captioning (LaVCa), a data-driven approach that leverages large language models (LLMs) to generate natural-language captions for images to which voxels are selective.
Paper | ICLR 2026 Poster | 2026 | medical imaging | MnemoDyn: Learning Resting-State Dynamics from 40K fMRI Sequences
We present a dynamical-systems-based model for resting-state functional magnetic resonance imaging (rs-fMRI), trained on a dataset of roughly 40K rs-fMRI sequences covering a wide variety of public and available-by-permission datasets. While most existing proposals use transformer backbones, we utilize multi-resolution temporal modeling of the dynamics across parcellated brain regions. We show that MnemoDyn is compute-efficient and generalizes very well across diverse populations and scanning protocols. When benchmarked against current state-of-the-art transformer-based approaches, MnemoDyn consistently delivers superior reconstruction quality.
Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI | Brain-Semantoks: Learning Semantic Tokens of Brain Dynamics with a Self-Distilled Foundation Model
The development of foundation models for functional magnetic resonance imaging (fMRI) time series holds significant promise for predicting phenotypes related to disease and cognition. Current models, however, are often trained using a mask-and-reconstruct objective on small brain regions. This focus on low-level information leads to representations that are sensitive to noise and temporal fluctuations, necessitating extensive fine-tuning for downstream tasks. We introduce Brain-Semantoks, a self-supervised framework designed specifically to learn abstract representations of brain dynamics. Its architecture is built on two core innovations: a semantic tokenizer that aggregates noisy regional signals into robust tokens representing functional networks, and a self-distillation objective that enforces representational stability across time.
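The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer here is a fixed per-network average (the paper's tokenizer is learned), the stability term is a plain cosine distance between two temporal windows rather than a teacher-student self-distillation loss, and all function names and the region-to-network assignment are hypothetical.

```python
import math

def network_tokens(timeseries, network_of_region, num_networks):
    """Average regional signals into one token per functional network.

    timeseries: list of T frames, each a list of R regional values.
    network_of_region: list of R network indices (assumed given, not learned).
    """
    tokens = []
    for frame in timeseries:
        sums = [0.0] * num_networks
        counts = [0] * num_networks
        for value, net in zip(frame, network_of_region):
            sums[net] += value
            counts[net] += 1
        tokens.append([s / c if c else 0.0 for s, c in zip(sums, counts)])
    return tokens

def window_embedding(tokens):
    """Summarize a window of tokens by its mean over time."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]

def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def stability_loss(timeseries, network_of_region, num_networks):
    """Penalize disagreement between embeddings of the first and second half."""
    half = len(timeseries) // 2
    early = window_embedding(
        network_tokens(timeseries[:half], network_of_region, num_networks))
    late = window_embedding(
        network_tokens(timeseries[half:], network_of_region, num_networks))
    return cosine_distance(early, late)
```

Averaging within a network reduces region-level noise, and the loss is near zero when both halves of a scan yield the same network-level summary, which is the kind of temporal stability the abstract describes.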
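Paper | ICLR 2026 Poster | 2026 | medical imaging | A Brain Graph Foundation Model: Pre-Training and Prompt-Tuning Across Multiple Atlases and Disorders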
As large language models (LLMs) continue to revolutionize AI research, there is a growing interest in building large-scale brain foundation models to advance neuroscience. While most existing brain foundation models are pre-trained on time-series signals or connectome features, we propose a novel graph-based pre-training paradigm for constructing a brain graph foundation model. In this paper, we introduce the Brain Graph Foundation Model, termed BrainGFM, a unified framework that leverages graph contrastive learning and graph masked autoencoders for large-scale fMRI-based pre-training. BrainGFM is pre-trained on a diverse mixture of brain atlases with varying parcellations, significantly expanding the pre-training corpus and enhancing the model’s ability to generalize across heterogeneous fMRI-derived brain representations. Code/project link: https://github.com/weixinxu666/BrainGFM
Paper | ICLR 2026 Poster | 2026 | clinical prediction | Spike-Based Digital Brain: A Novel Foundation Model for Brain Activity Analysis
Modeling the temporal dynamics of the human brain remains a core challenge in computational neuroscience and artificial intelligence. Traditional methods often ignore the biological spike characteristics of brain activity and struggle to reveal the dynamic dependencies and causal interactions between brain regions, limiting their effectiveness in brain function research and clinical applications. To address this issue, we propose the Spike-based Digital Brain (Spike-DB), a novel foundation model that introduces the spike computing paradigm into brain time series modeling. Spike-DB encodes fMRI signals as spike trains and learns the temporal driving relationships between anchor and target regions to achieve high-precision prediction of brain activity and reveal underlying causal dependencies and dynamic relationship characteristics. Code/project link: https://github.com/UAIBC-Brain/Spike-DB
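The abstract does not say how fMRI signals are turned into spike trains. As one common possibility, the sketch below uses delta (threshold-crossing) encoding: a spike is emitted whenever the signal has moved by at least a threshold since the last spike. This is an assumption made for illustration, not the Spike-DB encoder, and the function name and threshold value are hypothetical.

```python
def delta_encode(signal, threshold):
    """Encode a 1-D signal as a train of +1 / -1 / 0 events.

    A +1 (or -1) spike is emitted when the signal has risen (or fallen) by at
    least `threshold` relative to the value at the previous spike. The first
    sample only sets the reference level and never spikes.
    """
    if not signal:
        return []
    spikes = [0]
    reference = signal[0]
    for value in signal[1:]:
        if value - reference >= threshold:
            spikes.append(1)
            reference = value
        elif reference - value >= threshold:
            spikes.append(-1)
            reference = value
        else:
            spikes.append(0)
    return spikes

# Example: one regional time course, threshold chosen arbitrarily.
train = delta_encode([0.0, 0.5, 0.6, 1.5, 1.4], threshold=0.4)
```

The output has one event per time point, so spike trains from an anchor region and a target region stay aligned in time and can be compared lag by lag.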
Paper | ICLR 2026 Poster | 2026 | medical imaging | Unified Brain Surface and Volume Registration
Accurate registration of brain MRI scans is fundamental for cross-subject analysis in neuroscientific studies. This involves aligning both the cortical surface of the brain and the interior volume. Traditional methods treat volumetric and surface-based registration separately, which often leads to inconsistencies that limit downstream analyses. We propose a deep learning framework, UCS, that registers 3D brain MRI images by jointly aligning both cortical and subcortical regions, through a unified volume-and-surface-based representation. Our approach leverages an intermediate spherical coordinate space to bridge anatomical surface topology with volumetric anatomy, enabling consistent and anatomically accurate alignment.
Paper | ICLR 2026 Poster | 2026 | medical imaging | Beyond Grid-Locked Voxels: Neural Response Functions for Continuous Brain Encoding
Neural encoding models aim to predict fMRI-measured brain responses to natural images. fMRI data is acquired as a 3D volume of voxels, where each voxel has a defined spatial location in the brain. However, conventional encoding models often flatten this volume into a 1D vector and treat voxel responses as independent outputs. This removes spatial context, discards anatomical information, and ties each model to a subject-specific voxel grid. We introduce the Neural Response Function (NRF), a framework that models fMRI activity as a continuous function over anatomical space rather than a flat vector of voxels. NRF represents brain activity as a continuous implicit function: given an image and a spatial coordinate (x, y, z) in standardized MNI space, the model predicts the response at that location.
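The interface described above, a function of an image and a coordinate instead of a fixed vector of voxels, can be sketched as follows. This is a toy with random, untrained weights and is not the NRF architecture: the sinusoidal coordinate encoding, the bilinear form, the 100 mm scaling, and every name in the code are assumptions chosen only to show the calling convention.

```python
import math
import random

def positional_encoding(coord, num_freqs=4):
    """Map a normalized (x, y, z) coordinate to sin/cos features."""
    feats = []
    for c in coord:
        for k in range(num_freqs):
            feats.append(math.sin((2 ** k) * c))
            feats.append(math.cos((2 ** k) * c))
    return feats

class ToyResponseFunction:
    """Hypothetical stand-in for f(image_features, xyz) -> scalar response."""

    def __init__(self, image_dim, num_freqs=4, seed=0):
        rng = random.Random(seed)
        self.num_freqs = num_freqs
        coord_dim = 3 * 2 * num_freqs
        # Random bilinear weights; a real model would learn these from data.
        self.weights = [[rng.gauss(0.0, 0.1) for _ in range(coord_dim)]
                        for _ in range(image_dim)]

    def __call__(self, image_features, coord_mm):
        # Scale millimetre coordinates to roughly [-1, 1] (assumed convention).
        coord = [c / 100.0 for c in coord_mm]
        enc = positional_encoding(coord, self.num_freqs)
        return sum(image_features[i] * self.weights[i][j] * enc[j]
                   for i in range(len(image_features))
                   for j in range(len(enc)))

model = ToyResponseFunction(image_dim=3)
image = [1.0, 0.5, -0.2]                      # placeholder image features
on_grid = model(image, (10.0, -20.0, 30.0))   # a voxel-centre coordinate
off_grid = model(image, (10.7, -20.3, 30.1))  # any point in between also works
```

Because the model is queried by coordinate, the same function can be evaluated on any subject's voxel grid once that grid is expressed in the shared coordinate space, which is the property the abstract contrasts with flattened, subject-specific voxel vectors.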