Paper · ICLR 2026 Poster · 2026 · clinical prediction

Pixel-Level Residual Diffusion Transformer: Scalable 3D CT Volume Generation

Accepted as a poster at ICLR 2026. Generating high-resolution 3D CT volumes with fine details remains challenging due to the substantial computational demands and optimization difficulties inherent to existing generative models. In this paper, we propose the Pixel-Level Residual Diffusion Transformer (PRDiT), a scalable generative framework that synthesizes high-quality 3D medical volumes directly at the voxel level. PRDiT introduces a two-stage training architecture comprising 1) a local denoiser, in the form of an MLP-based blind estimator operating on overlapping 3D patches, that efficiently separates low-frequency structures, and 2) a global residual diffusion transformer that employs memory-efficient attention to model and refine high-frequency residuals across entire volumes. This coarse-to-fine modeling strategy simplifies optimization, enhances training stability, and effectively preserves subtle structures without the limitations of an autoencoder bottleneck.
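The coarse-to-fine split described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It only shows, under stated assumptions, how a local estimator applied to overlapping 3D patches yields a low-frequency volume, and how the remaining high-frequency residual is what a global model would then have to capture. The function names (`patch_starts`, `coarse_estimate`), the patch size and stride, and the patch-mean stand-in for the learned MLP denoiser are all hypothetical choices made for illustration.

```python
import numpy as np


def patch_starts(size, patch, stride):
    """Start indices along one axis so that patches cover [0, size)."""
    starts = list(range(0, size - patch + 1, stride))
    if starts[-1] != size - patch:
        starts.append(size - patch)  # ensure the last patch reaches the edge
    return starts


def coarse_estimate(volume, patch=8, stride=4, local_fn=None):
    """Apply local_fn to overlapping 3D patches and average the overlaps."""
    if local_fn is None:
        # Hypothetical stand-in for a learned local denoiser: the patch mean.
        local_fn = lambda p: np.full_like(p, p.mean())
    acc = np.zeros_like(volume, dtype=np.float64)
    weight = np.zeros_like(volume, dtype=np.float64)
    depth, height, width = volume.shape
    for z in patch_starts(depth, patch, stride):
        for y in patch_starts(height, patch, stride):
            for x in patch_starts(width, patch, stride):
                sl = (slice(z, z + patch),
                      slice(y, y + patch),
                      slice(x, x + patch))
                acc[sl] += local_fn(volume[sl])
                weight[sl] += 1.0  # count how many patches touch each voxel
    return acc / weight


# Toy volume standing in for a CT scan (random values, not real data).
rng = np.random.default_rng(0)
volume = rng.normal(size=(16, 16, 16))

coarse = coarse_estimate(volume)   # low-frequency structure from local patches
residual = volume - coarse         # high-frequency detail left for a global model
```

By construction, `coarse + residual` reproduces the input volume exactly, so the second stage only needs to model the residual rather than the full volume. In the paper's setting the local stage is a trained denoiser inside a diffusion process, which this sketch does not attempt to reproduce.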

Medical Image Computing, EHR and Clinical Prediction Papers, Medical Imaging, 3D Diffusion Model, Diffusion Transformer, CT Scan, Medical Image Generation, ICLR 2026, ICLR 2026 Poster

Paper Details

Title: Pixel-Level Residual Diffusion Transformer: Scalable 3D CT Volume Generation
Authors: Zhenkai Zhang, Markus Hiller, Krista A. Ehinger, Tom Drummond
Journal/Conference: ICLR 2026 Poster
Publication year: 2026
Research area: clinical prediction