AI4Meder
Paper | ICLR 2026 Poster | 2026 | clinical prediction

Pixel-Level Residual Diffusion Transformer: Scalable 3D CT Volume Generation

Accepted as a poster at ICLR 2026. Generating high-resolution 3D CT volumes with fine details remains challenging due to the substantial computational demands and optimization difficulties inherent to existing generative models. In this paper, we propose the Pixel-Level Residual Diffusion Transformer (PRDiT), a scalable generative framework that synthesizes high-quality 3D medical volumes directly at the voxel level. PRDiT introduces a two-stage training architecture comprising 1) a local denoiser, an MLP-based blind estimator that operates on overlapping 3D patches to separate low-frequency structures efficiently, and 2) a global residual diffusion transformer that employs memory-efficient attention to model and refine high-frequency residuals across entire volumes. This coarse-to-fine modeling strategy simplifies optimization, enhances training stability, and preserves subtle structures without the limitations of an autoencoder bottleneck.
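
The coarse-to-fine split described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the per-patch mean used here is only a stand-in for the learned MLP-based estimator, and the names `patch_starts` and `local_estimate`, the patch size of 8, and the stride of 4 are all hypothetical choices made for illustration. The sketch only shows how an estimator applied to overlapping 3D patches can yield a smooth low-frequency volume, leaving a high-frequency residual of the kind a global model would then handle.

```python
import numpy as np


def patch_starts(size, patch, stride):
    """Start indices of overlapping windows covering [0, size) along one axis."""
    starts = list(range(0, size - patch + 1, stride))
    if starts[-1] != size - patch:
        starts.append(size - patch)  # extra window flush with the far border
    return starts


def local_estimate(volume, patch=8, stride=4, denoise_patch=None):
    """Apply a per-patch estimator to overlapping 3D patches and average overlaps.

    Assumes every axis of `volume` is at least `patch` voxels long.
    """
    if denoise_patch is None:
        # Stand-in for a learned estimator (assumption): the patch mean.
        def denoise_patch(p):
            return np.full_like(p, p.mean())

    acc = np.zeros(volume.shape, dtype=np.float64)     # sum of patch outputs
    weight = np.zeros(volume.shape, dtype=np.float64)  # patches covering each voxel
    starts = [patch_starts(s, patch, stride) for s in volume.shape]
    for z in starts[0]:
        for y in starts[1]:
            for x in starts[2]:
                sl = (slice(z, z + patch), slice(y, y + patch), slice(x, x + patch))
                acc[sl] += denoise_patch(volume[sl])
                weight[sl] += 1.0
    return acc / weight


# Toy volume standing in for a CT scan (random data, not real intensities).
rng = np.random.default_rng(0)
volume = rng.normal(size=(16, 16, 16))

coarse = local_estimate(volume)  # low-frequency estimate
residual = volume - coarse       # high-frequency remainder
```

By construction `coarse + residual` reproduces the input exactly, so no information is lost in the split. In the paper's setting the residual would be modeled by the global diffusion transformer rather than computed from a known volume.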


Paper details

English title
Pixel-Level Residual Diffusion Transformer: Scalable 3D CT Volume Generation
Authors
Zhenkai Zhang, Markus Hiller, Krista A. Ehinger, Tom Drummond
Journal/Conference
ICLR 2026 Poster
Publication year
2026
Research area
clinical prediction