AI4Meder
Paper | ICLR 2026 Poster | 2026 | trustworthy medical AI

Closing the Safety Gap: Surgical Concept Erasure in Visual Autoregressive Models

Accepted as a poster at ICLR 2026. The rapid progress of visual autoregressive (VAR) models has brought new opportunities for text-to-image generation, but also heightened safety concerns. Existing concept erasure techniques, primarily designed for diffusion models, fail to generalize to VARs due to their next-scale token prediction paradigm. In this paper, we first propose a novel VAR Erasure framework, **VARE**, that enables stable concept erasure in VAR models by leveraging auxiliary visual tokens to reduce fine-tuning intensity. Building upon this, we introduce **S-VARE**, a novel and effective concept erasure method designed for VAR. It incorporates a filtered cross entropy loss to precisely identify and minimally adjust unsafe visual tokens, along with a preservation loss to maintain semantic fidelity, addressing issues such as language drift and reduced diversity introduced by naïve fine-tuning.
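The abstract does not define the two losses, so the following is only a minimal sketch of the general idea, not the paper's formulation. It assumes a hypothetical filter rule (a token position counts as "unsafe" when the token id generated under the unsafe prompt differs from the one generated under a safe anchor prompt) and a hypothetical preservation term (KL divergence to a frozen copy of the model). The function names, the filter rule, the KL choice, and the 0.5 weight are all illustrative assumptions.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last (vocabulary) axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def filtered_cross_entropy(logits, safe_targets, unsafe_targets):
    """Cross entropy toward safe_targets, restricted to filtered positions.

    Assumed filter rule (not from the paper): a position is treated as
    unsafe when its safe and unsafe token ids differ.
    """
    mask = safe_targets != unsafe_targets
    if not mask.any():
        return 0.0, mask
    logp = log_softmax(logits)
    per_token = -logp[np.arange(len(safe_targets)), safe_targets]
    # Average only over the filtered positions; other tokens get no gradient.
    return float(per_token[mask].mean()), mask

def preservation_loss(logits, frozen_logits):
    """Mean KL(frozen || fine-tuned) per position (assumed form of the term)."""
    logp = log_softmax(logits)
    logq = log_softmax(frozen_logits)
    return float((np.exp(logq) * (logq - logp)).sum(axis=-1).mean())

# Toy data: 6 visual token positions over a 16-entry codebook.
rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 16))          # fine-tuned model outputs
frozen_logits = rng.normal(size=(6, 16))   # frozen original model outputs
unsafe_targets = np.array([3, 7, 7, 1, 0, 9])
safe_targets = np.array([3, 7, 2, 1, 5, 9])

fce, mask = filtered_cross_entropy(logits, safe_targets, unsafe_targets)
total = fce + 0.5 * preservation_loss(logits, frozen_logits)  # 0.5 is arbitrary
```

In this toy setup only the two positions where the token ids differ contribute to the erasure term, which is one way to read "minimally adjust unsafe visual tokens"; the preservation term is zero when the fine-tuned and frozen outputs coincide.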


Paper details

Title
Closing the Safety Gap: Surgical Concept Erasure in Visual Autoregressive Models
Authors
Xinhao Zhong, Yimin Zhou, Zhiqi Zhang, Junhao Li, Sun Yi, Bin Chen, Shu-Tao Xia, Xuan Wang, Ke Xu
Journal/Conference
ICLR 2026 Poster
Publication year
2026
Research area
trustworthy medical AI