Poster Papers by Gelei Deng
3 papers found
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment
Chenhang Cui, An Zhang, Yiyang Zhou et al.
ICLR 2025 poster
RSafe: Incentivizing proactive reasoning to build robust and adaptive LLM safeguards
Jingnan Zheng, Xiangtian Ji, Yijun Lu et al.
NeurIPS 2025 poster
9 citations
Safe + Safe = Unsafe? Exploring How Safe Images Can Be Exploited to Jailbreak Large Vision-Language Models
Chenhang Cui, Gelei Deng, An Zhang et al.
NeurIPS 2025 poster