Poster Papers by Alexander Panfilov
2 papers found
An Interpretable N-gram Perplexity Threat Model for Large Language Model Jailbreaks
Valentyn Boreiko, Alexander Panfilov, Václav Voráček et al.
ICML 2025, poster, arXiv:2410.16222
Provable Compositional Generalization for Object-Centric Learning
Thaddäus Wiedemer, Jack Brady, Alexander Panfilov et al.
ICLR 2024, poster, arXiv:2310.05327