Nicolas Flammarion
7
Papers
422
Total Citations
Papers (7)
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
ICLR 2025
375
citations
Is In-Context Learning Sufficient for Instruction Following in LLMs?
ICLR 2025, arXiv
21
citations
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
NeurIPS 2025, arXiv
18
citations
Selective Induction Heads: How Transformers Select Causal Structures in Context
ICLR 2025, arXiv
4
citations
Learning In-context $n$-grams with Transformers: Sub-$n$-grams Are Near-Stationary Points
ICML 2025
3
citations
Long-Context Linear System Identification
ICLR 2025, arXiv
1
citations
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
ICML 2024
0
citations