Victor Veitch
Papers: 4
Total Citations: 1
Papers (4)
RATE: Causal Explainability of Reward Models with Imperfect Counterfactuals
ICML 2025
1 citation
On the Origins of Linear Representations in Large Language Models
ICML 2024
0 citations
The Linear Representation Hypothesis and the Geometry of Large Language Models
ICML 2024
0 citations
Transforming and Combining Rewards for Aligning Large Language Models
ICML 2024
0 citations