Ruoxi Jia
10
Papers
20
Total Citations
Papers (10)
LLMs Can Plan Only If We Tell Them
ICLR 2025
16
citations
Detecting Adversarial Data Using Perturbation Forgery
CVPR 2025
2
citations
Just Enough Shifts: Mitigating Over-Refusal in Aligned Language Models with Targeted Representation Fine-Tuning
ICML 2025
2
citations
Efficient Input-level Backdoor Defense on Text-to-Image Synthesis via Neuron Activation Variation
ICCV 2025
0
citations
The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes
CVPR 2024
0
citations
Position: A Safe Harbor for AI Evaluation and Red Teaming
ICML 2024
0
citations
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
ICML 2024
0
citations
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
ICML 2024
0
citations
Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits
ICML 2024
0
citations
Probing Hidden Knowledge Holes in Unlearned LLMs
NeurIPS 2025
0
citations