Chenjia Bai
12 Papers
68 Total Citations
Papers (12)
Online Preference Alignment for Language Models via Count-based Exploration
ICLR 2025
19
citations
Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning
NeurIPS 2025
13
citations
OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments
AAAI 2024
13
citations
Radiology Report Generation via Multi-objective Preference Optimization
AAAI 2025
9
citations
Forward KL Regularized Preference Optimization for Aligning Diffusion Policies
AAAI 2025
7
citations
Task-Agnostic Pre-training and Task-Guided Fine-tuning for Versatile Diffusion Planner
ICML 2025
4
citations
Information-Theoretic Reward Decomposition for Generalizable RLHF
NeurIPS 2025
3
citations
Constrained Ensemble Exploration for Unsupervised Skill Discovery
ICML 2024
0
citations
Cross-Domain Policy Adaptation by Capturing Representation Mismatch
ICML 2024
0
citations
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
ICML 2024
0
citations
How Does Goal Relabeling Improve Sample Efficiency?
ICML 2024
0
citations
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation
ICML 2024
0
citations