Lei Ying

Papers: 3

Total Citations: 9

Papers (3)

Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference

Near-Optimal Regret-Queue Length Tradeoff in Online Learning for Two-Sided Markets

NeurIPS 2025, arXiv

Graph Mixup on Approximate Gromov–Wasserstein Geodesics