De-An Huang
23
Papers
519
Total Citations
Papers (23)
Eureka: Human-Level Reward Design via Coding Large Language Models
ICLR 2024
471
citations
PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees
CVPR 2024
20
citations
Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought
CVPR 2025 (arXiv)
19
citations
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
CVPR 2025 (arXiv)
9
citations
Unsupervised Learning of Long-Term Motion Dynamics for Videos
CVPR 2017 (arXiv)
0
citations
Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos
CVPR 2018
0
citations
What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets
CVPR 2018
0
citations
D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation
CVPR 2019
0
citations
Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration
CVPR 2019
0
citations
Spatio-Temporal Graph for Video Captioning With Knowledge Distillation
CVPR 2020 (arXiv)
0
citations
Visual Forecasting by Imitating Dynamics in Natural Sequences
ICCV 2017 (arXiv)
0
citations
Imitation Learning for Human Pose Prediction
ICCV 2019
0
citations
Procedure Planning in Instructional Videos
ECCV 2020
0
citations
NVILA: Efficient Frontier Visual Language Models
CVPR 2025
0
citations
How Do We Use Our Hands? Discovering a Diverse Set of Common Grasps
CVPR 2015
0
citations
Forecasting Interactive Dynamics of Pedestrians With Fictitious Play
CVPR 2017 (arXiv)
0
citations
Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos
CVPR 2017 (arXiv)
0
citations
Learning to Decompose and Disentangle Representations for Video Prediction
NeurIPS 2018
0
citations
Regression Planning Networks
NeurIPS 2019
0
citations
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
NeurIPS 2022
0
citations
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
NeurIPS 2022
0
citations
Pre-Trained Language Models for Interactive Decision-Making
NeurIPS 2022
0
citations
MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training
NeurIPS 2022
0
citations