De-An Huang

23
Papers
519
Total Citations

Papers (23)

Eureka: Human-Level Reward Design via Coding Large Language Models

ICLR 2024
471
citations

PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

CVPR 2024
20
citations

Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought

CVPR 2025 (arXiv)
19
citations

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

CVPR 2025 (arXiv)
9
citations

Unsupervised Learning of Long-Term Motion Dynamics for Videos

CVPR 2017 (arXiv)
0
citations

Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos

CVPR 2018
0
citations

What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets

CVPR 2018
0
citations

D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation

CVPR 2019
0
citations

Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration

CVPR 2019
0
citations

Spatio-Temporal Graph for Video Captioning With Knowledge Distillation

CVPR 2020 (arXiv)
0
citations

Visual Forecasting by Imitating Dynamics in Natural Sequences

ICCV 2017 (arXiv)
0
citations

Imitation Learning for Human Pose Prediction

ICCV 2019
0
citations

Procedure Planning in Instructional Videos

ECCV 2020
0
citations

NVILA: Efficient Frontier Visual Language Models

CVPR 2025
0
citations

How Do We Use Our Hands? Discovering a Diverse Set of Common Grasps

CVPR 2015
0
citations

Forecasting Interactive Dynamics of Pedestrians With Fictitious Play

CVPR 2017 (arXiv)
0
citations

Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos

CVPR 2017 (arXiv)
0
citations

Learning to Decompose and Disentangle Representations for Video Prediction

NeurIPS 2018
0
citations

Regression Planning Networks

NeurIPS 2019
0
citations

Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models

NeurIPS 2022
0
citations

MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge

NeurIPS 2022
0
citations

Pre-Trained Language Models for Interactive Decision-Making

NeurIPS 2022
0
citations

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training

NeurIPS 2022
0
citations