Qi Zhao

41
Papers
93
Total Citations

Papers (41)

AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?

ICLR 2024
81
citations

GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths

ECCV 2024
8
citations

Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer

ICML 2025
4
citations

Model Lineage Closeness Analysis

AAAI 2025
0
citations

Explainable Saliency: Articulating Reasoning with Contextual Prioritization

CVPR 2025
0
citations

SwitchTab: Switched Autoencoders Are Effective Tabular Learners

AAAI 2024
0
citations

PNeRV: Enhancing Spatial Consistency via Pyramidal Neural Representation for Videos

CVPR 2024
0
citations

Beyond Average: Individualized Visual Scanpath Prediction

CVPR 2024
0
citations

ROME is Forged in Adversity: Robust Distilled Datasets via Information Bottleneck

ICML 2025
0
citations

SALICON: Saliency in Context

CVPR 2015
0
citations

Label Consistent Quadratic Surrogate Model for Visual Saliency Prediction

CVPR 2015
0
citations

A Paradigm for Building Generalized Models of Human Image Perception Through Data Fusion

CVPR 2016
0
citations

Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks

CVPR 2017
0
citations

Emotional Attention: A Study of Image Sentiment and Visual Attention

CVPR 2018
0
citations

Learning to Detect Human-Object Interactions With Knowledge

CVPR 2019
0
citations

Learning to Learn From Noisy Labeled Data

CVPR 2019
0
citations

Fantastic Answers and Where to Find Them: Immersive Question-Directed Visual Attention

CVPR 2020
0
citations

Predicting Human Scanpaths in Visual Question Answering

CVPR 2021
0
citations

Explicit Knowledge Incorporation for Visual Reasoning

CVPR 2021
0
citations

REX: Reasoning-Aware and Grounded Explanation

CVPR 2022 (arXiv)
0
citations

Query and Attention Augmentation for Knowledge-Based Explainable Reasoning

CVPR 2022
0
citations

VisualHow: Multimodal Problem Solving

CVPR 2022
0
citations

Divide and Conquer: Answering Questions With Object Factorization and Compositional Reasoning

CVPR 2023 (arXiv)
0
citations

DNeRV: Modeling Inherent Dynamics via Difference Neural Representation for Videos

CVPR 2023 (arXiv)
0
citations

SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks

ICCV 2015
0
citations

Dual-Glance Model for Deciphering Social Relationships

ICCV 2017 (arXiv)
0
citations

Learning Visual Attention to Identify People With Autism Spectrum Disorder

ICCV 2017
0
citations

Attention-Based Autism Spectrum Disorder Screening With Privileged Modality

ICCV 2019
0
citations

Toward Multi-Granularity Decision-Making: Explicit Visual Reasoning with Hierarchical Knowledge

ICCV 2023
0
citations

AiR: Attention with Reasoning Capability

ECCV 2020
0
citations

n-Reference Transfer Learning for Saliency Prediction

ECCV 2020
0
citations

New Datasets and Models for Contextual Reasoning in Visual Dialog

ECCV 2022
0
citations

Two Sides of the Same Coin: Learning the Backdoor to Remove the Backdoor

AAAI 2025
0
citations

Synthetic Video Enhances Physical Fidelity in Video Synthesis

ICCV 2025
0
citations

CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models

ICCV 2025
0
citations

Unsupervised Learning of View-invariant Action Representations

NeurIPS 2018
0
citations

Learning metrics for persistence-based summaries and applications for graph classification

NeurIPS 2019
0
citations

Learning to Predict Trustworthiness with Steep Slope Loss

NeurIPS 2021
0
citations

NN-Baker: A Neural-network Infused Algorithmic Framework for Optimization Problems on Geometric Intersection Graphs

NeurIPS 2021
0
citations

What Do Deep Saliency Models Learn about Visual Attention?

NeurIPS 2023
0
citations

Safe Subspace Screening for Nuclear Norm Regularized Least Squares Problems

ICML 2015
0
citations