Dacheng Tao

100
Papers
398
Total Citations

Papers (100)

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

ICCV 2025 · arXiv
206
citations

Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models

AAAI 2025 · arXiv
73
citations

Revisiting Backdoor Attacks against Large Vision-Language Models from Domain Shift

CVPR 2025 · arXiv
25
citations

Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages

ICLR 2024 · arXiv
25
citations

SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection

AAAI 2024 · arXiv
24
citations

One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls

CVPR 2024 · arXiv
9
citations

MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI

ICCV 2025
7
citations

Synergy of Sight and Semantics: Visual Intention Understanding with CLIP

ECCV 2024
7
citations

Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning

ICML 2025 · arXiv
6
citations

Free-Form Motion Control: Controlling the 6D Poses of Camera and Objects in Video Generation

ICCV 2025 · arXiv
4
citations

Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer

ICML 2025 · arXiv
4
citations

Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler

NeurIPS 2025 · arXiv
2
citations

Learning system dynamics without forgetting

ICLR 2025
2
citations

ICLShield: Exploring and Mitigating In-Context Learning Backdoor Attacks

ICML 2025 · arXiv
2
citations

AiDE-Q: Synthetic Labeled Datasets Can Enhance Learning Models for Quantum Property Estimation

NeurIPS 2025 · arXiv
1
citation

LLM Data Selection and Utilization via Dynamic Bi-level Optimization

ICML 2025 · arXiv
1
citation

LoRA Recycle: Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs

CVPR 2025
0
citations

Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition

CVPR 2025 · arXiv
0
citations

Harnessing Text-to-Image Diffusion Models for Point Cloud Self-Supervised Learning

ICCV 2025 · arXiv
0
citations

CopyrightShield: Enhancing Diffusion Model Security Against Copyright Infringement Attacks

ICCV 2025
0
citations

Rethink Sparse Signals for Pose-guided Text-to-image Generation

ICCV 2025 · arXiv
0
citations

Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning

NeurIPS 2025 · arXiv
0
citations

Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning

AAAI 2025 · arXiv
0
citations

Modeling All Response Surfaces in One for Conditional Search Spaces

AAAI 2025 · arXiv
0
citations

TD²-Net: Toward Denoising and Debiasing for Video Scene Graph Generation

AAAI 2024
0
citations

Multi-Step Denoising Scheduled Sampling: Towards Alleviating Exposure Bias for Diffusion Models

AAAI 2024
0
citations

Sheared Backpropagation for Fine-tuning Foundation Models

CVPR 2024
0
citations

UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

CVPR 2024 · arXiv
0
citations

FREE: Faster and Better Data-Free Meta-Learning

CVPR 2024 · arXiv
0
citations

Local-consistent Transformation Learning for Rotation-invariant Point Cloud Analysis

CVPR 2024 · arXiv
0
citations

Learn from Downstream and Be Yourself in Multimodal Large Language Models Fine-Tuning

ICML 2025 · arXiv
0
citations

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

ICML 2024 · arXiv
0
citations

Q-value Regularized Transformer for Offline Reinforcement Learning

ICML 2024 · arXiv
0
citations

Towards Theoretical Understandings of Self-Consuming Generative Models

ICML 2024 · arXiv
0
citations

Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications

ICML 2024 · arXiv
0
citations

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts

ICML 2024 · arXiv
0
citations

Generalization Analysis of Stochastic Weight Averaging with General Sampling

ICML 2024
0
citations

Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models

ICML 2024 · arXiv
0
citations

Representation Surgery for Multi-Task Model Merging

ICML 2024 · arXiv
0
citations

Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases

ICML 2024 · arXiv
0
citations

GPS-Net: Graph Property Sensing Network for Scene Graph Generation

CVPR 2020
0
citations

Recurrent Feature Reasoning for Image Inpainting

CVPR 2020 · arXiv
0
citations

On Positive-Unlabeled Classification in GAN

CVPR 2020 · arXiv
0
citations

Distilling Knowledge From Graph Convolutional Networks

CVPR 2020 · arXiv
0
citations

Learning Oracle Attention for High-Fidelity Face Completion

CVPR 2020 · arXiv
0
citations

Syntax-Aware Action Targeting for Video Captioning

CVPR 2020
0
citations

Context Aware Graph Convolution for Skeleton-Based Action Recognition

CVPR 2020
0
citations

FeatureFlow: Robust Video Interpolation via Structure-to-Texture Generation

CVPR 2020
0
citations

Learning Unseen Concepts via Hierarchical Decomposition and Composition

CVPR 2020
0
citations

PuppeteerGAN: Arbitrary Portrait Animation With Semantic-Aware Appearance Transformation

CVPR 2020
0
citations

AdderSR: Towards Energy Efficient Image Super-Resolution

CVPR 2021 · arXiv
0
citations

Online Multiple Object Tracking With Cross-Task Synergy

CVPR 2021 · arXiv
0
citations

Scene Essence

CVPR 2021
0
citations

HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens

CVPR 2021 · arXiv
0
citations

Tree-Like Decision Distillation

CVPR 2021
0
citations

Learning Progressive Point Embeddings for 3D Point Cloud Generation

CVPR 2021
0
citations

Turning Frequency to Resolution: Video Super-Resolution via Event Cameras

CVPR 2021
0
citations

Glance and Gaze: Inferring Action-Aware Points for One-Stage Human-Object Interaction Detection

CVPR 2021 · arXiv
0
citations

Where and What? Examining Interpretable Disentangled Representations

CVPR 2021 · arXiv
0
citations

Detecting Human-Object Interaction via Fabricated Compositional Learning

CVPR 2021 · arXiv
0
citations

Affordance Transfer Learning for Human-Object Interaction Detection

CVPR 2021 · arXiv
0
citations

Manifold Regularized Dynamic Network Pruning

CVPR 2021 · arXiv
0
citations

Amalgamating Knowledge From Heterogeneous Graph Neural Networks

CVPR 2021
0
citations

Contrastive Boundary Learning for Point Cloud Segmentation

CVPR 2022 · arXiv
0
citations

Alleviating Semantics Distortion in Unsupervised Low-Level Image-to-Image Translation via Structure Consistency Constraint

CVPR 2022
0
citations

BatchFormer: Learning To Explore Sample Relationships for Robust Representation Learning

CVPR 2022 · arXiv
0
citations

DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers

CVPR 2022 · arXiv
0
citations

GMFlow: Learning Optical Flow via Global Matching

CVPR 2022 · arXiv
0
citations

Recurrent Glimpse-Based Decoder for Detection With Transformer

CVPR 2022 · arXiv
0
citations

Learning To Collaborate in Decentralized Learning of Personalized Models

CVPR 2022
0
citations

ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation

CVPR 2022
0
citations

Source-Free Domain Adaptation via Distribution Estimation

CVPR 2022 · arXiv
0
citations

Distillation Using Oracle Queries for Transformer-Based Human-Object Interaction Detection

CVPR 2022
0
citations

Defensive Patches for Robust Recognition in the Physical World

CVPR 2022 · arXiv
0
citations

HL-Net: Heterophily Learning Network for Scene Graph Generation

CVPR 2022
0
citations

Modeling Image Composition for Complex Scene Generation

CVPR 2022 · arXiv
0
citations

Learning Affordance Grounding From Exocentric Images

CVPR 2022 · arXiv
0
citations

Few-Shot Backdoor Defense Using Shapley Estimation

CVPR 2022 · arXiv
0
citations

Patch Slimming for Efficient Vision Transformers

CVPR 2022 · arXiv
0
citations

RU-Net: Regularized Unrolling Network for Scene Graph Generation

CVPR 2022
0
citations

Continual Learning With Lifelong Vision Transformer

CVPR 2022
0
citations

Self-Augmented Unpaired Image Dehazing via Density and Depth Decomposition

CVPR 2022
0
citations

FIBA: Frequency-Injection Based Backdoor Attack in Medical Image Analysis

CVPR 2022 · arXiv
0
citations

Bridged Transformer for Vision and Point Cloud 3D Object Detection

CVPR 2022
0
citations

Fine-Tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning

CVPR 2022 · arXiv
0
citations

Dynamic Focus-Aware Positional Queries for Semantic Segmentation

CVPR 2023 · arXiv
0
citations

Leverage Interactive Affinity for Affordance Learning

CVPR 2023
0
citations

Upcycling Models Under Domain and Category Shift

CVPR 2023
0
citations

Learnable Skeleton-Aware 3D Point Cloud Sampling

CVPR 2023
0
citations

CLAMP: Prompt-Based Contrastive Learning for Connecting Language and Animal Pose

CVPR 2023 · arXiv
0
citations

Exploring the Relationship Between Architectural Design and Adversarially Robust Generalization

CVPR 2023
0
citations

Generating Holistic 3D Human Motion From Speech

CVPR 2023 · arXiv
0
citations

Architecture, Dataset and Model-Scale Agnostic Data-Free Meta-Learning

CVPR 2023 · arXiv
0
citations

DeepSolo: Let Transformer Decoder With Explicit Points Solo for Text Spotting

CVPR 2023 · arXiv
0
citations

Make Landscape Flatter in Differentially Private Federated Learning

CVPR 2023 · arXiv
0
citations

From Images to Textual Prompts: Zero-Shot Visual Question Answering With Frozen Large Language Models

CVPR 2023
0
citations

Deep Graph Reprogramming

CVPR 2023 · arXiv
0
citations

TriDet: Temporal Action Detection With Relative Boundary Modeling

CVPR 2023 · arXiv
0
citations

Referring Image Matting

CVPR 2023 · arXiv
0
citations

Out-of-Boundary View Synthesis Towards Full-Frame Video Stabilization

ICCV 2021 · arXiv
0
citations