Dacheng Tao
100 Papers
398 Total Citations
Papers (100)
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
ICCV 2025 arXiv
206
citations
Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models
AAAI 2025 arXiv
73
citations
Revisiting Backdoor Attacks against Large Vision-Language Models from Domain Shift
CVPR 2025 arXiv
25
citations
Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages
ICLR 2024 arXiv
25
citations
SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection
AAAI 2024 arXiv
24
citations
One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls
CVPR 2024 arXiv
9
citations
MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI
ICCV 2025
7
citations
Synergy of Sight and Semantics: Visual Intention Understanding with CLIP
ECCV 2024
7
citations
Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning
ICML 2025 arXiv
6
citations
Free-Form Motion Control: Controlling the 6D Poses of Camera and Objects in Video Generation
ICCV 2025 arXiv
4
citations
Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer
ICML 2025 arXiv
4
citations
Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler
NeurIPS 2025 arXiv
2
citations
Learning system dynamics without forgetting
ICLR 2025
2
citations
ICLShield: Exploring and Mitigating In-Context Learning Backdoor Attacks
ICML 2025 arXiv
2
citations
AiDE-Q: Synthetic Labeled Datasets Can Enhance Learning Models for Quantum Property Estimation
NeurIPS 2025 arXiv
1
citations
LLM Data Selection and Utilization via Dynamic Bi-level Optimization
ICML 2025 arXiv
1
citations
LoRA Recycle: Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs
CVPR 2025
0
citations
Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition
CVPR 2025 arXiv
0
citations
Harnessing Text-to-Image Diffusion Models for Point Cloud Self-Supervised Learning
ICCV 2025 arXiv
0
citations
CopyrightShield: Enhancing Diffusion Model Security Against Copyright Infringement Attacks
ICCV 2025
0
citations
Rethink Sparse Signals for Pose-guided Text-to-image Generation
ICCV 2025 arXiv
0
citations
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
NeurIPS 2025 arXiv
0
citations
Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning
AAAI 2025 arXiv
0
citations
Modeling All Response Surfaces in One for Conditional Search Spaces
AAAI 2025 arXiv
0
citations
TD²-Net: Toward Denoising and Debiasing for Video Scene Graph Generation
AAAI 2024
0
citations
Multi-Step Denoising Scheduled Sampling: Towards Alleviating Exposure Bias for Diffusion Models
AAAI 2024
0
citations
Sheared Backpropagation for Fine-tuning Foundation Models
CVPR 2024
0
citations
UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather
CVPR 2024 arXiv
0
citations
FREE: Faster and Better Data-Free Meta-Learning
CVPR 2024 arXiv
0
citations
Local-consistent Transformation Learning for Rotation-invariant Point Cloud Analysis
CVPR 2024 arXiv
0
citations
Learn from Downstream and Be Yourself in Multimodal Large Language Models Fine-Tuning
ICML 2025 arXiv
0
citations
HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning
ICML 2024 arXiv
0
citations
Q-value Regularized Transformer for Offline Reinforcement Learning
ICML 2024 arXiv
0
citations
Towards Theoretical Understandings of Self-Consuming Generative Models
ICML 2024 arXiv
0
citations
Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications
ICML 2024 arXiv
0
citations
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
ICML 2024 arXiv
0
citations
Generalization Analysis of Stochastic Weight Averaging with General Sampling
ICML 2024
0
citations
Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models
ICML 2024 arXiv
0
citations
Representation Surgery for Multi-Task Model Merging
ICML 2024 arXiv
0
citations
Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases
ICML 2024 arXiv
0
citations
GPS-Net: Graph Property Sensing Network for Scene Graph Generation
CVPR 2020
0
citations
Recurrent Feature Reasoning for Image Inpainting
CVPR 2020 arXiv
0
citations
On Positive-Unlabeled Classification in GAN
CVPR 2020 arXiv
0
citations
Distilling Knowledge From Graph Convolutional Networks
CVPR 2020 arXiv
0
citations
Learning Oracle Attention for High-Fidelity Face Completion
CVPR 2020 arXiv
0
citations
Syntax-Aware Action Targeting for Video Captioning
CVPR 2020
0
citations
Context Aware Graph Convolution for Skeleton-Based Action Recognition
CVPR 2020
0
citations
FeatureFlow: Robust Video Interpolation via Structure-to-Texture Generation
CVPR 2020
0
citations
Learning Unseen Concepts via Hierarchical Decomposition and Composition
CVPR 2020
0
citations
PuppeteerGAN: Arbitrary Portrait Animation With Semantic-Aware Appearance Transformation
CVPR 2020
0
citations
AdderSR: Towards Energy Efficient Image Super-Resolution
CVPR 2021 arXiv
0
citations
Online Multiple Object Tracking With Cross-Task Synergy
CVPR 2021 arXiv
0
citations
Scene Essence
CVPR 2021
0
citations
HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens
CVPR 2021 arXiv
0
citations
Tree-Like Decision Distillation
CVPR 2021
0
citations
Learning Progressive Point Embeddings for 3D Point Cloud Generation
CVPR 2021
0
citations
Turning Frequency to Resolution: Video Super-Resolution via Event Cameras
CVPR 2021
0
citations
Glance and Gaze: Inferring Action-Aware Points for One-Stage Human-Object Interaction Detection
CVPR 2021 arXiv
0
citations
Where and What? Examining Interpretable Disentangled Representations
CVPR 2021 arXiv
0
citations
Detecting Human-Object Interaction via Fabricated Compositional Learning
CVPR 2021 arXiv
0
citations
Affordance Transfer Learning for Human-Object Interaction Detection
CVPR 2021 arXiv
0
citations
Manifold Regularized Dynamic Network Pruning
CVPR 2021 arXiv
0
citations
Amalgamating Knowledge From Heterogeneous Graph Neural Networks
CVPR 2021
0
citations
Contrastive Boundary Learning for Point Cloud Segmentation
CVPR 2022 arXiv
0
citations
Alleviating Semantics Distortion in Unsupervised Low-Level Image-to-Image Translation via Structure Consistency Constraint
CVPR 2022
0
citations
BatchFormer: Learning To Explore Sample Relationships for Robust Representation Learning
CVPR 2022 arXiv
0
citations
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
CVPR 2022 arXiv
0
citations
GMFlow: Learning Optical Flow via Global Matching
CVPR 2022 arXiv
0
citations
Recurrent Glimpse-Based Decoder for Detection With Transformer
CVPR 2022 arXiv
0
citations
Learning To Collaborate in Decentralized Learning of Personalized Models
CVPR 2022
0
citations
ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation
CVPR 2022
0
citations
Source-Free Domain Adaptation via Distribution Estimation
CVPR 2022 arXiv
0
citations
Distillation Using Oracle Queries for Transformer-Based Human-Object Interaction Detection
CVPR 2022
0
citations
Defensive Patches for Robust Recognition in the Physical World
CVPR 2022 arXiv
0
citations
HL-Net: Heterophily Learning Network for Scene Graph Generation
CVPR 2022
0
citations
Modeling Image Composition for Complex Scene Generation
CVPR 2022 arXiv
0
citations
Learning Affordance Grounding From Exocentric Images
CVPR 2022 arXiv
0
citations
Few-Shot Backdoor Defense Using Shapley Estimation
CVPR 2022 arXiv
0
citations
Patch Slimming for Efficient Vision Transformers
CVPR 2022 arXiv
0
citations
RU-Net: Regularized Unrolling Network for Scene Graph Generation
CVPR 2022
0
citations
Continual Learning With Lifelong Vision Transformer
CVPR 2022
0
citations
Self-Augmented Unpaired Image Dehazing via Density and Depth Decomposition
CVPR 2022
0
citations
FIBA: Frequency-Injection Based Backdoor Attack in Medical Image Analysis
CVPR 2022 arXiv
0
citations
Bridged Transformer for Vision and Point Cloud 3D Object Detection
CVPR 2022
0
citations
Fine-Tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning
CVPR 2022 arXiv
0
citations
Dynamic Focus-Aware Positional Queries for Semantic Segmentation
CVPR 2023 arXiv
0
citations
Leverage Interactive Affinity for Affordance Learning
CVPR 2023
0
citations
Upcycling Models Under Domain and Category Shift
CVPR 2023
0
citations
Learnable Skeleton-Aware 3D Point Cloud Sampling
CVPR 2023
0
citations
CLAMP: Prompt-Based Contrastive Learning for Connecting Language and Animal Pose
CVPR 2023 arXiv
0
citations
Exploring the Relationship Between Architectural Design and Adversarially Robust Generalization
CVPR 2023
0
citations
Generating Holistic 3D Human Motion From Speech
CVPR 2023 arXiv
0
citations
Architecture, Dataset and Model-Scale Agnostic Data-Free Meta-Learning
CVPR 2023 arXiv
0
citations
DeepSolo: Let Transformer Decoder With Explicit Points Solo for Text Spotting
CVPR 2023 arXiv
0
citations
Make Landscape Flatter in Differentially Private Federated Learning
CVPR 2023 arXiv
0
citations
From Images to Textual Prompts: Zero-Shot Visual Question Answering With Frozen Large Language Models
CVPR 2023
0
citations
Deep Graph Reprogramming
CVPR 2023 arXiv
0
citations
TriDet: Temporal Action Detection With Relative Boundary Modeling
CVPR 2023 arXiv
0
citations
Referring Image Matting
CVPR 2023 arXiv
0
citations
Out-of-Boundary View Synthesis Towards Full-Frame Video Stabilization
ICCV 2021 arXiv
0
citations