Yu-Gang Jiang
28 Papers
659 Total Citations
Papers (28)
NuScenes-QA: A Multi-Modal Visual Question Answering Benchmark for Autonomous Driving
AAAI 2024, arXiv
266
citations
SimDA: Simple Diffusion Adapter for Efficient Video Generation
CVPR 2024
106
citations
Adversarial Prompt Tuning for Vision-Language Models
ECCV 2024
33
citations
CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation
ICCV 2025, arXiv
33
citations
OmniViD: A Generative Framework for Universal Video Understanding
CVPR 2024
29
citations
Doubly Abductive Counterfactual Inference for Text-based Image Editing
CVPR 2024
25
citations
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
ICCV 2025
24
citations
MotionFollower: Editing Video Motion via Score-Guided Diffusion
ICCV 2025
22
citations
PromptFusion: Decoupling Stability and Plasticity for Continual Learning
ECCV 2024
21
citations
AdaDiff: Adaptive Step Selection for Fast Diffusion Models
AAAI 2025
19
citations
LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation
AAAI 2024, arXiv
17
citations
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
ICLR 2025
16
citations
Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image
ECCV 2024
12
citations
DuMo: Dual Encoder Modulation Network for Precise Concept Erasure
AAAI 2025
7
citations
Out of Length Text Recognition with Sub-String Matching
AAAI 2025
7
citations
Learning to Rank Patches for Unbiased Image Redundancy Reduction
CVPR 2024
6
citations
REDUCIO! Generating 1K Video within 16 Seconds using Extremely Compressed Motion Latents
ICCV 2025
5
citations
Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation
ICCV 2025, arXiv
5
citations
AIM: Additional Image Guided Generation of Transferable Adversarial Attacks
AAAI 2025
3
citations
FaceA-Net: Facial Attribute-Driven ID Preserving Image Generation Network
AAAI 2025
1
citation
From Holistic to Localized: Local Enhanced Adapters for Efficient Visual Instruction Fine-Tuning
ICCV 2025
1
citation
Achieving More with Less: Additive Prompt Tuning for Rehearsal-Free Class-Incremental Learning
ICCV 2025
1
citation
MotionEditor: Editing Video Motion via Content-Aware Diffusion
CVPR 2024
0
citations
VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks
ICCV 2025
0
citations
SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition
ICCV 2025
0
citations
IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves
ICCV 2025
0
citations
Comprehensive Multi-Modal Prototypes Are Simple and Effective Classifiers for Vast-Vocabulary Object Detection
AAAI 2025
0
citations
Instance-Aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning
AAAI 2024, arXiv
0
citations