Pan Zhou
57
Papers
230
Total Citations
Papers (57)
Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation
CVPR 2024
65
citations
Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior
CVPR 2024
51
citations
Diffusion Time-step Curriculum for One Image to 3D Generation
CVPR 2024
24
citations
GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
ICLR 2025
23
citations
Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World
CVPR 2024
11
citations
BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization
NeurIPS 2025
11
citations
Hiding Imperceptible Noise in Curvature-Aware Patches for 3D Point Cloud Attack
ECCV 2024
10
citations
BadToken: Token-level Backdoor Attacks to Multi-modal Large Language Models
CVPR 2025
9
citations
Multi-Pair Temporal Sentence Grounding via Multi-Thread Knowledge Transfer Network
AAAI 2025
9
citations
CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation
ICLR 2025
8
citations
Zeroth-Order Fine-Tuning of LLMs in Random Subspaces
ICCV 2025
6
citations
Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning
ICLR 2025
2
citations
Probabilistic Prototype Calibration of Vision-language Models for Generalized Few-shot Semantic Segmentation
ICCV 2025
1
citations
InceptionNeXt: When Inception Meets ConvNeXt
CVPR 2024
0
citations
Position: Exploring the Robustness of Pipeline-Parallelism-Based Decentralized Training
ICML 2024
0
citations
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
ICML 2024
0
citations
Outlier-Robust Tensor PCA
CVPR 2017
0
citations
Deep Adversarial Subspace Clustering
CVPR 2018
0
citations
MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation
CVPR 2019
0
citations
Context-Aware Biaffine Localizing Network for Temporal Sentence Grounding
CVPR 2021
0
citations
MetaFormer Is Actually What You Need for Vision
CVPR 2022
0
citations
Bandits for Structure Perturbation-Based Black-Box Attacks To Graph Neural Networks With Theoretical Guarantees
CVPR 2022
0
citations
Position-Guided Text Prompt for Vision-Language Pre-Training
CVPR 2023
0
citations
You Can Ground Earlier Than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos
CVPR 2023
0
citations
You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks?
CVPR 2023
0
citations
Jointly Attentive Spatial-Temporal Pooling Networks for Video-Based Person Re-Identification
ICCV 2017
0
citations
STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition
ICCV 2023
0
citations
3DHacker: Spectrum-based Decision Boundary Generation for Hard-label 3D Point Cloud Attack
ICCV 2023
0
citations
Masked Diffusion Transformer is a Strong Image Synthesizer
ICCV 2023
0
citations
Self-Promoted Supervision for Few-Shot Transformer
ECCV 2022
0
citations
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
ECCV 2022
0
citations
Video Graph Transformer for Video Question Answering
ECCV 2022
0
citations
Unsupervised Domain Adaptative Temporal Sentence Localization with Mutual Information Maximization
AAAI 2024
0
citations
Memory-Efficient 4-bit Preconditioned Stochastic Optimization
ICCV 2025
0
citations
Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs
NeurIPS 2025
0
citations
Graph Agent Network: Empowering Nodes with Inference Capabilities for Adversarial Resilience
AAAI 2025
0
citations
Grimm: A Plug-and-Play Perturbation Rectifier for Graph Neural Networks Defending Against Poisoning Attacks
AAAI 2025
0
citations
Collaborative Tree Search for Enhancing Embodied Multi-Agent Collaboration
CVPR 2025
0
citations
What Makes Good Collaborative Views? Contrastive Mutual Information Maximization for Multi-Agent Perception
AAAI 2024
0
citations
Towards Inductive Robustness: Distilling and Fostering Wave-Induced Resonance in Transductive GCNs against Graph Adversarial Attacks
AAAI 2024
0
citations
Fewer Steps, Better Performance: Efficient Cross-Modal Clip Trimming for Video Moment Retrieval Using Language
AAAI 2024
0
citations
Sparse Enhanced Network: An Adversarial Generation Method for Robust Augmentation in Sequential Recommendation
AAAI 2024
0
citations
Few-shot Learner Parameterization by Diffusion Time-steps
CVPR 2024
0
citations
MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning
CVPR 2024
0
citations
Friendly Sharpness-Aware Minimization
CVPR 2024
0
citations
New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity
NeurIPS 2018
0
citations
Efficient Stochastic Gradient Hard Thresholding
NeurIPS 2018
0
citations
Efficient Meta Learning via Minibatch Proximal Update
NeurIPS 2019
0
citations
Improving GAN Training with Probability Ratio Clipping and Sample Reweighting
NeurIPS 2020
0
citations
Theory-Inspired Path-Regularized Differential Network Architecture Search
NeurIPS 2020
0
citations
Towards Theoretically Understanding Why SGD Generalizes Better Than Adam in Deep Learning
NeurIPS 2020
0
citations
A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning
NeurIPS 2021
0
citations
TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness
NeurIPS 2021
0
citations
Towards Understanding Why Lookahead Generalizes Better Than SGD and Beyond
NeurIPS 2021
0
citations
Inception Transformer
NeurIPS 2022
0
citations
ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection
NeurIPS 2023
0
citations
Understanding Generalization and Optimization Performance of Deep CNNs
ICML 2018
0
citations