Pan Zhou

57 Papers
230 Total Citations

Papers (57)

Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation

CVPR 2024
65
citations

Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior

CVPR 2024
51
citations

Diffusion Time-step Curriculum for One Image to 3D Generation

CVPR 2024
24
citations

GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding

ICLR 2025
23
citations

Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World

CVPR 2024
11
citations

BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization

NeurIPS 2025
11
citations

Hiding Imperceptible Noise in Curvature-Aware Patches for 3D Point Cloud Attack

ECCV 2024
10
citations

BadToken: Token-level Backdoor Attacks to Multi-modal Large Language Models

CVPR 2025
9
citations

Multi-Pair Temporal Sentence Grounding via Multi-Thread Knowledge Transfer Network

AAAI 2025
9
citations

CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation

ICLR 2025
8
citations

Zeroth-Order Fine-Tuning of LLMs in Random Subspaces

ICCV 2025
6
citations

Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning

ICLR 2025
2
citations

Probabilistic Prototype Calibration of Vision-language Models for Generalized Few-shot Semantic Segmentation

ICCV 2025
1
citations

InceptionNeXt: When Inception Meets ConvNeXt

CVPR 2024
0
citations

Position: Exploring the Robustness of Pipeline-Parallelism-Based Decentralized Training

ICML 2024
0
citations

MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark

ICML 2024
0
citations

Outlier-Robust Tensor PCA

CVPR 2017
0
citations

Deep Adversarial Subspace Clustering

CVPR 2018
0
citations

MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation

CVPR 2019
0
citations

Context-Aware Biaffine Localizing Network for Temporal Sentence Grounding

CVPR 2021
0
citations

MetaFormer Is Actually What You Need for Vision

CVPR 2022
0
citations

Bandits for Structure Perturbation-Based Black-Box Attacks To Graph Neural Networks With Theoretical Guarantees

CVPR 2022
0
citations

Position-Guided Text Prompt for Vision-Language Pre-Training

CVPR 2023
0
citations

You Can Ground Earlier Than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos

CVPR 2023
0
citations

You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks?

CVPR 2023
0
citations

Jointly Attentive Spatial-Temporal Pooling Networks for Video-Based Person Re-Identification

ICCV 2017
0
citations

STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition

ICCV 2023
0
citations

3DHacker: Spectrum-based Decision Boundary Generation for Hard-label 3D Point Cloud Attack

ICCV 2023
0
citations

Masked Diffusion Transformer is a Strong Image Synthesizer

ICCV 2023
0
citations

Self-Promoted Supervision for Few-Shot Transformer

ECCV 2022
0
citations

DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition

ECCV 2022
0
citations

Video Graph Transformer for Video Question Answering

ECCV 2022
0
citations

Unsupervised Domain Adaptative Temporal Sentence Localization with Mutual Information Maximization

AAAI 2024
0
citations

Memory-Efficient 4-bit Preconditioned Stochastic Optimization

ICCV 2025
0
citations

Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs

NeurIPS 2025
0
citations

Graph Agent Network: Empowering Nodes with Inference Capabilities for Adversarial Resilience

AAAI 2025
0
citations

Grimm: A Plug-and-Play Perturbation Rectifier for Graph Neural Networks Defending Against Poisoning Attacks

AAAI 2025
0
citations

Collaborative Tree Search for Enhancing Embodied Multi-Agent Collaboration

CVPR 2025
0
citations

What Makes Good Collaborative Views? Contrastive Mutual Information Maximization for Multi-Agent Perception

AAAI 2024
0
citations

Towards Inductive Robustness: Distilling and Fostering Wave-Induced Resonance in Transductive GCNs against Graph Adversarial Attacks

AAAI 2024
0
citations

Fewer Steps, Better Performance: Efficient Cross-Modal Clip Trimming for Video Moment Retrieval Using Language

AAAI 2024
0
citations

Sparse Enhanced Network: An Adversarial Generation Method for Robust Augmentation in Sequential Recommendation

AAAI 2024
0
citations

Few-shot Learner Parameterization by Diffusion Time-steps

CVPR 2024
0
citations

MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning

CVPR 2024
0
citations

Friendly Sharpness-Aware Minimization

CVPR 2024
0
citations

New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity

NeurIPS 2018
0
citations

Efficient Stochastic Gradient Hard Thresholding

NeurIPS 2018
0
citations

Efficient Meta Learning via Minibatch Proximal Update

NeurIPS 2019
0
citations

Improving GAN Training with Probability Ratio Clipping and Sample Reweighting

NeurIPS 2020
0
citations

Theory-Inspired Path-Regularized Differential Network Architecture Search

NeurIPS 2020
0
citations

Towards Theoretically Understanding Why SGD Generalizes Better Than Adam in Deep Learning

NeurIPS 2020
0
citations

A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning

NeurIPS 2021
0
citations

TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness

NeurIPS 2021
0
citations

Towards Understanding Why Lookahead Generalizes Better Than SGD and Beyond

NeurIPS 2021
0
citations

Inception Transformer

NeurIPS 2022
0
citations

ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection

NeurIPS 2023
0
citations

Understanding Generalization and Optimization Performance of Deep CNNs

ICML 2018
0
citations