Tong Wu
30
Papers
106
Total Citations
Papers (30)
GPT4Point: A Unified Framework for Point-Language Understanding and Generation
CVPR 2024
62
citations
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations
ICLR 2025
15
citations
FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning
CVPR 2025
11
citations
GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography
ICCV 2025
7
citations
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
NeurIPS 2025
6
citations
Sensing Surface Patches in Volume Rendering for Inferring Signed Distance Functions
AAAI 2025
2
citations
Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data
ICCV 2025
2
citations
ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search
NeurIPS 2025
1
citations
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
CVPR 2024
0
citations
Adversarial Robustness Under Long-Tailed Distribution
CVPR 2021
0
citations
Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation
CVPR 2021
0
citations
Towards Evaluating and Training Verifiably Robust Neural Networks
CVPR 2021
0
citations
OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation
CVPR 2023
0
citations
SLAN: Self-Locator Aided Network for Vision-Language Understanding
ICCV 2023
0
citations
V3Det: Vast Vocabulary Visual Detection Dataset
ICCV 2023
0
citations
Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets
ECCV 2020
0
citations
Caption-Supervised Face Recognition: Training a State-of-the-Art Face Model without Manual Annotation
ECCV 2020
0
citations
Adaptive Spatial-BCE Loss for Weakly Supervised Semantic Segmentation
ECCV 2022
0
citations
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
CVPR 2024
0
citations
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion
CVPR 2025
0
citations
OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
CVPR 2025
0
citations
ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
CVPR 2025
0
citations
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion
ICCV 2025
0
citations
X-Prompt: Generalizable Auto-Regressive Visual Learning with In-Context Prompting
ICCV 2025
0
citations
An Efficient Hybrid Vision Transformer for TinyML Applications
ICCV 2025
0
citations
EventPillars: Pillar-based Efficient Representations for Event Data
AAAI 2025
0
citations
Few-Shot Object Detection via Association and DIscrimination
NeurIPS 2021
0
citations
Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion
NeurIPS 2021
0
citations
A Randomized Approach to Tight Privacy Accounting
NeurIPS 2023
0
citations
AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation
NeurIPS 2023
0
citations