Tong Wu

30
Papers
106
Total Citations

Papers (30)

GPT4Point: A Unified Framework for Point-Language Understanding and Generation

CVPR 2024
62
citations

IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations

ICLR 2025
15
citations

FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning

CVPR 2025
11
citations

GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography

ICCV 2025
7
citations

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

NeurIPS 2025
6
citations

Sensing Surface Patches in Volume Rendering for Inferring Signed Distance Functions

AAAI 2025
2
citations

Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data

ICCV 2025
2
citations

ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search

NeurIPS 2025
1
citation

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

CVPR 2024
0
citations

Adversarial Robustness Under Long-Tailed Distribution

CVPR 2021, arXiv
0
citations

Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation

CVPR 2021
0
citations

Towards Evaluating and Training Verifiably Robust Neural Networks

CVPR 2021, arXiv
0
citations

OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation

CVPR 2023, arXiv
0
citations

SLAN: Self-Locator Aided Network for Vision-Language Understanding

ICCV 2023
0
citations

V3Det: Vast Vocabulary Visual Detection Dataset

ICCV 2023, arXiv
0
citations

Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets

ECCV 2020
0
citations

Caption-Supervised Face Recognition: Training a State-of-the-Art Face Model without Manual Annotation

ECCV 2020
0
citations

Adaptive Spatial-BCE Loss for Weakly Supervised Semantic Segmentation

ECCV 2022
0
citations

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation

CVPR 2024
0
citations

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

CVPR 2025
0
citations

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

CVPR 2025
0
citations

ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way

CVPR 2025
0
citations

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

ICCV 2025
0
citations

X-Prompt: Generalizable Auto-Regressive Visual Learning with In-Context Prompting

ICCV 2025
0
citations

An Efficient Hybrid Vision Transformer for TinyML Applications

ICCV 2025
0
citations

EventPillars: Pillar-based Efficient Representations for Event Data

AAAI 2025
0
citations

Few-Shot Object Detection via Association and DIscrimination

NeurIPS 2021
0
citations

Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion

NeurIPS 2021
0
citations

A Randomized Approach to Tight Privacy Accounting

NeurIPS 2023
0
citations

AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation

NeurIPS 2023
0
citations