Tong He

50
Papers
230
Total Citations

Papers (50)

Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting

ECCV 2024
96
citations

Aether: Geometric-Aware Unified World Modeling

ICCV 2025
47
citations

DreamComposer: Controllable 3D Object Generation via Multi-View Conditions

CVPR 2024
19
citations

VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers

ICCV 2025
16
citations

TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation

CVPR 2024
14
citations

Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning

CVPR 2025
11
citations

S4-Driver: Scalable Self-Supervised Driving Multimodal Large Language Model with Spatio-Temporal Visual Representation

CVPR 2025
10
citations

Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation

ICLR 2025
8
citations

Boosting Residual Networks with Group Knowledge

AAAI 2024
6
citations

ProAdvPrompter: A Two-Stage Journey to Effective Adversarial Prompting for LLMs

ICLR 2025
2
citations

GigaGS: 3D Gaussian Based Planar Representation for Large-Scene Surface Reconstruction

AAAI 2025
1
citations

Knowledge Adaptation for Efficient Semantic Segmentation

CVPR 2019
0
citations

GIF2Video: Color Dequantization and Temporal Interpolation of GIF Images

CVPR 2019
0
citations

Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation

CVPR 2019
0
citations

GeoNet: Deep Geodesic Networks for Point Cloud Analysis

CVPR 2019
0
citations

ABCNet: Real-Time Scene Text Spotting With Adaptive Bezier-Curve Network

CVPR 2020
0
citations

DyCo3D: Robust Instance Segmentation of 3D Point Clouds Through Dynamic Convolution

CVPR 2021
0
citations

HCRF-Flow: Scene Flow From Point Clouds With Continuous High-Order CRFs and Position-Aware Flow Embedding

CVPR 2021
0
citations

GD-MAE: Generative Decoder for MAE Pre-Training on LiDAR Point Clouds

CVPR 2023
0
citations

PVT-SSD: Single-Stage 3D Object Detector With Point-Voxel Transformer

CVPR 2023
0
citations

MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling With Informative-Preserved Reconstruction and Self-Distilled Consistency

CVPR 2023
0
citations

Crossing the Gap: Domain Generalization for Image Captioning

CVPR 2023
0
citations

Single Shot Text Detector With Regional Attention

ICCV 2017
0
citations

FCOS: Fully Convolutional One-Stage Object Detection

ICCV 2019
0
citations

Learning Hierarchical Graph Neural Networks for Image Clustering

ICCV 2021
0
citations

ARCH++: Animation-Ready Clothed Human Reconstruction Revisited

ICCV 2021
0
citations

Ponder: Point Cloud Pre-training via Neural Rendering

ICCV 2023
0
citations

Object-Centric Multiple Object Tracking

ICCV 2023
0
citations

Unsupervised Open-Vocabulary Object Localization in Videos

ICCV 2023
0
citations

Coarse-to-Fine Amodal Segmentation with Shape Prior

ICCV 2023
0
citations

Learning and Memorizing Representative Prototypes for 3D Point Cloud Semantic and Instance Segmentation

ECCV 2020
0
citations

Instance-Aware Embedding for Point Cloud Instance Segmentation

ECCV 2020
0
citations

PointInst3D: Segmenting 3D Instances by Points

ECCV 2022
0
citations

PSS: Progressive Sample Selection for Open-World Visual Representation Learning

ECCV 2022
0
citations

Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-centric Representation

ICCV 2023
0
citations

GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving

CVPR 2025
0
citations

EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds

ICCV 2025
0
citations

Frozen CLIP Transformer Is an Efficient Point Cloud Encoder

AAAI 2024
0
citations

Learning for Transductive Threshold Calibration in Open-World Recognition

CVPR 2024
0
citations

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

CVPR 2024
0
citations

Adaptive Slot Attention: Object Discovery with Dynamic Slot Number

CVPR 2024
0
citations

Point Transformer V3: Simpler Faster Stronger

CVPR 2024
0
citations

Sparse Autoencoders, Again?

ICML 2025
0
citations

An End-to-End TextSpotter With Explicit Alignment and Attention

CVPR 2018
0
citations

Bag of Tricks for Image Classification with Convolutional Neural Networks

CVPR 2019
0
citations

Geo-PIFu: Geometry and Pixel Aligned Implicit Functions for Single-view Human Reconstruction

NeurIPS 2020
0
citations

Progressive Coordinate Transforms for Monocular 3D Object Detection

NeurIPS 2021
0
citations

GRIN: Generative Relation and Intention Network for Multi-agent Trajectory Prediction

NeurIPS 2021
0
citations

Self-supervised Amodal Video Object Segmentation

NeurIPS 2022
0
citations

Learning Manifold Dimensions with Conditional Variational Autoencoders

NeurIPS 2022
0
citations