Gang Yu
37
Papers
257
Total Citations
Papers (37)
Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models
CVPR 2024
108
citations
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
ICLR 2025, arXiv
101
citations
KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models
NeurIPS 2025
23
citations
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
CVPR 2025, arXiv
19
citations
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models
CVPR 2025, arXiv
6
citations
Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network
CVPR 2017, arXiv
0
citations
Learning a Discriminative Feature Network for Semantic Segmentation
CVPR 2018, arXiv
0
citations
MegDet: A Large Mini-Batch Object Detector
CVPR 2018, arXiv
0
citations
Cascaded Pyramid Network for Multi-Person Pose Estimation
CVPR 2018, arXiv
0
citations
Modeling Local Geometric Structure of 3D Point Clouds Using Geo-CNN
CVPR 2019
0
citations
An End-To-End Network for Panoptic Segmentation
CVPR 2019
0
citations
Shape Robust Text Detection With Progressive Scale Expansion Network
CVPR 2019
0
citations
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection
CVPR 2019
0
citations
State-Aware Tracker for Real-Time Video Object Segmentation
CVPR 2020, arXiv
0
citations
Context Prior for Scene Segmentation
CVPR 2020, arXiv
0
citations
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
CVPR 2022, arXiv
0
citations
Executing Your Commands via Motion Diffusion in Latent Space
CVPR 2023, arXiv
0
citations
End-to-End 3D Dense Captioning With Vote2Cap-DETR
CVPR 2023, arXiv
0
citations
STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection
CVPR 2023
0
citations
ThunderNet: Towards Real-Time Generic Object Detection on Mobile Devices
ICCV 2019
0
citations
Objects365: A Large-Scale, High-Quality Dataset for Object Detection
ICCV 2019
0
citations
Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network
ICCV 2019
0
citations
Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image
ICCV 2023, arXiv
0
citations
Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering
ICCV 2023, arXiv
0
citations
A Large-Scale Outdoor Multi-Modal Dataset and Benchmark for Novel View Synthesis and Implicit Scene Reconstruction
ICCV 2023, arXiv
0
citations
D&D: Learning Human Dynamics from Dynamic Camera
ECCV 2022
0
citations
High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification
CVPR 2020, arXiv
0
citations
PM-INR: Prior-Rich Multi-Modal Implicit Large-Scale Scene Neural Representation
AAAI 2024
0
citations
IT3D: Improved Text-to-3D Generation with Explicit View Synthesis
AAAI 2024, arXiv
0
citations
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
CVPR 2024
0
citations
Fast Action Proposals for Human Action Detection and Search
CVPR 2015
0
citations
Learnable Tree Filter for Structure-preserving Feature Transform
NeurIPS 2019
0
citations
Coordinates Are NOT Lonely - Codebook Prior Helps Implicit Neural 3D Representations
NeurIPS 2022
0
citations
Hierarchical Normalization for Robust Monocular Depth Estimation
NeurIPS 2022
0
citations
PDF: Point Diffusion Implicit Function for Large-scale Scene Neural Representation
NeurIPS 2023
0
citations
MotionGPT: Human Motion as a Foreign Language
NeurIPS 2023
0
citations
Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation
NeurIPS 2023
0
citations