Dan Xu

52 Papers
660 Total Citations

Papers (52)

GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting

CVPR 2024
359
citations

Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction

NeurIPS 2017 (arXiv)
132
citations

DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection

CVPR 2024
45
citations

Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal

ECCV 2024
33
citations

Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors

CVPR 2024
28
citations

Interactive3D: Create What You Want by Interactive 3D Generation

CVPR 2024
16
citations

CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs

CVPR 2024
13
citations

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

ICCV 2025
10
citations

From One to More: Contextual Part Latents for 3D Generation

ICCV 2025 (arXiv)
8
citations

Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation

CVPR 2025
7
citations

Efficient Multitask Dense Predictor via Binarization

CVPR 2024
6
citations

Vision-aware Multimodal Prompt Tuning for Uploadable Multi-source Few-shot Domain Adaptation

AAAI 2025
2
citations

Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning

ICCV 2025
1
citations

PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing

CVPR 2018 (arXiv)
0
citations

Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation

CVPR 2018 (arXiv)
0
citations

Every Smile Is Unique: Landmark-Guided Diverse Smile Generation

CVPR 2018 (arXiv)
0
citations

Group Consistent Similarity Learning via Deep CRF for Person Re-Identification

CVPR 2018
0
citations

Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation

CVPR 2019
0
citations

Dynamic Graph Message Passing Networks

CVPR 2020 (arXiv)
0
citations

Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

CVPR 2020 (arXiv)
0
citations

Learning Parallel Dense Correspondence From Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction

CVPR 2021 (arXiv)
0
citations

Delving Into Localization Errors for Monocular 3D Object Detection

CVPR 2021 (arXiv)
0
citations

Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation

CVPR 2022 (arXiv)
0
citations

Depth-Aware Generative Adversarial Network for Talking Head Video Generation

CVPR 2022 (arXiv)
0
citations

Generalized Binary Search Network for Highly-Efficient Multi-View Stereo

CVPR 2022 (arXiv)
0
citations

Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization

CVPR 2023
0
citations

DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-Training via Word-Region Alignment

CVPR 2023 (arXiv)
0
citations

Free-viewpoint Human Animation with Pose-correlated Reference Selection

CVPR 2025
0
citations

Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM

ICCV 2019
0
citations

Leveraging Auxiliary Tasks With Affinity Learning for Weakly Supervised Semantic Segmentation

ICCV 2021 (arXiv)
0
citations

SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks

ICCV 2021
0
citations

Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation

ICCV 2023 (arXiv)
0
citations

TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts

ICCV 2023 (arXiv)
0
citations

Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis

ICCV 2023 (arXiv)
0
citations

Network Binarization via Contrastive Learning

ECCV 2022
0
citations

Lipschitz Continuity Retained Binary Neural Network

ECCV 2022
0
citations

Inverted Pyramid Multi-task Transformer for Dense Scene Understanding

ECCV 2022
0
citations

Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection

ICCV 2019
0
citations

GaussHDR: High Dynamic Range Gaussian Splatting via Learning Unified 3D and 2D Local Tone Mapping

CVPR 2025
0
citations

Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs

CVPR 2025
0
citations

Flow-NeRF: Joint Learning of Geometry, Poses, and Dense Flow within Unified Neural Representations

CVPR 2025
0
citations

DiGA3D: Coarse-to-Fine Diffusional Propagation of Geometry and Appearance for Versatile 3D Inpainting

ICCV 2025
0
citations

Multi-Attribute Multi-Grained Adaptation of Pre-Trained Language Models for Text Understanding from Bayesian Perspective

AAAI 2025
0
citations

Personalized LoRA for Human-Centered Text Understanding

AAAI 2024
0
citations

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data

CVPR 2024
0
citations

Implicit Event-RGBD Neural SLAM

CVPR 2024
0
citations

UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation

ICML 2025
0
citations

Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation

CVPR 2017 (arXiv)
0
citations

Learning Cross-Modal Deep Representations for Robust Pedestrian Detection

CVPR 2017 (arXiv)
0
citations

Viraliency: Pooling Local Virality

CVPR 2017 (arXiv)
0
citations

DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection

NeurIPS 2022
0
citations

CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection

NeurIPS 2023
0
citations