Jingdong Wang

28
Papers
193
Total Citations

Papers (28)

OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation

CVPR 2025
40
citations

GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding

CVPR 2024
28
citations

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation

AAAI 2024
26
citations

SEED: A Simple and Effective 3D DETR in Point Clouds

ECCV 2024
19
citations

Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection

CVPR 2024
19
citations

OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection

ECCV 2024
12
citations

Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation

AAAI 2024
11
citations

A Multimodal, Multi-Task Adapting Framework for Video Action Recognition

AAAI 2024
8
citations

Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model

CVPR 2025
8
citations

Interpretable Face Anti-Spoofing: Enhancing Generalization with Multimodal Large Language Models

AAAI 2025
6
citations

SpotActor: Training-Free Layout-Controlled Consistent Image Generation

AAAI 2025
6
citations

Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression

ECCV 2024
4
citations

Action Detail Matters: Refining Video Recognition with Local Action Queries

CVPR 2025
3
citations

AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers

CVPR 2025
3
citations

VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction

CVPR 2025
0
citations

Multi-Domain Incremental Learning for Face Presentation Attack Detection

AAAI 2024
0
citations

Are Images Indistinguishable to Humans Also Indistinguishable to Classifiers?

CVPR 2025
0
citations

VRP-SAM: SAM with Visual Reference Prompt

CVPR 2024
0
citations

Learning to Rematch Mismatched Pairs for Robust Cross-Modal Retrieval

CVPR 2024
0
citations

Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection

CVPR 2024
0
citations

MS-DETR: Efficient DETR Training with Mixed Supervision

CVPR 2024
0
citations

BEVSpread: Spread Voxel Pooling for Bird’s-Eye-View Representation in Vision-based Roadside 3D Object Detection

CVPR 2024
0
citations

Low-Biased General Annotated Dataset Generation

CVPR 2025
0
citations

Towards Unified Multi-granularity Text Detection with Interactive Attention

ICML 2024
0
citations

Continual SFT Matches Multimodal RLHF with Negative Supervision

CVPR 2025
0
citations

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

CVPR 2025
0
citations

Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers

ICML 2024
0
citations

TexGarment: Consistent Garment UV Texture Generation via Efficient 3D Structure-Guided Diffusion Transformer

CVPR 2025
0
citations