Jiajun Wu
106
Papers
3,708
Total Citations
1
Affiliations
Stanford University
Papers (106)
Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling
NeurIPS 2016 (arXiv)
2,081
citations
MarrNet: 3D Shape Reconstruction via 2.5D Sketches
NeurIPS 2017 (arXiv)
435
citations
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
NeurIPS 2016 (arXiv)
417
citations
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding
CVPR 2024
192
citations
Self-Supervised Intrinsic Image Decomposition
NeurIPS 2017 (arXiv)
141
citations
WonderWorld: Interactive 3D Scene Generation from a Single Image
CVPR 2025
120
citations
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image
CVPR 2024
85
citations
LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models
CVPR 2025
44
citations
Learning the 3D Fauna of the Web
CVPR 2024
42
citations
Re-thinking Temporal Search for Long-Form Video Understanding
CVPR 2025
36
citations
Shape and Material from Sound
NeurIPS 2017
33
citations
The Scene Language: Representing Scenes with Programs, Words, and Embeddings
CVPR 2025
15
citations
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
CVPR 2024
14
citations
Language-Informed Visual Concept Learning
ICLR 2024
12
citations
FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video
CVPR 2025
11
citations
Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos
ECCV 2024
10
citations
Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners
CVPR 2024
9
citations
Birth and Death of a Rose
CVPR 2025
5
citations
PGC: Physics-Based Gaussian Cloth from a Single Pose
CVPR 2025
3
citations
Taming generative video models for zero-shot optical flow extraction
NeurIPS 2025
3
citations
Perspective Plane Program Induction From a Single Image
CVPR 2020 (arXiv)
0
citations
End-to-End Optimization of Scene Layout
CVPR 2020 (arXiv)
0
citations
Probabilistic Video Prediction From Noisy Data With a Posterior Confidence
CVPR 2020
0
citations
Hierarchical Motion Understanding via Motion Programs
CVPR 2021 (arXiv)
0
citations
Repopulating Street Scenes
CVPR 2021 (arXiv)
0
citations
KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control
CVPR 2021 (arXiv)
0
citations
Pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
CVPR 2021
0
citations
De-Rendering the World's Revolutionary Artefacts
CVPR 2021
0
citations
Rotationally Equivariant 3D Object Detection
CVPR 2022 (arXiv)
0
citations
ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer
CVPR 2022
0
citations
Programmatic Concept Learning for Human Motion Description and Synthesis
CVPR 2022
0
citations
Revisiting the "Video" in Video-Language Understanding
CVPR 2022
0
citations
Ego-Body Pose Estimation via Ego-Head Pose Estimation
CVPR 2023 (arXiv)
0
citations
NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations
CVPR 2023 (arXiv)
0
citations
Multi-Object Manipulation via Object-Centric Neural Scattering Functions
CVPR 2023
0
citations
Seeing a Rose in Five Thousand Ways
CVPR 2023 (arXiv)
0
citations
Putting People in Their Place: Affordance-Aware Human Insertion Into Scenes
CVPR 2023 (arXiv)
0
citations
3D Neural Field Generation Using Triplane Diffusion
CVPR 2023 (arXiv)
0
citations
RealImpact: A Dataset of Impact Sound Fields for Real Objects
CVPR 2023
0
citations
Accidental Light Probes
CVPR 2023 (arXiv)
0
citations
ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding
CVPR 2023
0
citations
The ObjectFolder Benchmark: Multisensory Learning With Neural and Real Objects
CVPR 2023
0
citations
CIRCLE: Capture in Rich Contextual Environments
CVPR 2023
0
citations
PyPose: A Library for Robot Learning With Physics-Based Optimization
CVPR 2023 (arXiv)
0
citations
Generative Modeling of Audible Shapes for Object Perception
ICCV 2017
0
citations
Raster-To-Vector: Revisiting Floorplan Transformation
ICCV 2017
0
citations
Program-Guided Image Manipulators
ICCV 2019
0
citations
Neural Radiance Flow for 4D View Synthesis and Video Processing
ICCV 2021 (arXiv)
0
citations
3D Shape Generation and Completion Through Point-Voxel Diffusion
ICCV 2021 (arXiv)
0
citations
Learning Temporal Dynamics From Cycles in Narrated Video
ICCV 2021 (arXiv)
0
citations
VQ3D: Learning a 3D-Aware Generative Model on ImageNet
ICCV 2023 (arXiv)
0
citations
Tree-Structured Shading Decomposition
ICCV 2023 (arXiv)
0
citations
Rendering Humans from Object-Occluded Monocular Videos
ICCV 2023 (arXiv)
0
citations
Video Extrapolation in Space and Time
ECCV 2022
0
citations
Unsupervised Segmentation in Real-World Images via Spelke Object Inference
ECCV 2022
0
citations
Translating a Visual LEGO Manual to a Machine-Executable Plan
ECCV 2022
0
citations
Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning
NeurIPS 2015
0
citations
Learning to See Physics via Visual De-animation
NeurIPS 2017
0
citations
WonderJourney: Going from Anywhere to Everywhere
CVPR 2024
0
citations
Digital Twin Catalog: A Large-Scale Photorealistic 3D Object Digital Twin Dataset
CVPR 2025
0
citations
Lifting Motion to the 3D World via 2D Diffusion
CVPR 2025
0
citations
Category-Agnostic Neural Object Rigging
CVPR 2025
0
citations
Diffusion Self-Distillation for Zero-Shot Customized Image Generation
CVPR 2025
0
citations
X-Capture: An Open-Source Portable Device for Multi-Sensory Learning
ICCV 2025
0
citations
Weakly-Supervised Learning of Dense Functional Correspondences
ICCV 2025
0
citations
WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions
ICCV 2025
0
citations
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization
ICCV 2025
0
citations
WorldScore: Unified Evaluation Benchmark for World Generation
ICCV 2025
0
citations
HVAdam: A Full-Dimension Adaptive Optimizer
AAAI 2025
0
citations
SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing
AAAI 2024
0
citations
Hearing Anything Anywhere
CVPR 2024
0
citations
Holodeck: Language Guided Generation of 3D Embodied AI Environments
CVPR 2024
0
citations
Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning
ICML 2024
0
citations
Deep Multiple Instance Learning for Image Classification and Auto-Annotation
CVPR 2015
0
citations
Neural Scene De-Rendering
CVPR 2017
0
citations
Synthesizing 3D Shapes via Modeling Multi-View Depth Maps and Silhouettes With Deep Generative Networks
CVPR 2017
0
citations
Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling
CVPR 2018 (arXiv)
0
citations
Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification
CVPR 2018 (arXiv)
0
citations
Learning to Reconstruct Shapes from Unseen Classes
NeurIPS 2018
0
citations
Learning to Exploit Stability for 3D Scene Parsing
NeurIPS 2018
0
citations
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
NeurIPS 2018
0
citations
3D-Aware Scene Manipulation via Inverse Graphics
NeurIPS 2018
0
citations
Visual Object Networks: Image Generation with Disentangled 3D Representations
NeurIPS 2018
0
citations
Visual Concept-Metaconcept Learning
NeurIPS 2019
0
citations
Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations
NeurIPS 2019
0
citations
Learning Physical Graph Representations from Visual Scenes
NeurIPS 2020
0
citations
Multi-Plane Program Induction with 3D Box Priors
NeurIPS 2020
0
citations
Grammar-Based Grounded Lexicon Learning
NeurIPS 2021
0
citations
MOMA-LRG: Language-Refined Graphs for Multi-Object Multi-Actor Activity Parsing
NeurIPS 2022
0
citations
CLEVRER-Humans: Describing Physical and Causal Events the Human Way
NeurIPS 2022
0
citations
E-MAPP: Efficient Multi-Agent Reinforcement Learning with Parallel Program Guidance
NeurIPS 2022
0
citations
Interaction Modeling with Multiplex Attention
NeurIPS 2022
0
citations
IKEA-Manual: Seeing Shape Assembly Step by Step
NeurIPS 2022
0
citations
Unsupervised Learning of Shape Programs with Repeatable Implicit Parts
NeurIPS 2022
0
citations
Geoclidean: Few-Shot Generalization in Euclidean Geometry
NeurIPS 2022
0
citations
Model-Based Control with Sparse Neural Dynamics
NeurIPS 2023
0
citations
3D Copy-Paste: Physically Plausible Object Insertion for Monocular 3D Detection
NeurIPS 2023
0
citations
What’s Left? Concept Grounding with Logic-Enhanced Foundation Models
NeurIPS 2023
0
citations
Siamese Masked Autoencoders
NeurIPS 2023
0
citations
Are These the Same Apple? Comparing Images Based on Object Intrinsics
NeurIPS 2023
0
citations
Disentanglement via Latent Quantization
NeurIPS 2023
0
citations
Stanford-ORB: A Real-World 3D Object Inverse Rendering Benchmark
NeurIPS 2023
0
citations
SoundCam: A Dataset for Finding Humans Using Room Acoustics
NeurIPS 2023
0
citations
Inferring Hybrid Neural Fluid Fields from Videos
NeurIPS 2023
0
citations
Holistic Evaluation of Text-to-Image Models
NeurIPS 2023
0
citations
Neurally-Guided Structure Inference
ICML 2019
0
citations