Sanja Fidler

120 Papers
458 Total Citations

Papers (120)

GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control

CVPR 2025 (arXiv)
138
citations

XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies

CVPR 2024
127
citations

Proximal Deep Structured Models

NeurIPS 2016
88
citations

DIFIX3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

CVPR 2025
59
citations

Teaching Machines to Describe Images with Natural Language Feedback

NeurIPS 2017 (arXiv)
46
citations

3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features

CVPR 2024
0
citations

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

CVPR 2024
0
citations

Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

CVPR 2024
0
citations

Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

ICML 2024
0
citations

Neuroaesthetics in Fashion: Modeling the Perception of Fashionability

CVPR 2015
0
citations

Real-Time Coarse-to-Fine Topologically Preserving Segmentation

CVPR 2015
0
citations

Rent3D: Floor-Plan Priors for Monocular Layout Estimation

CVPR 2015
0
citations

Holistic 3D Scene Understanding From a Single Geo-Tagged Image

CVPR 2015
0
citations

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

CVPR 2015
0
citations

Instance-Level Segmentation for Autonomous Driving With Deep Densely Connected MRFs

CVPR 2016
0
citations

Monocular 3D Object Detection for Autonomous Driving

CVPR 2016
0
citations

HD Maps: Fine-Grained Road Segmentation by Parsing Ground and Aerial Images

CVPR 2016
0
citations

MovieQA: Understanding Stories in Movies Through Question-Answering

CVPR 2016
0
citations

Scene Parsing Through ADE20K Dataset

CVPR 2017
0
citations

Sports Field Localization via Deep Structured Models

CVPR 2017
0
citations

Annotating Object Instances With a Polygon-RNN

CVPR 2017 (arXiv)
0
citations

Efficient Interactive Annotation of Segmentation Datasets With Polygon-RNN++

CVPR 2018 (arXiv)
0
citations

Learning to Act Properly: Predicting and Explaining Affordances From Images

CVPR 2018 (arXiv)
0
citations

SurfConv: Bridging 3D and 2D Convolution for RGBD Images

CVPR 2018 (arXiv)
0
citations

A Face-to-Face Neural Conversation Model

CVPR 2018 (arXiv)
0
citations

Now You Shake Me: Towards Automatic 4D Cinema

CVPR 2018
0
citations

VirtualHome: Simulating Household Activities via Programs

CVPR 2018 (arXiv)
0
citations

Can Large Vision-Language Models Correct Semantic Grounding Errors By Themselves?

CVPR 2025
0
citations

Fast Interactive Object Annotation With Curve-GCN

CVPR 2019 (arXiv)
0
citations

Creative Flow+ Dataset

CVPR 2019
0
citations

Synthesizing Environment-Aware Activities via Activity Sketches

CVPR 2019
0
citations

DARNet: Deep Active Ray Network for Building Segmentation

CVPR 2019
0
citations

Object Instance Annotation With Deep Extreme Level Set Evolution

CVPR 2019
0
citations

Action Recognition From Single Timestamp Supervision in Untrimmed Videos

CVPR 2019
0
citations

Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations

CVPR 2019
0
citations

Learning to Simulate Dynamic Environments With GameGAN

CVPR 2020 (arXiv)
0
citations

Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data

CVPR 2020 (arXiv)
0
citations

Auto-Tuning Structured Light by Optical Stochastic Gradient Descent

CVPR 2020
0
citations

Learning to Evaluate Perception Models Using Planner-Centric Metrics

CVPR 2020 (arXiv)
0
citations

DriveGAN: Towards a Controllable High-Quality Neural Simulation

CVPR 2021 (arXiv)
0
citations

DatasetGAN: Efficient Labeled Data Factory With Minimal Human Effort

CVPR 2021 (arXiv)
0
citations

Semantic Segmentation With Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization

CVPR 2021 (arXiv)
0
citations

Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets

CVPR 2021 (arXiv)
0
citations

Neural Geometric Level of Detail: Real-Time Rendering With Implicit 3D Shapes

CVPR 2021 (arXiv)
0
citations

Neural Parts: Learning Expressive 3D Shape Abstractions With Invertible Neural Networks

CVPR 2021 (arXiv)
0
citations

Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior

CVPR 2022 (arXiv)
0
citations

AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis

CVPR 2022
0
citations

Neural Fields As Learnable Kernels for 3D Reconstruction

CVPR 2022 (arXiv)
0
citations

Extracting Triangular 3D Models, Materials, and Lighting From Images

CVPR 2022 (arXiv)
0
citations

Frame Averaging for Equivariant Shape Space Learning

CVPR 2022 (arXiv)
0
citations

BigDatasetGAN: Synthesizing ImageNet With Pixel-Wise Annotations

CVPR 2022 (arXiv)
0
citations

Polymorphic-GAN: Generating Aligned Samples Across Multiple Domains With Learned Morph Maps

CVPR 2022
0
citations

How Much More Data Do I Need? Estimating Requirements for Downstream Tasks

CVPR 2022
0
citations

Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion

CVPR 2023 (arXiv)
0
citations

NeuralField-LDM: Scene Generation With Hierarchical Latent Diffusion Models

CVPR 2023
0
citations

Magic3D: High-Resolution Text-to-3D Content Creation

CVPR 2023 (arXiv)
0
citations

Neural Kernel Surface Reconstruction

CVPR 2023
0
citations

VoxFormer: Sparse Voxel Transformer for Camera-Based 3D Semantic Scene Completion

CVPR 2023 (arXiv)
0
citations

Neural Fields Meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes

CVPR 2023 (arXiv)
0
citations

Align Your Latents: High-Resolution Video Synthesis With Latent Diffusion Models

CVPR 2023 (arXiv)
0
citations

Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books

ICCV 2015
0
citations

Learning to Combine Mid-Level Cues for Object Proposal Generation

ICCV 2015
0
citations

Enhancing Road Maps by Parsing Aerial Images Around the World

ICCV 2015
0
citations

Monocular Object Instance Segmentation and Depth Ordering With CNNs

ICCV 2015
0
citations

Lost Shopping! Monocular Localization in Large Indoor Spaces

ICCV 2015
0
citations

Predicting Deep Zero-Shot Convolutional Neural Networks Using Textual Descriptions

ICCV 2015
0
citations

Be Your Own Prada: Fashion Synthesis With Structural Coherence

ICCV 2017 (arXiv)
0
citations

Open Vocabulary Scene Parsing

ICCV 2017 (arXiv)
0
citations

Towards Diverse and Natural Image Descriptions via a Conditional GAN

ICCV 2017 (arXiv)
0
citations

TorontoCity: Seeing the World With a Million Eyes

ICCV 2017 (arXiv)
0
citations

SGN: Sequential Grouping Networks for Instance Segmentation

ICCV 2017
0
citations

Situation Recognition With Graph Neural Networks

ICCV 2017 (arXiv)
0
citations

3D Graph Neural Networks for RGBD Semantic Segmentation

ICCV 2017
0
citations

DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation

ICCV 2019
0
citations

Neural Turtle Graphics for Modeling City Road Layouts

ICCV 2019
0
citations

Meta-Sim: Learning to Generate Synthetic Datasets

ICCV 2019
0
citations

Video Face Clustering With Unknown Number of Clusters

ICCV 2019
0
citations

Gated-SCNN: Gated Shape CNNs for Semantic Segmentation

ICCV 2019
0
citations

Learning to Caption Images Through a Lifetime by Asking Questions

ICCV 2019
0
citations

Learning Indoor Inverse Rendering With 3D Spatially-Varying Lighting

ICCV 2021 (arXiv)
0
citations

Physics-Based Human Motion Estimation and Synthesis From Videos

ICCV 2021 (arXiv)
0
citations

3DStyleNet: Creating 3D Shapes With Geometric and Texture Style Variations

ICCV 2021 (arXiv)
0
citations

TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models

ICCV 2023
0
citations

ATT3D: Amortized Text-to-3D Object Synthesis

ICCV 2023 (arXiv)
0
citations

Neural LiDAR Fields for Novel View Synthesis

ICCV 2023 (arXiv)
0
citations

End-to-end 3D Tracking with Decoupled Queries

ICCV 2023
0
citations

DreamTeacher: Pretraining Image Backbones with Deep Generative Models

ICCV 2023 (arXiv)
0
citations

Towards Viewpoint Robustness in Bird's Eye View Segmentation

ICCV 2023
0
citations

Learning Human Dynamics in Autonomous Driving Scenarios

ICCV 2023
0
citations

Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid

ECCV 2020
0
citations

Expressive Telepresence via Modular Codec Avatars

ECCV 2020
0
citations

ScribbleBox: Interactive Annotation Framework for Video Object Segmentation

ECCV 2020
0
citations

Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

ECCV 2020
0
citations

Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation

ECCV 2020
0
citations

Interactive Annotation of 3D Object Geometry using 2D Scribbles

ECCV 2020
0
citations

Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion

ECCV 2022
0
citations

MvDeCor: Multi-View Dense Correspondence Learning for Fine-Grained 3D Segmentation

ECCV 2022
0
citations

3D Object Proposals for Accurate Object Class Detection

NeurIPS 2015
0
citations

Skip-Thought Vectors

NeurIPS 2015 (arXiv)
0
citations

MovieGraphs: Towards Understanding Human-Centric Situations From Videos

CVPR 2018 (arXiv)
0
citations

Diffusion Renderer: Neural Inverse and Forward Rendering with Video Diffusion Models

CVPR 2025
0
citations

PartField: Learning 3D Feature Fields for Part Segmentation and Beyond

ICCV 2025
0
citations

Controllable Weather Synthesis and Removal with Video Diffusion Models

ICCV 2025
0
citations

InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models

ICCV 2025
0
citations

A Neural Compositional Paradigm for Image Captioning

NeurIPS 2018
0
citations

Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer

NeurIPS 2019
0
citations

Learning Deformable Tetrahedral Meshes for 3D Reconstruction

NeurIPS 2020
0
citations

Variational Amodal Object Completion

NeurIPS 2020
0
citations

Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation

NeurIPS 2021
0
citations

Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis

NeurIPS 2021
0
citations

Scalable Neural Data Server: A Data Recommender for Transfer Learning

NeurIPS 2021
0
citations

ATISS: Autoregressive Transformers for Indoor Scene Synthesis

NeurIPS 2021
0
citations

Don’t Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

NeurIPS 2021
0
citations

EditGAN: High-Precision Semantic Image Editing

NeurIPS 2021
0
citations

DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer

NeurIPS 2021
0
citations

LION: Latent Point Diffusion Models for 3D Shape Generation

NeurIPS 2022
0
citations

EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations

NeurIPS 2022
0
citations

Optimizing Data Collection for Machine Learning

NeurIPS 2022
0
citations

GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images

NeurIPS 2022
0
citations

EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis

ICML 2019
0
citations