Sanja Fidler

120 Papers

458 Total Citations

Papers (120)

GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control

XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies

Proximal Deep Structured Models

DIFIX3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Teaching Machines to Describe Images with Natural Language Feedback

NeurIPS 2017 (arXiv)

3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

Neuroaesthetics in Fashion: Modeling the Perception of Fashionability

Real-Time Coarse-to-Fine Topologically Preserving Segmentation

Rent3D: Floor-Plan Priors for Monocular Layout Estimation

Holistic 3D Scene Understanding From a Single Geo-Tagged Image

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

Instance-Level Segmentation for Autonomous Driving With Deep Densely Connected MRFs

Monocular 3D Object Detection for Autonomous Driving

HD Maps: Fine-Grained Road Segmentation by Parsing Ground and Aerial Images

MovieQA: Understanding Stories in Movies Through Question-Answering

Scene Parsing Through ADE20K Dataset

Sports Field Localization via Deep Structured Models

Annotating Object Instances With a Polygon-RNN

Efficient Interactive Annotation of Segmentation Datasets With Polygon-RNN++

Learning to Act Properly: Predicting and Explaining Affordances From Images

SurfConv: Bridging 3D and 2D Convolution for RGBD Images

A Face-to-Face Neural Conversation Model

Now You Shake Me: Towards Automatic 4D Cinema

VirtualHome: Simulating Household Activities via Programs

Can Large Vision-Language Models Correct Semantic Grounding Errors By Themselves?

Fast Interactive Object Annotation With Curve-GCN

Creative Flow+ Dataset

Synthesizing Environment-Aware Activities via Activity Sketches

DARNet: Deep Active Ray Network for Building Segmentation

Object Instance Annotation With Deep Extreme Level Set Evolution

Action Recognition From Single Timestamp Supervision in Untrimmed Videos

Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations

Learning to Simulate Dynamic Environments With GameGAN

Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data

Auto-Tuning Structured Light by Optical Stochastic Gradient Descent

Learning to Evaluate Perception Models Using Planner-Centric Metrics

DriveGAN: Towards a Controllable High-Quality Neural Simulation

DatasetGAN: Efficient Labeled Data Factory With Minimal Human Effort

Semantic Segmentation With Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization

Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets

Neural Geometric Level of Detail: Real-Time Rendering With Implicit 3D Shapes

Neural Parts: Learning Expressive 3D Shape Abstractions With Invertible Neural Networks

Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior

AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis

Neural Fields As Learnable Kernels for 3D Reconstruction

Extracting Triangular 3D Models, Materials, and Lighting From Images

Frame Averaging for Equivariant Shape Space Learning

BigDatasetGAN: Synthesizing ImageNet With Pixel-Wise Annotations

Polymorphic-GAN: Generating Aligned Samples Across Multiple Domains With Learned Morph Maps

How Much More Data Do I Need? Estimating Requirements for Downstream Tasks

Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion

NeuralField-LDM: Scene Generation With Hierarchical Latent Diffusion Models

Magic3D: High-Resolution Text-to-3D Content Creation

Neural Kernel Surface Reconstruction

VoxFormer: Sparse Voxel Transformer for Camera-Based 3D Semantic Scene Completion

Neural Fields Meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes

Align Your Latents: High-Resolution Video Synthesis With Latent Diffusion Models

Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books

Learning to Combine Mid-Level Cues for Object Proposal Generation

Enhancing Road Maps by Parsing Aerial Images Around the World

Monocular Object Instance Segmentation and Depth Ordering With CNNs

Lost Shopping! Monocular Localization in Large Indoor Spaces

Predicting Deep Zero-Shot Convolutional Neural Networks Using Textual Descriptions

Be Your Own Prada: Fashion Synthesis With Structural Coherence

Open Vocabulary Scene Parsing

Towards Diverse and Natural Image Descriptions via a Conditional GAN

TorontoCity: Seeing the World With a Million Eyes

SGN: Sequential Grouping Networks for Instance Segmentation

Situation Recognition With Graph Neural Networks

3D Graph Neural Networks for RGBD Semantic Segmentation

DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation

Neural Turtle Graphics for Modeling City Road Layouts

Meta-Sim: Learning to Generate Synthetic Datasets

Video Face Clustering With Unknown Number of Clusters

Gated-SCNN: Gated Shape CNNs for Semantic Segmentation

Learning to Caption Images Through a Lifetime by Asking Questions

Learning Indoor Inverse Rendering With 3D Spatially-Varying Lighting

Physics-Based Human Motion Estimation and Synthesis From Videos

3DStyleNet: Creating 3D Shapes With Geometric and Texture Style Variations

TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models

ATT3D: Amortized Text-to-3D Object Synthesis

Neural LiDAR Fields for Novel View Synthesis

End-to-end 3D Tracking with Decoupled Queries

DreamTeacher: Pretraining Image Backbones with Deep Generative Models

Towards Viewpoint Robustness in Bird's Eye View Segmentation

Learning Human Dynamics in Autonomous Driving Scenarios

Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid

Expressive Telepresence via Modular Codec Avatars

ScribbleBox: Interactive Annotation Framework for Video Object Segmentation

Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation

Interactive Annotation of 3D Object Geometry using 2D Scribbles

Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion

MvDeCor: Multi-View Dense Correspondence Learning for Fine-Grained 3D Segmentation

3D Object Proposals for Accurate Object Class Detection

Skip-Thought Vectors

NeurIPS 2015 (arXiv)

MovieGraphs: Towards Understanding Human-Centric Situations From Videos

Diffusion Renderer: Neural Inverse and Forward Rendering with Video Diffusion Models

PartField: Learning 3D Feature Fields for Part Segmentation and Beyond

Controllable Weather Synthesis and Removal with Video Diffusion Models

InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models

A Neural Compositional Paradigm for Image Captioning

Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer

Learning Deformable Tetrahedral Meshes for 3D Reconstruction

Variational Amodal Object Completion

Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation

Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis

Scalable Neural Data Server: A Data Recommender for Transfer Learning

ATISS: Autoregressive Transformers for Indoor Scene Synthesis

Don’t Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

EditGAN: High-Precision Semantic Image Editing

DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer

LION: Latent Point Diffusion Models for 3D Shape Generation

EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations

Optimizing Data Collection for Machine Learning

GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images

EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis