Jan Kautz

116
Papers
5,222
Total Citations

Papers (116)

Unsupervised Image-to-Image Translation Networks

NeurIPS 2017 (arXiv)
2,892
citations

VILA: On Pre-training for Visual Language Models

CVPR 2024
685
citations

FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

CVPR 2024
412
citations

Learning Affinity via Spatial Propagation Networks

NeurIPS 2017 (arXiv)
300
citations

A Variational Perspective on Solving Inverse Problems with Diffusion Models

ICLR 2024
207
citations

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?

CVPR 2024
169
citations

Gated Delta Networks: Improving Mamba2 with Delta Rule

ICLR 2025
141
citations

FoundationStereo: Zero-Shot Stereo Matching

CVPR 2025
98
citations

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

NeurIPS 2025
96
citations

GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning

CVPR 2024
69
citations

One-Minute Video Generation with Test-Time Training

CVPR 2025
65
citations

Hymba: A Hybrid-head Architecture for Small Language Models

ICLR 2025 (arXiv)
55
citations

Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought

CVPR 2025
19
citations

LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing

ICLR 2025
4
citations

HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis

ICCV 2025
4
citations

Parallel Sequence Modeling via Generalized Spatial Propagation Network

CVPR 2025 (arXiv)
3
citations

AdaHuman: Animatable Detailed 3D Human Generation with Compositional Multiview Diffusion

ICCV 2025
3
citations

Learning Superpixels With Segmentation-Aware Affinity Loss

CVPR 2018
0
citations

MoCoGAN: Decomposing Motion and Content for Video Generation

CVPR 2018 (arXiv)
0
citations

Improving Landmark Localization With Semi-Supervised Learning

CVPR 2018 (arXiv)
0
citations

SPLATNet: Sparse Lattice Networks for Point Cloud Processing

CVPR 2018 (arXiv)
0
citations

Geometry-Aware Learning of Maps for Camera Localization

CVPR 2018 (arXiv)
0
citations

Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals

CVPR 2018 (arXiv)
0
citations

Making Convolutional Networks Recurrent for Visual Sequence Learning

CVPR 2018
0
citations

Deep Semantic Face Deblurring

CVPR 2018 (arXiv)
0
citations

High-Resolution Image Synthesis and Semantic Manipulation With Conditional GANs

CVPR 2018 (arXiv)
0
citations

PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume

CVPR 2018
0
citations

Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation

CVPR 2018 (arXiv)
0
citations

STEP: Spatio-Temporal Progressive Learning for Video Action Detection

CVPR 2019
0
citations

SCOPS: Self-Supervised Co-Part Segmentation

CVPR 2019
0
citations

Joint Discriminative and Generative Learning for Person Re-Identification

CVPR 2019
0
citations

Learning Linear Transformations for Fast Image and Video Style Transfer

CVPR 2019
0
citations

PlaneRCNN: 3D Plane Detection and Reconstruction From a Single Image

CVPR 2019
0
citations

Neural RGB(r)D Sensing: Depth and Uncertainty From a Video Camera

CVPR 2019
0
citations

Pixel-Adaptive Convolutional Neural Networks

CVPR 2019
0
citations

Importance Estimation for Neural Network Pruning

CVPR 2019
0
citations

Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments

CVPR 2019
0
citations

Bi3D: Stereo Depth Estimation via Binary Classifications

CVPR 2020 (arXiv)
0
citations

Meshlet Priors for 3D Mesh Reconstruction

CVPR 2020 (arXiv)
0
citations

Self-Supervised Viewpoint Learning From Image Collections

CVPR 2020 (arXiv)
0
citations

Two-Shot Spatially-Varying BRDF and Shape Estimation

CVPR 2020 (arXiv)
0
citations

Novel View Synthesis of Dynamic Scenes With Globally Coherent Depths From a Monocular Camera

CVPR 2020 (arXiv)
0
citations

Weakly-Supervised 3D Human Pose Learning via Multi-View Images in the Wild

CVPR 2020 (arXiv)
0
citations

Dreaming to Distill: Data-Free Knowledge Transfer via DeepInversion

CVPR 2020 (arXiv)
0
citations

UNAS: Differentiable Architecture Search Meets Reinforcement Learning

CVPR 2020 (arXiv)
0
citations

Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection

CVPR 2020 (arXiv)
0
citations

Binary TTC: A Temporal Geofence for Autonomous Navigation

CVPR 2021 (arXiv)
0
citations

Learning to Track Instances without Video Annotations

CVPR 2021 (arXiv)
0
citations

Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models

CVPR 2021
0
citations

Weakly-Supervised Physically Unconstrained Gaze Estimation

CVPR 2021 (arXiv)
0
citations

See Through Gradients: Image Batch Recovery via GradInversion

CVPR 2021 (arXiv)
0
citations

DexYCB: A Benchmark for Capturing Hand Grasping of Objects

CVPR 2021 (arXiv)
0
citations

FreeSOLO: Learning To Segment Objects Without Annotations

CVPR 2022 (arXiv)
0
citations

CoordGAN: Self-Supervised Dense Correspondences Emerge From GANs

CVPR 2022 (arXiv)
0
citations

GradViT: Gradient Inversion of Vision Transformers

CVPR 2022 (arXiv)
0
citations

GLAMR: Global Occlusion-Aware Human Mesh Recovery With Dynamic Cameras

CVPR 2022 (arXiv)
0
citations

GroupViT: Semantic Segmentation Emerges From Text Supervision

CVPR 2022 (arXiv)
0
citations

A-ViT: Adaptive Tokens for Efficient Vision Transformer

CVPR 2022
0
citations

Zero-Shot Pose Transfer for Unrigged Stylized 3D Characters

CVPR 2023 (arXiv)
0
citations

Global Vision Transformer Pruning With Hessian-Aware Saliency

CVPR 2023 (arXiv)
0
citations

Recurrence Without Recurrence: Stable Video Landmark Detection With Deep Equilibrium Models

CVPR 2023 (arXiv)
0
citations

BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects

CVPR 2023
0
citations

Heterogeneous Continual Learning

CVPR 2023
0
citations

The Best Defense Is a Good Offense: Adversarial Augmentation Against Adversarial Attacks

CVPR 2023
0
citations

Robust Model-Based 3D Head Pose Estimation

ICCV 2015
0
citations

A Lightweight Approach for On-The-Fly Reflectance Estimation

ICCV 2017 (arXiv)
0
citations

Intrinsic3D: High-Quality 3D Reconstruction by Joint Appearance and Geometry Optimization With Spatially-Varying Lighting

ICCV 2017 (arXiv)
0
citations

Learning Propagation for Arbitrarily-Structured Data

ICCV 2019
0
citations

Unsupervised Video Interpolation Using Cycle Consistency

ICCV 2019
0
citations

SENSE: A Shared Encoder Network for Scene-Flow Estimation

ICCV 2019
0
citations

Extreme View Synthesis

ICCV 2019
0
citations

Neural Inverse Rendering of an Indoor Scene From a Single Image

ICCV 2019
0
citations

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

CVPR 2025
0
citations

Few-Shot Unsupervised Image-to-Image Translation

ICCV 2019
0
citations

Learning Indoor Inverse Rendering With 3D Spatially-Varying Lighting

ICCV 2021 (arXiv)
0
citations

Self-Supervised Object Detection via Generative Image Synthesis

ICCV 2021 (arXiv)
0
citations

RANA: Relightable Articulated Neural Avatars

ICCV 2023 (arXiv)
0
citations

PhysDiff: Physics-Guided Human Motion Diffusion Model

ICCV 2023 (arXiv)
0
citations

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification

ECCV 2020
0
citations

Contrastive Learning for Weakly Supervised Phrase Grounding

ECCV 2020
0
citations

DeepGMR: Learning Latent Gaussian Mixture Models for Registration

ECCV 2020
0
citations

Self-supervised Single-view 3D Reconstruction via Semantic Consistency

ECCV 2020
0
citations

Weakly Supervised 3D Hand Pose Estimation via Biomechanical Constraints

ECCV 2020
0
citations

UFO²: A Unified Framework towards Omni-supervised Object Detection

ECCV 2020
0
citations

Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion

ECCV 2022
0
citations

LANA: Latency Aware Network Acceleration

ECCV 2022
0
citations

Few-Shot Adaptive Gaze Estimation

ICCV 2019
0
citations

OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning

CVPR 2025
0
citations

Scaling Vision Pre-Training to 4K Resolution

CVPR 2025
0
citations

NVILA: Efficient Frontier Visual Language Models

CVPR 2025
0
citations

RADIOv2.5: Improved Baselines for Agglomerative Vision Foundation Models

CVPR 2025
0
citations

SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing

CVPR 2025
0
citations

Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation

CVPR 2025
0
citations

GENMO: A GENeralist Model for Human MOtion

ICCV 2025
0
citations

GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion

ICCV 2025
0
citations

COLMAP-Free 3D Gaussian Splatting

CVPR 2024
0
citations

AM-RADIO: Agglomerative Vision Foundation Model Reduce All Domains Into One

CVPR 2024
0
citations

Flextron: Many-in-One Flexible Large Language Model

ICML 2024
0
citations

Modeling Object Appearance Using Context-Conditioned Component Analysis

CVPR 2015
0
citations

Online Detection and Classification of Dynamic Hand Gestures With Recurrent 3D Convolutional Neural Network

CVPR 2016
0
citations

Accelerated Generative Models for 3D Point Cloud Data

CVPR 2016
0
citations

Dynamic Facial Analysis: From Bayesian Filtering to Recurrent Neural Network

CVPR 2017
0
citations

Polarimetric Multi-View Stereo

CVPR 2017
0
citations

Context-aware Synthesis and Placement of Object Instances

NeurIPS 2018
0
citations

Video-to-Video Synthesis

NeurIPS 2018
0
citations

Joint-task Self-supervised Learning for Temporal Correspondence

NeurIPS 2019
0
citations

Few-shot Video-to-Video Synthesis

NeurIPS 2019
0
citations

Dancing to Music

NeurIPS 2019
0
citations

Convolutional Tensor-Train LSTM for Spatio-Temporal Learning

NeurIPS 2020
0
citations

Online Adaptation for Consistent Mesh Reconstruction in the Wild

NeurIPS 2020
0
citations

NVAE: A Deep Hierarchical Variational Autoencoder

NeurIPS 2020
0
citations

A Contrastive Learning Approach for Training Variational Autoencoder Priors

NeurIPS 2021
0
citations

Coupled Segmentation and Edge Learning via Dynamic Graph Propagation

NeurIPS 2021
0
citations

Score-based Generative Modeling in Latent Space

NeurIPS 2021
0
citations

Generalizable One-shot 3D Neural Head Avatar

NeurIPS 2023
0
citations

Convolutional State Space Models for Long-Range Spatiotemporal Modeling

NeurIPS 2023
0
citations