Jan Kautz
Papers: 116
Total Citations: 5,222
Papers (116)
Unsupervised Image-to-Image Translation Networks
NeurIPS 2017 (arXiv)
2,892
citations
VILA: On Pre-training for Visual Language Models
CVPR 2024
685
citations
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
CVPR 2024
412
citations
Learning Affinity via Spatial Propagation Networks
NeurIPS 2017 (arXiv)
300
citations
A Variational Perspective on Solving Inverse Problems with Diffusion Models
ICLR 2024
207
citations
Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?
CVPR 2024
169
citations
Gated Delta Networks: Improving Mamba2 with Delta Rule
ICLR 2025
141
citations
FoundationStereo: Zero-Shot Stereo Matching
CVPR 2025
98
citations
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
NeurIPS 2025
96
citations
GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning
CVPR 2024
69
citations
One-Minute Video Generation with Test-Time Training
CVPR 2025
65
citations
Hymba: A Hybrid-head Architecture for Small Language Models
ICLR 2025 (arXiv)
55
citations
Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought
CVPR 2025
19
citations
LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing
ICLR 2025
4
citations
HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis
ICCV 2025
4
citations
Parallel Sequence Modeling via Generalized Spatial Propagation Network
CVPR 2025 (arXiv)
3
citations
AdaHuman: Animatable Detailed 3D Human Generation with Compositional Multiview Diffusion
ICCV 2025
3
citations
Learning Superpixels With Segmentation-Aware Affinity Loss
CVPR 2018
0
citations
MoCoGAN: Decomposing Motion and Content for Video Generation
CVPR 2018 (arXiv)
0
citations
Improving Landmark Localization With Semi-Supervised Learning
CVPR 2018 (arXiv)
0
citations
SPLATNet: Sparse Lattice Networks for Point Cloud Processing
CVPR 2018 (arXiv)
0
citations
Geometry-Aware Learning of Maps for Camera Localization
CVPR 2018 (arXiv)
0
citations
Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals
CVPR 2018 (arXiv)
0
citations
Making Convolutional Networks Recurrent for Visual Sequence Learning
CVPR 2018
0
citations
Deep Semantic Face Deblurring
CVPR 2018 (arXiv)
0
citations
High-Resolution Image Synthesis and Semantic Manipulation With Conditional GANs
CVPR 2018 (arXiv)
0
citations
PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
CVPR 2018
0
citations
Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation
CVPR 2018 (arXiv)
0
citations
STEP: Spatio-Temporal Progressive Learning for Video Action Detection
CVPR 2019
0
citations
SCOPS: Self-Supervised Co-Part Segmentation
CVPR 2019
0
citations
Joint Discriminative and Generative Learning for Person Re-Identification
CVPR 2019
0
citations
Learning Linear Transformations for Fast Image and Video Style Transfer
CVPR 2019
0
citations
PlaneRCNN: 3D Plane Detection and Reconstruction From a Single Image
CVPR 2019
0
citations
Neural RGB(r)D Sensing: Depth and Uncertainty From a Video Camera
CVPR 2019
0
citations
Pixel-Adaptive Convolutional Neural Networks
CVPR 2019
0
citations
Importance Estimation for Neural Network Pruning
CVPR 2019
0
citations
Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments
CVPR 2019
0
citations
Bi3D: Stereo Depth Estimation via Binary Classifications
CVPR 2020 (arXiv)
0
citations
Meshlet Priors for 3D Mesh Reconstruction
CVPR 2020 (arXiv)
0
citations
Self-Supervised Viewpoint Learning From Image Collections
CVPR 2020 (arXiv)
0
citations
Two-Shot Spatially-Varying BRDF and Shape Estimation
CVPR 2020 (arXiv)
0
citations
Novel View Synthesis of Dynamic Scenes With Globally Coherent Depths From a Monocular Camera
CVPR 2020 (arXiv)
0
citations
Weakly-Supervised 3D Human Pose Learning via Multi-View Images in the Wild
CVPR 2020 (arXiv)
0
citations
Dreaming to Distill: Data-Free Knowledge Transfer via DeepInversion
CVPR 2020 (arXiv)
0
citations
UNAS: Differentiable Architecture Search Meets Reinforcement Learning
CVPR 2020 (arXiv)
0
citations
Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection
CVPR 2020 (arXiv)
0
citations
Binary TTC: A Temporal Geofence for Autonomous Navigation
CVPR 2021 (arXiv)
0
citations
Learning to Track Instances without Video Annotations
CVPR 2021 (arXiv)
0
citations
Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models
CVPR 2021
0
citations
Weakly-Supervised Physically Unconstrained Gaze Estimation
CVPR 2021 (arXiv)
0
citations
See Through Gradients: Image Batch Recovery via GradInversion
CVPR 2021 (arXiv)
0
citations
DexYCB: A Benchmark for Capturing Hand Grasping of Objects
CVPR 2021 (arXiv)
0
citations
FreeSOLO: Learning To Segment Objects Without Annotations
CVPR 2022 (arXiv)
0
citations
CoordGAN: Self-Supervised Dense Correspondences Emerge From GANs
CVPR 2022 (arXiv)
0
citations
GradViT: Gradient Inversion of Vision Transformers
CVPR 2022 (arXiv)
0
citations
GLAMR: Global Occlusion-Aware Human Mesh Recovery With Dynamic Cameras
CVPR 2022 (arXiv)
0
citations
GroupViT: Semantic Segmentation Emerges From Text Supervision
CVPR 2022 (arXiv)
0
citations
A-ViT: Adaptive Tokens for Efficient Vision Transformer
CVPR 2022
0
citations
Zero-Shot Pose Transfer for Unrigged Stylized 3D Characters
CVPR 2023 (arXiv)
0
citations
Global Vision Transformer Pruning With Hessian-Aware Saliency
CVPR 2023 (arXiv)
0
citations
Recurrence Without Recurrence: Stable Video Landmark Detection With Deep Equilibrium Models
CVPR 2023 (arXiv)
0
citations
BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
CVPR 2023
0
citations
Heterogeneous Continual Learning
CVPR 2023
0
citations
The Best Defense Is a Good Offense: Adversarial Augmentation Against Adversarial Attacks
CVPR 2023
0
citations
Robust Model-Based 3D Head Pose Estimation
ICCV 2015
0
citations
A Lightweight Approach for On-The-Fly Reflectance Estimation
ICCV 2017 (arXiv)
0
citations
Intrinsic3D: High-Quality 3D Reconstruction by Joint Appearance and Geometry Optimization With Spatially-Varying Lighting
ICCV 2017 (arXiv)
0
citations
Learning Propagation for Arbitrarily-Structured Data
ICCV 2019
0
citations
Unsupervised Video Interpolation Using Cycle Consistency
ICCV 2019
0
citations
SENSE: A Shared Encoder Network for Scene-Flow Estimation
ICCV 2019
0
citations
Extreme View Synthesis
ICCV 2019
0
citations
Neural Inverse Rendering of an Indoor Scene From a Single Image
ICCV 2019
0
citations
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
CVPR 2025
0
citations
Few-Shot Unsupervised Image-to-Image Translation
ICCV 2019
0
citations
Learning Indoor Inverse Rendering With 3D Spatially-Varying Lighting
ICCV 2021 (arXiv)
0
citations
Self-Supervised Object Detection via Generative Image Synthesis
ICCV 2021 (arXiv)
0
citations
RANA: Relightable Articulated Neural Avatars
ICCV 2023 (arXiv)
0
citations
PhysDiff: Physics-Guided Human Motion Diffusion Model
ICCV 2023 (arXiv)
0
citations
Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification
ECCV 2020
0
citations
Contrastive Learning for Weakly Supervised Phrase Grounding
ECCV 2020
0
citations
DeepGMR: Learning Latent Gaussian Mixture Models for Registration
ECCV 2020
0
citations
Self-supervised Single-view 3D Reconstruction via Semantic Consistency
ECCV 2020
0
citations
Weakly Supervised 3D Hand Pose Estimation via Biomechanical Constraints
ECCV 2020
0
citations
UFO²: A Unified Framework towards Omni-supervised Object Detection
ECCV 2020
0
citations
Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion
ECCV 2022
0
citations
LANA: Latency Aware Network Acceleration
ECCV 2022
0
citations
Few-Shot Adaptive Gaze Estimation
ICCV 2019
0
citations
OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning
CVPR 2025
0
citations
Scaling Vision Pre-Training to 4K Resolution
CVPR 2025
0
citations
NVILA: Efficient Frontier Visual Language Models
CVPR 2025
0
citations
RADIOv2.5: Improved Baselines for Agglomerative Vision Foundation Models
CVPR 2025
0
citations
SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing
CVPR 2025
0
citations
Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation
CVPR 2025
0
citations
GENMO: A GENeralist Model for Human MOtion
ICCV 2025
0
citations
GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion
ICCV 2025
0
citations
COLMAP-Free 3D Gaussian Splatting
CVPR 2024
0
citations
AM-RADIO: Agglomerative Vision Foundation Model Reduce All Domains Into One
CVPR 2024
0
citations
Flextron: Many-in-One Flexible Large Language Model
ICML 2024
0
citations
Modeling Object Appearance Using Context-Conditioned Component Analysis
CVPR 2015
0
citations
Online Detection and Classification of Dynamic Hand Gestures With Recurrent 3D Convolutional Neural Network
CVPR 2016
0
citations
Accelerated Generative Models for 3D Point Cloud Data
CVPR 2016
0
citations
Dynamic Facial Analysis: From Bayesian Filtering to Recurrent Neural Network
CVPR 2017
0
citations
Polarimetric Multi-View Stereo
CVPR 2017
0
citations
Context-aware Synthesis and Placement of Object Instances
NeurIPS 2018
0
citations
Video-to-Video Synthesis
NeurIPS 2018
0
citations
Joint-task Self-supervised Learning for Temporal Correspondence
NeurIPS 2019
0
citations
Few-shot Video-to-Video Synthesis
NeurIPS 2019
0
citations
Dancing to Music
NeurIPS 2019
0
citations
Convolutional Tensor-Train LSTM for Spatio-Temporal Learning
NeurIPS 2020
0
citations
Online Adaptation for Consistent Mesh Reconstruction in the Wild
NeurIPS 2020
0
citations
NVAE: A Deep Hierarchical Variational Autoencoder
NeurIPS 2020
0
citations
A Contrastive Learning Approach for Training Variational Autoencoder Priors
NeurIPS 2021
0
citations
Coupled Segmentation and Edge Learning via Dynamic Graph Propagation
NeurIPS 2021
0
citations
Score-based Generative Modeling in Latent Space
NeurIPS 2021
0
citations
Generalizable One-shot 3D Neural Head Avatar
NeurIPS 2023
0
citations
Convolutional State Space Models for Long-Range Spatiotemporal Modeling
NeurIPS 2023
0
citations