Jose M. Alvarez

Papers: 38

Total Citations: 170

Papers (38)

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?

MDP: Multidimensional Vision Model Pruning with Latency Constraint

Joint Optimization of Neural Radiance Fields and Continuous Camera Motion from a Monocular Video

PARC: A Quantitative Framework Uncovering the Symmetries within Vision Language Models

Hydra-NeXt: Robust Closed-Loop Driving with Open-Loop Training

BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection

Improving Distant 3D Object Detection Using 2D Box Supervision

Mining Supervision for Dynamic Regions in Self-Supervised Monocular Depth Estimation

Emotion Recognition in Context

Cost Volume Pyramid Based Depth Inference for Multi-View Stereo

Dreaming to Distill: Data-Free Knowledge Transfer via DeepInversion

Optimal Quantization Using Scaled Codebook

Self-Supervised Learning of Depth Inference for Multi-View Stereo

See Through Gradients: Image Batch Recovery via GradInversion

FreeSOLO: Learning To Segment Objects Without Annotations

Non-Parametric Depth Distribution Modelling Based Depth Inference for Multi-View Stereo

Not All Labels Are Equal: Rationalizing the Labeling Costs for Training Object Detection

Panoptic SegFormer: Delving Deeper Into Panoptic Segmentation With Transformers

A-ViT: Adaptive Tokens for Efficient Vision Transformer

How Much More Data Do I Need? Estimating Requirements for Downstream Tasks

Vision Transformers Are Good Mask Auto-Labelers

Knowledge Distillation for 6D Pose Estimation by Aligning Distributions of Local Predictions

VoxFormer: Sparse Voxel Transformer for Camera-Based 3D Semantic Scene Completion

Bringing Background Into the Foreground: Making All Classes Equal in Weakly-Supervised Video Semantic Segmentation

Domain-Adaptive Deep Network Compression

Active Learning for Deep Object Detection via Probabilistic Modeling

Fully Attentional Networks with Self-emerging Token Labeling

FB-BEV: BEV Representation from Forward-Backward View Transformations

Towards Viewpoint Robustness in Bird's Eye View Segmentation

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

Parametric Depth Based Feature Representation Learning for Object Detection and Segmentation in Bird's-Eye View

When To Prune? A Policy Towards Early Structural Pruning

OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning

ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks

Distilling Image Classifiers in Object Detectors

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

Structural Pruning via Latency-Saliency Knapsack

Optimizing Data Collection for Machine Learning