Federico Tombari

29
Papers
229
Total Citations

Papers (29)

SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation

CVPR 2024
53
citations

LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models

CVPR 2025
44
citations

Learning to Prompt with Text Only Supervision for Vision-Language Models

AAAI 2025
40
citations

CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation

ICLR 2025
20
citations

Active Data Curation Effectively Distills Large-Scale Multimodal Models

CVPR 2025
14
citations

Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos

CVPR 2025
14
citations

LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

CVPR 2025
11
citations

Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation

CVPR 2025
6
citations

Video Perception Models for 3D Scene Synthesis

NeurIPS 2025
5
citations

One2Any: One-Reference 6D Pose Estimation for Any Object

CVPR 2025
5
citations

Gatekeeper: Improving Model Cascades Through Confidence Tuning

NeurIPS 2025
4
citations

Test-Time Visual In-Context Tuning

CVPR 2025
4
citations

4D Gaussian Splatting SLAM

ICCV 2025
3
citations

KP-RED: Exploiting Semantic Keypoints for Joint 3D Shape Retrieval and Deformation

CVPR 2024
3
citations

Prior2Former - Evidential Modeling of Mask Transformers for Assumption-Free Open-World Panoptic Segmentation

ICCV 2025
2
citations

UIP2P: Unsupervised Instruction-based Image Editing via Edit Reversibility Constraint

ICCV 2025
1
citation

Extracting Training Data From Document-Based VQA Models

ICML 2024
0
citations

RelationField: Relate Anything in Radiance Fields

CVPR 2025
0
citations

ESCAPE: Equivariant Shape Completion via Anchor Point Encoding

CVPR 2025
0
citations

MOBIUS: Big-to-Mobile Universal Instance Segmentation via Multi-modal Bottleneck Fusion and Calibrated Decoder Pruning

ICCV 2025
0
citations

Contrastive Test-Time Composition of Multiple LoRA Models for Image Generation

ICCV 2025
0
citations

Hierarchical 3D Scene Graphs Construction Outdoors

ICCV 2025
0
citations

Object-X: Learning to Reconstruct Multi-Modal 3D Object Representations

NeurIPS 2025
0
citations

SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes

CVPR 2024
0
citations

CONFORM: Contrast is All You Need for High-Fidelity Text-to-Image Diffusion Models

CVPR 2024
0
citations

Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning

CVPR 2024
0
citations

MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision

CVPR 2024
0
citations

HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D Text2Shape Generation

CVPR 2024
0
citations

UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image

CVPR 2025
0
citations