Rita Cucchiara
11 Papers
48 Total Citations
Papers (11)
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
ICCV 2025 (arXiv)
22
citations
Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering
CVPR 2025
10
citations
Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas
ECCV 2024
6
citations
Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval
CVPR 2025
4
citations
Diffusion Transformers for Tabular Data Time Series Generation
ICLR 2025 (arXiv)
3
citations
What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models
ICCV 2025
2
citations
Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction
ICCV 2025
1
citation
Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation
CVPR 2024
0
citations
Zero-Shot Styled Text Image Generation, but Make It Autoregressive
CVPR 2025
0
citations
MissRAG: Addressing the Missing Modality Challenge in Multimodal Large Language Models
ICCV 2025
0
citations
Hyperbolic Safety-Aware Vision-Language Models
CVPR 2025
0
citations