Rongrong Ji
38
Papers
1,618
Total Citations
Papers (38)
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
NeurIPS 2025
1,227
citations
Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers
CVPR 2024
118
citations
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation
CVPR 2024
89
citations
AffineQuant: Affine Transformation Quantization for Large Language Models
ICLR 2024
43
citations
Towards General Visual-Linguistic Face Forgery Detection
CVPR 2025
34
citations
AccDiffusion: An Accurate Method for Higher-Resolution Image Generation
ECCV 2024
28
citations
Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification
AAAI 2024 (arXiv)
24
citations
AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models
ICCV 2025 (arXiv)
13
citations
CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection
ECCV 2024
11
citations
VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding
CVPR 2025 (arXiv)
10
citations
DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model
CVPR 2024
7
citations
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression
CVPR 2025
4
citations
Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective
ICML 2025
3
citations
UniPTS: A Unified Framework for Proficient Post-Training Sparsity
CVPR 2024
3
citations
Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models
ICCV 2025 (arXiv)
2
citations
From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning
ICCV 2025
2
citations
Learning Image Demoireing from Unpaired Real Data
AAAI 2024
0
citations
PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization
CVPR 2024
0
citations
GraCo: Granularity-Controllable Interactive Segmentation
CVPR 2024
0
citations
Inter2Former: Dynamic Hybrid Attention for Efficient High-Precision Interactive Segmentation
ICCV 2025
0
citations
FocSAM: Delving Deeply into Focused Objects in Segmenting Anything
CVPR 2024
0
citations
Aligning and Prompting Everything All at Once for Universal Visual Perception
CVPR 2024
0
citations
SVFR: A Unified Framework for Generalized Video Face Restoration
CVPR 2025
0
citations
DS-VLM: Diffusion Supervision Vision Language Model
ICML 2025
0
citations
Polybasic Speculative Decoding Through a Theoretical Perspective
ICML 2025
0
citations
Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
ICML 2024
0
citations
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation
ICML 2024
0
citations
Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models
ICML 2024
0
citations
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation
ICML 2024
0
citations
CaM: Cache Merging for Memory-efficient LLMs Inference
ICML 2024
0
citations
Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization
ICML 2024
0
citations
ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
ICML 2024
0
citations
Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment
ICML 2024
0
citations
OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography
ICCV 2025
0
citations
Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity
ICML 2024
0
citations
Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
ICLR 2025
0
citations
Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference
AAAI 2025
0
citations
Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers
ICCV 2025
0
citations