Rongrong Ji

38
Papers
1,618
Total Citations

Papers (38)

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

NeurIPS 2025
1,227
citations

Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers

CVPR 2024
118
citations

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

CVPR 2024
89
citations

AffineQuant: Affine Transformation Quantization for Large Language Models

ICLR 2024
43
citations

Towards General Visual-Linguistic Face Forgery Detection

CVPR 2025
34
citations

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation

ECCV 2024
28
citations

Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification

AAAI 2024
24
citations

AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models

ICCV 2025
13
citations

CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection

ECCV 2024
11
citations

VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding

CVPR 2025
10
citations

DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model

CVPR 2024
7
citations

FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression

CVPR 2025
4
citations

Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective

ICML 2025
3
citations

UniPTS: A Unified Framework for Proficient Post-Training Sparsity

CVPR 2024
3
citations

Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models

ICCV 2025
2
citations

From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning

ICCV 2025
2
citations

Learning Image Demoiréing from Unpaired Real Data

AAAI 2024
0
citations

PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization

CVPR 2024
0
citations

GraCo: Granularity-Controllable Interactive Segmentation

CVPR 2024
0
citations

Inter2Former: Dynamic Hybrid Attention for Efficient High-Precision Interactive Segmentation

ICCV 2025
0
citations

FocSAM: Delving Deeply into Focused Objects in Segmenting Anything

CVPR 2024
0
citations

Aligning and Prompting Everything All at Once for Universal Visual Perception

CVPR 2024
0
citations

SVFR: A Unified Framework for Generalized Video Face Restoration

CVPR 2025
0
citations

DS-VLM: Diffusion Supervision Vision Language Model

ICML 2025
0
citations

Polybasic Speculative Decoding Through a Theoretical Perspective

ICML 2025
0
citations

Outlier-aware Slicing for Post-Training Quantization in Vision Transformer

ICML 2024
0
citations

X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation

ICML 2024
0
citations

Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models

ICML 2024
0
citations

SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation

ICML 2024
0
citations

CaM: Cache Merging for Memory-efficient LLMs Inference

ICML 2024
0
citations

Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization

ICML 2024
0
citations

ERQ: Error Reduction for Post-Training Quantization of Vision Transformers

ICML 2024
0
citations

Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment

ICML 2024
0
citations

OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography

ICCV 2025
0
citations

Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity

ICML 2024
0
citations

Learning Interleaved Image-Text Comprehension in Vision-Language Large Models

ICLR 2025
0
citations

Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference

AAAI 2025
0
citations

Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers

ICCV 2025
0
citations