Rongrong Ji

138
Papers
1,842
Total Citations

Papers (138)

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

NeurIPS 2025
1,227
citations

Multiple Expert Brainstorming for Domain Adaptive Person Re-identification

ECCV 2020
188
citations

Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers

CVPR 2024
118
citations

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

CVPR 2024
89
citations

Enabling Deep Residual Networks for Weakly Supervised Object Detection

ECCV 2020
49
citations

AffineQuant: Affine Transformation Quantization for Large Language Models

ICLR 2024
43
citations

Towards General Visual-Linguistic Face Forgery Detection

CVPR 2025
34
citations

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation

ECCV 2024
28
citations

Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification

AAAI 2024
24
citations

CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection

ECCV 2024
11
citations

VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding

CVPR 2025
10
citations

DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model

CVPR 2024
7
citations

FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression

CVPR 2025
4
citations

Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective

ICML 2025
3
citations

UniPTS: A Unified Framework for Proficient Post-Training Sparsity

CVPR 2024
3
citations

Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models

ICCV 2025
2
citations

From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning

ICCV 2025
2
citations

Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models

ICML 2024
0
citations

SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation

ICML 2024
0
citations

CaM: Cache Merging for Memory-efficient LLMs Inference

ICML 2024
0
citations

Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization

ICML 2024
0
citations

ERQ: Error Reduction for Post-Training Quantization of Vision Transformers

ICML 2024
0
citations

Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment

ICML 2024
0
citations

Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity

ICML 2024
0
citations

Towards 3D Object Detection With Bimodal Deep Boltzmann Machines Over RGBD Imagery

CVPR 2015
0
citations

Understanding Image Structure via Hierarchical Shape Parsing

CVPR 2015
0
citations

Cross-Modality Binary Code Learning via Fusion Similarity Hashing

CVPR 2017
0
citations

GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition

CVPR 2018
0
citations

Modulated Convolutional Networks

CVPR 2018
0
citations

GroupCap: Group-Based Image Captioning With Structured Relevance and Diversity Constraints

CVPR 2018
0
citations

Generative Adversarial Learning Towards Fast Weakly Supervised Detection

CVPR 2018
0
citations

Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation

CVPR 2019
0
citations

Circulant Binary Convolutional Networks: Enhancing the Performance of 1-Bit DCNNs With Circulant Back Propagation

CVPR 2019
0
citations

Towards Optimal Structured CNN Pruning via Generative Adversarial Learning

CVPR 2019
0
citations

Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression

CVPR 2019
0
citations

Towards Visual Feature Translation

CVPR 2019
0
citations

Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training

CVPR 2019
0
citations

HRank: Filter Pruning Using High-Rank Feature Map

CVPR 2020
0
citations

Salience-Guided Cascaded Suppression Network for Person Re-Identification

CVPR 2020
0
citations

Projection & Probability-Driven Black-Box Attack

CVPR 2020
0
citations

Cogradient Descent for Bilinear Optimization

CVPR 2020
0
citations

AD-Cluster: Augmented Discriminative Clustering for Domain Adaptive Person Re-Identification

CVPR 2020
0
citations

Siamese Box Adaptive Network for Visual Tracking

CVPR 2020
0
citations

Filter Grafting for Deep Neural Networks

CVPR 2020
0
citations

Rethinking Performance Estimation in Neural Architecture Search

CVPR 2020
0
citations

Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

CVPR 2020
0
citations

One-Shot Adversarial Attacks on Visual Tracking With Dual Attention

CVPR 2020
0
citations

Noise-Aware Fully Webly Supervised Object Detection

CVPR 2020
0
citations

Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation

CVPR 2021
0
citations

Towards Compact CNNs via Collaborative Compression

CVPR 2021
0
citations

Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification

CVPR 2021
0
citations

Image-to-Image Translation via Hierarchical Style Disentanglement

CVPR 2021
0
citations

Beyond Max-Margin: Class Margin Equilibrium for Few-Shot Object Detection

CVPR 2021
0
citations

Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning

CVPR 2021
0
citations

RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words

CVPR 2021
0
citations

DIFNet: Boosting Visual Information Flow for Image Captioning

CVPR 2022
0
citations

Active Teacher for Semi-Supervised Object Detection

CVPR 2022
0
citations

Boosting Crowd Counting via Multifaceted Attention

CVPR 2022
0
citations

Neural Architecture Search With Representation Mutual Information

CVPR 2022
0
citations

Training-Free Transformer Architecture Search

CVPR 2022
0
citations

IntraQ: Learning Synthetic Images With Intra-Class Heterogeneity for Zero-Shot Network Quantization

CVPR 2022
0
citations

RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension

CVPR 2023
0
citations

You Only Segment Once: Towards Real-Time Panoptic Segmentation

CVPR 2023
0
citations

Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective

CVPR 2023
0
citations

Meta Architecture for Point Cloud Analysis

CVPR 2023
0
citations

STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection

CVPR 2023
0
citations

Clover: Towards a Unified Video-Language Alignment and Fusion Model

CVPR 2023
0
citations

Discriminator-Cooperated Feature Map Distillation for GAN Compression

CVPR 2023
0
citations

DistilPose: Tokenized Pose Regression With Heatmap Distillation

CVPR 2023
0
citations

RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension

CVPR 2023
0
citations

Top Rank Supervised Binary Coding for Visual Search

ICCV 2015
0
citations

Multinomial Distribution Learning for Effective Neural Architecture Search

ICCV 2019
0
citations

Universal Adversarial Perturbation via Prior Driven Uncertainty Approximation

ICCV 2019
0
citations

Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection

ICCV 2019
0
citations

Universal Perturbation Attack Against Image Retrieval

ICCV 2019
0
citations

Bayesian Optimized 1-Bit CNNs

ICCV 2019
0
citations

Scoot: A Perceptual Metric for Facial Sketches

ICCV 2019
0
citations

Architecture Disentanglement for Deep Neural Networks

ICCV 2021
0
citations

TRAR: Routing the Attention Spans in Transformer for Visual Question Answering

ICCV 2021
0
citations

ReCU: Reviving the Dead Weights in Binary Neural Networks

ICCV 2021
0
citations

EC-DARTS: Inducing Equalized and Consistent Optimization Into DARTS

ICCV 2021
0
citations

Aha! Adaptive History-Driven Attack for Decision-Based Black-Box Models

ICCV 2021
0
citations

Seminar Learning for Click-Level Weakly Supervised Semantic Segmentation

ICCV 2021
0
citations

Parallel Detection-and-Segmentation Learning for Weakly Supervised Instance Segmentation

ICCV 2021
0
citations

Occlude Them All: Occlusion-Aware Attention Network for Occluded Person Re-ID

ICCV 2021
0
citations

Pseudo-label Alignment for Semi-supervised Instance Segmentation

ICCV 2023
0
citations

AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration

ICCV 2023
0
citations

DiffRate: Differentiable Compression Rate for Efficient Vision Transformers

ICCV 2023
0
citations

Category-aware Allocation Transformer for Weakly Supervised Object Localization

ICCV 2023
0
citations

X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance

ICCV 2023
0
citations

SMMix: Self-Motivated Image Mixing for Vision Transformers

ICCV 2023
0
citations

Automatic Network Pruning via Hilbert-Schmidt Independence Criterion Lasso under Information Bottleneck Principle

ICCV 2023
0
citations

Anti-Bandit Neural Architecture Search for Model Defense

ECCV 2020
0
citations

API-Net: Robust Generative Classifier via a Single Discriminator

ECCV 2020
0
citations

SSCGAN: Facial Attribute Editing via Style Skip Connections

ECCV 2020
0
citations

Interpretable Neural Network Decoupling

ECCV 2020
0
citations

PAMS: Quantized Super-Resolution via Parameterized Max Scale

ECCV 2020
0
citations

Improving Face Recognition from Hard Samples via Distribution Distillation Loss

ECCV 2020
0
citations

Black-Box Dissector: Towards Erasing-Based Hard-Label Model Stealing Attack

ECCV 2022
0
citations

ECO-TR: Efficient Correspondences Finding via Coarse-to-Fine Refinement

ECCV 2022
0
citations

Fine-Grained Data Distribution Alignment for Post-Training Quantization

ECCV 2022
0
citations

Privacy-Preserving Face Recognition with Learnable Privacy Budgets in Frequency Domain

ECCV 2022
0
citations

An Information Theoretic Approach for Attention-Driven Face Forgery Detection

ECCV 2022
0
citations

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation

ECCV 2022
0
citations

Dynamic Dual Trainable Bounds for Ultra-Low Precision Super-Resolution Networks

ECCV 2022
0
citations

ARM: Any-Time Super-Resolution Method

ECCV 2022
0
citations

SeqTR: A Simple Yet Universal Network for Visual Grounding

ECCV 2022
0
citations

InterFormer: Real-time Interactive Image Segmentation

ICCV 2023
0
citations

SVFR: A Unified Framework for Generalized Video Face Restoration

CVPR 2025
0
citations

AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models

ICCV 2025
0
citations

Inter2Former: Dynamic Hybrid Attention for Efficient High-Precision Interactive Segmentation

ICCV 2025
0
citations

Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers

ICCV 2025
0
citations

OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography

ICCV 2025
0
citations

Learning Interleaved Image-Text Comprehension in Vision-Language Large Models

ICLR 2025
0
citations

Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference

AAAI 2025
0
citations

Learning Image Demoireing from Unpaired Real Data

AAAI 2024
0
citations

PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization

CVPR 2024
0
citations

GraCo: Granularity-Controllable Interactive Segmentation

CVPR 2024
0
citations

FocSAM: Delving Deeply into Focused Objects in Segmenting Anything

CVPR 2024
0
citations

Aligning and Prompting Everything All at Once for Universal Visual Perception

CVPR 2024
0
citations

DS-VLM: Diffusion Supervision Vision Language Model

ICML 2025
0
citations

Polybasic Speculative Decoding Through a Theoretical Perspective

ICML 2025
0
citations

Outlier-aware Slicing for Post-Training Quantization in Vision Transformer

ICML 2024
0
citations

X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation

ICML 2024
0
citations

FreeAnchor: Learning to Match Anchors for Visual Object Detection

NeurIPS 2019
0
citations

Information Competing Process for Learning Diversified Representations

NeurIPS 2019
0
citations

Variational Structured Semantic Inference for Diverse Image Captioning

NeurIPS 2019
0
citations

UWSOD: Toward Fully-Supervised-Level Capacity Weakly Supervised Object Detection

NeurIPS 2020
0
citations

Rotated Binary Neural Network

NeurIPS 2020
0
citations

Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme

NeurIPS 2021
0
citations

Learning Best Combination for Efficient N:M Sparsity

NeurIPS 2022
0
citations

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

NeurIPS 2022
0
citations

PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining

NeurIPS 2022
0
citations

Improving Adversarial Robustness via Information Bottleneck Distillation

NeurIPS 2023
0
citations

Discover and Align Taxonomic Context Priors for Open-world Semi-Supervised Learning

NeurIPS 2023
0
citations

Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models

NeurIPS 2023
0
citations

Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models

NeurIPS 2023
0
citations

CAPro: Webly Supervised Learning with Cross-modality Aligned Prototypes

NeurIPS 2023
0
citations