Qing Li

48
Papers
216
Total Citations

Papers (48)

CLOVA: A Closed-LOop Visual Assistant with Tool Usage and Update

CVPR 2024
45
citations

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

ICLR 2025
37
citations

Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation

ICCV 2025 (arXiv)
24
citations

Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World

ICLR 2024
17
citations

Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis

CVPR 2025
17
citations

Neural-Symbolic Recursive Machine for Systematic Generalization

ICLR 2024
14
citations

CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization

AAAI 2025
11
citations

Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding

ICCV 2025
11
citations

MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge

ICLR 2025
8
citations

Cross Initialization for Face Personalization of Text-to-Image Models

CVPR 2024
8
citations

ESE: Espresso Sentence Embeddings

ICLR 2025
7
citations

EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models

AAAI 2025
5
citations

Efficient Robustness Evaluation via Constraint Relaxation

AAAI 2025
3
citations

FIRM: Flexible Interactive Reflection ReMoval

AAAI 2025
3
citations

SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders

ICCV 2025
3
citations

SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs

CVPR 2025
2
citations

PairEdit: Learning Semantic Variations for Exemplar-based Image Editing

NeurIPS 2025
1
citation

Locally-Transferred Fisher Vectors for Texture Classification

ICCV 2017
0
citations

Why Does a Visual Question Have Different Answers?

ICCV 2019 (arXiv)
0
citations

YouRefIt: Embodied Reference Understanding With Language and Gesture

ICCV 2021 (arXiv)
0
citations

VLGrammar: Grounded Grammar Induction of Vision and Language

ICCV 2021 (arXiv)
0
citations

3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment

ICCV 2023
0
citations

A Competence-aware Curriculum for Visual Concepts Learning via Question Answering

ECCV 2020
0
citations

Suppressing Mislabeled Data via Grouping and Self-Attention

ECCV 2020
0
citations

Revolutionizing Encrypted Traffic Classification with MH-Net: A Multi-View Heterogeneous Graph Model

AAAI 2025
0
citations

Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior

CVPR 2025
0
citations

METASCENES: Towards Automated Replica Creation for Real-world 3D Scans

CVPR 2025
0
citations

Learning Normals of Noisy Points by Local Gradient-Aware Surface Filtering

ICCV 2025
0
citations

Explicitly Guided Difficulty-Controllable Visual Question Generation

AAAI 2025
0
citations

Automated Defect Report Generation for Enhanced Industrial Quality Control

AAAI 2024
0
citations

One-Step Forward and Backtrack: Overcoming Zig-Zagging in Loss-Aware Quantization Training

AAAI 2024
0
citations

An Embodied Generalist Agent in 3D World

ICML 2024
0
citations

End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations

ICML 2024
0
citations

Fusing Subcategory Probabilities for Texture Classification

CVPR 2015
0
citations

VizWiz Grand Challenge: Answering Visual Questions From Blind People

CVPR 2018 (arXiv)
0
citations

Unsupervised Cross-Dataset Person Re-Identification by Transfer Learning of Spatial-Temporal Patterns

CVPR 2018 (arXiv)
0
citations

VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People

CVPR 2019
0
citations

LO-Net: Deep Real-Time Lidar Odometry

CVPR 2019
0
citations

GaitPart: Temporal Part-Based Model for Gait Recognition

CVPR 2020
0
citations

SHS-Net: Learning Signed Hyper Surfaces for Oriented Normal Estimation of Point Clouds

CVPR 2023
0
citations

Least Squares Generative Adversarial Networks

ICCV 2017 (arXiv)
0
citations

HSurf-Net: Normal Estimation for 3D Point Clouds by Learning Hyper Surfaces

NeurIPS 2022
0
citations

Fairness Reprogramming

NeurIPS 2022
0
citations

SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models

NeurIPS 2023
0
citations

Learning non-Markovian Decision-Making from State-only Sequences

NeurIPS 2023
0
citations

Interpreting Unsupervised Anomaly Detection in Security via Rule Extraction

NeurIPS 2023
0
citations

NeuralGF: Unsupervised Point Normal Estimation by Learning Neural Gradient Function

NeurIPS 2023
0
citations

Metis: Understanding and Enhancing In-Network Regular Expressions

NeurIPS 2023
0
citations