Qi Chen

32
Papers
152
Total Citations

Papers (32)

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

NeurIPS 2025
81
citations

WebVLN: Vision-and-Language Navigation on Websites

AAAI 2024, arXiv
19
citations

CREAD: A Classification-Restoration Framework with Error Adaptive Discretization for Watch Time Prediction in Video Recommender Systems

AAAI 2024, arXiv
15
citations

PairAug: What Can Augmented Image-Text Pairs Do for Radiology?

CVPR 2024
12
citations

From Words to Structured Visuals: A Benchmark and Framework for Text-to-Diagram Generation and Editing

CVPR 2025
11
citations

Attention-Driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models Without Fine-Tuning

AAAI 2025
9
citations

IPDN: Image-enhanced Prompt Decoding Network for 3D Referring Expression Segmentation

AAAI 2025
3
citations

Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering

CVPR 2025
1
citation

Dual Energy-Based Model with Open-World Uncertainty Estimation for Out-of-distribution Detection

CVPR 2025
1
citation

Closed-Loop Matters: Dual Regression Networks for Single Image Super-Resolution

CVPR 2020, arXiv
0
citations

Intelligent Home 3D: Automatic 3D-House Design From Linguistic Descriptions Only

CVPR 2020, arXiv
0
citations

Contrastive Neural Architecture Search With Neural Architecture Comparators

CVPR 2021, arXiv
0
citations

V2C: Visual Voice Cloning

CVPR 2022, arXiv
0
citations

Self-Supervised Image-Specific Prototype Exploration for Weakly Supervised Semantic Segmentation

CVPR 2022, arXiv
0
citations

Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval

ICCV 2023, arXiv
0
citations

Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots

ECCV 2020
0
citations

G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images

CVPR 2024
0
citations

Training-Free Class Purification for Open-Vocabulary Semantic Segmentation

ICCV 2025
0
citations

Bitrate-Controlled Diffusion for Disentangling Motion and Content in Video

ICCV 2025
0
citations

Scaling Tumor Segmentation: Best Lessons from Real and Synthetic Data

ICCV 2025
0
citations

OVG-HQ: Online Video Grounding with Hybrid-modal Queries

ICCV 2025
0
citations

Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding

ICCV 2025
0
citations

VQTalker: Towards Multilingual Talking Avatars Through Facial Motion Tokenization

AAAI 2025
0
citations

Enhancing Large Language Model Performance with Gradient-Based Parameter Selection

AAAI 2025
0
citations

Towards Generalizable Tumor Synthesis

CVPR 2024
0
citations

NAT: Neural Architecture Transformer for Accurate and Compact Architectures

NeurIPS 2019
0
citations

Every View Counts: Cross-View Consistency in 3D Object Detection with Hybrid-Cylindrical-Spherical Voxelization

NeurIPS 2020
0
citations

SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search

NeurIPS 2021
0
citations

PolarStream: Streaming Object Detection and Segmentation with Polar Pillars

NeurIPS 2021
0
citations

Learning Distinct and Representative Modes for Image Captioning

NeurIPS 2022
0
citations

A Neural Corpus Indexer for Document Retrieval

NeurIPS 2022
0
citations

Model-enhanced Vector Index

NeurIPS 2023
0
citations