Qi Wu
25 Papers
500 Total Citations
1 Affiliation
Affiliations
Carnegie Mellon University
Papers (25)
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
AAAI 2024 (arXiv)
276
citations
Context-I2W: Mapping Images to Context-Dependent Words for Accurate Zero-Shot Composed Image Retrieval
AAAI 2024 (arXiv)
57
citations
3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting
CVPR 2025
51
citations
Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning
CVPR 2024
42
citations
Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs
ICLR 2025 (arXiv)
30
citations
WebVLN: Vision-and-Language Navigation on Websites
AAAI 2024 (arXiv)
19
citations
PairAug: What Can Augmented Image-Text Pairs Do for Radiology?
CVPR 2024
12
citations
General Scene Adaptation for Vision-and-Language Navigation
ICLR 2025 (arXiv)
10
citations
Invariant Random Forest: Tree-Based Model Solution for OOD Generalization
AAAI 2024 (arXiv)
3
citations
Augmented Commonsense Knowledge for Remote Object Grounding
AAAI 2024 (arXiv)
0
citations
The Causal Impact of Credit Lines on Spending Distributions
AAAI 2024
0
citations
Sparse Bayesian Deep Learning for Cross Domain Medical Image Reconstruction
AAAI 2024
0
citations
KPA-Tracker: Towards Robust and Real-Time Category-Level Articulated Object 6D Pose Tracking
AAAI 2024
0
citations
G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images
CVPR 2024
0
citations
Dynamic Inertial Poser (DynaIP): Part-Based Motion Dynamics Learning for Enhanced Human Pose Estimation with Sparse Inertial Sensors
CVPR 2024
0
citations
Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework
CVPR 2024
0
citations
ModaVerse: Efficiently Transforming Modalities with LLMs
CVPR 2024
0
citations
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval
CVPR 2025
0
citations
Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval
CVPR 2025
0
citations
EnvPoser: Environment-aware Realistic Human Motion Estimation from Sparse Observations with Uncertainty Modeling
CVPR 2025
0
citations
SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts
ICCV 2025
0
citations
COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation
ICCV 2025
0
citations
MFL-Owner: Ownership Protection for Multi-modal Federated Learning via Orthogonal Transform Watermark
AAAI 2025
0
citations
Realistic Noise Synthesis with Diffusion Models
AAAI 2025
0
citations
Distributionally Robust Policy Evaluation and Learning for Continuous Treatment with Observational Data
AAAI 2025
0
citations