Si Liu
66
Papers
323
Total Citations
Papers (66)
Matching-CNN Meets KNN: Quasi-Parametric Human Parsing
CVPR 2015
168
citations
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
CVPR 2025
54
citations
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis
ECCV 2024 (arXiv)
51
citations
Mixture Compressor for Mixture-of-Experts LLMs Gains More
ICLR 2025
23
citations
Controllable Navigation Instruction Generation with Chain of Thought Prompting
ECCV 2024
16
citations
UAV-Flow Colosseo: A Real-World Benchmark for Flying-on-a-Word UAV Imitation Learning
NeurIPS 2025
8
citations
FlexDrive: Toward Trajectory Flexibility in Driving Scene Gaussian Splatting Reconstruction and Rendering
CVPR 2025
2
citations
CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective
ICCV 2025
1
citations
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training
CVPR 2024
0
citations
SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection
CVPR 2024
0
citations
EASE-DETR: Easing the Competition among Object Queries
CVPR 2024
0
citations
Communication-Efficient Collaborative Perception via Information Filling with Codebook
CVPR 2024
0
citations
Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection
CVPR 2024
0
citations
Structural Sparse Tracking
CVPR 2015
0
citations
Diversity-Induced Multi-View Subspace Clustering
CVPR 2015
0
citations
SketchNet: Sketch Classification With Web Images
CVPR 2016
0
citations
Structural Correlation Filter for Robust Visual Tracking
CVPR 2016
0
citations
Surveillance Video Parsing With Single Frame Supervision
CVPR 2017 (arXiv)
0
citations
Learning Adaptive Receptive Fields for Deep Image Parsing Network
CVPR 2017
0
citations
Building Detail-Sensitive Semantic Segmentation Networks With Polynomial Pooling
CVPR 2019
0
citations
PPDM: Parallel Point Detection and Matching for Real-Time Human-Object Interaction Detection
CVPR 2020 (arXiv)
0
citations
AdversarialNAS: Adversarial Neural Architecture Search for GANs
CVPR 2020 (arXiv)
0
citations
PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer
CVPR 2020 (arXiv)
0
citations
Referring Image Segmentation via Cross-Modal Progressive Comprehension
CVPR 2020 (arXiv)
0
citations
A Real-Time Cross-Modality Correlation Filtering Method for Referring Expression Comprehension
CVPR 2020 (arXiv)
0
citations
Reformulating HOI Detection As Adaptive Set Prediction
CVPR 2021 (arXiv)
0
citations
Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression
CVPR 2021
0
citations
General Instance Distillation for Object Detection
CVPR 2021 (arXiv)
0
citations
Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation
CVPR 2021 (arXiv)
0
citations
Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
CVPR 2021 (arXiv)
0
citations
Distribution-Aware Single-Stage Models for Multi-Person 3D Pose Estimation
CVPR 2022 (arXiv)
0
citations
Reinforced Structured State-Evolution for Vision-Language Navigation
CVPR 2022 (arXiv)
0
citations
GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection
CVPR 2022
0
citations
Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
CVPR 2022
0
citations
3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection
CVPR 2022
0
citations
Boosting Verified Training for Robust Image Classifications via Abstraction
CVPR 2023 (arXiv)
0
citations
Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection
CVPR 2023 (arXiv)
0
citations
Bridging Search Region Interaction With Template for RGB-T Tracking
CVPR 2023
0
citations
Adaptive Zone-Aware Hierarchical Planner for Vision-Language Navigation
CVPR 2023
0
citations
Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels
CVPR 2023 (arXiv)
0
citations
DETR With Additional Global Aggregation for Cross-Domain Weakly Supervised Object Detection
CVPR 2023 (arXiv)
0
citations
LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
CVPR 2025
0
citations
Towards Computational Baby Learning: A Weakly-Supervised Approach for Object Detection
ICCV 2015
0
citations
Human Parsing With Contextualized Convolutional Neural Network
ICCV 2015
0
citations
Low-Rank Tensor Constrained Multiview Subspace Clustering
ICCV 2015
0
citations
RGB-Infrared Cross-Modality Person Re-Identification via Joint Pixel and Feature Alignment
ICCV 2019
0
citations
Language-Guided Global Image Editing via Cross-Modal Cyclic Mechanism
ICCV 2021
0
citations
Omnidirectional Information Gathering for Knowledge Transfer-Based Audio-Visual Navigation
ICCV 2023 (arXiv)
0
citations
Video Background Music Generation: Dataset, Method and Evaluation
ICCV 2023 (arXiv)
0
citations
Object as Query: Lifting Any 2D Object Detector to 3D Detection
ICCV 2023 (arXiv)
0
citations
Optimizing the Placement of Roadside LiDARs for Autonomous Driving
ICCV 2023
0
citations
Linguistic Structure Guided Context Modeling for Referring Image Segmentation
ECCV 2020
0
citations
PoseTrans: A Simple yet Effective Pose Transformation Augmentation for Human Pose Estimation
ECCV 2022
0
citations
HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors
ECCV 2022
0
citations
Anchor3DLane: Learning To Regress 3D Anchors for Monocular 3D Lane Detection
CVPR 2023 (arXiv)
0
citations
Generative Map Priors for Collaborative BEV Semantic Segmentation
CVPR 2025
0
citations
Revisiting Audio-Visual Segmentation with Vision-Centric Transformer
CVPR 2025
0
citations
Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs
ICCV 2025
0
citations
CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation
ICCV 2025
0
citations
Video2BEV: Transforming Drone Videos to BEVs for Video-based Geo-localization
ICCV 2025
0
citations
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation
AAAI 2025
0
citations
GaussianPainter: Painting Point Cloud into 3D Gaussians with Normal Guidance
AAAI 2025
0
citations
Mining the Benefits of Two-stage and One-stage HOI Detection
NeurIPS 2021
0
citations
Boosting Verification of Deep Reinforcement Learning via Piece-Wise Linear Decision Neural Networks
NeurIPS 2023
0
citations
MARBLE: Music Audio Representation Benchmark for Universal Evaluation
NeurIPS 2023
0
citations
Open Category Detection with PAC Guarantees
ICML 2018
0
citations