Si Liu

Papers: 20

Total Citations: 154

Papers (20)

VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Mixture Compressor for Mixture-of-Experts LLMs Gains More

Controllable Navigation Instruction Generation with Chain of Thought Prompting

UAV-Flow Colosseo: A Real-World Benchmark for Flying-on-a-Word UAV Imitation Learning

FlexDrive: Toward Trajectory Flexibility in Driving Scene Gaussian Splatting Reconstruction and Rendering

CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective

GaussianPainter: Painting Point Cloud into 3D Gaussians with Normal Guidance

Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training

SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection

EASE-DETR: Easing the Competition among Object Queries

Communication-Efficient Collaborative Perception via Information Filling with Codebook

LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding

Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection

Generative Map Priors for Collaborative BEV Semantic Segmentation

Revisiting Audio-Visual Segmentation with Vision-Centric Transformer

Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs

CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation

Video2BEV: Transforming Drone Videos to BEVs for Video-based Geo-localization

Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation