Guanbin Li

33 Papers

131 Total Citations

Papers (33)

AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning

NeRF-HuGS: Improved Neural Radiance Fields in Non-static Scenes Using Heuristics-Guided Segmentation

Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection

OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation

Cell Graph Transformer for Nuclei Classification

Rethinking Query-based Transformer for Continual Image Segmentation

GeoSplatting: Towards Geometry Guided Gaussian Splatting for Physically-based Inverse Rendering

DreamFuse: Adaptive Image Fusion with Diffusion Transformer

DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model

UniCell: Universal Cell Nucleus Classification via Prompt Learning

Empowering Large Language Models with 3D Situation Awareness

Hierarchically Controlled Deformable 3D Gaussians for Talking Head Synthesis

Sim-DETR: Unlock DETR for Temporal Sentence Grounding

Bridging Knowledge Gap Between Image Inpainting and Large-Area Visible Watermark Removal

DeepShield: Fortifying Deepfake Video Detection with Local and Global Forgery Analysis

FakeRadar: Probing Forgery Outliers to Detect Unknown Deepfake Videos

Free-MoRef: Instantly Multiplexing Context Perception Capabilities of Video-MLLMs within Single Inference

Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection

LLM-driven Multimodal and Multi-Identity Listening Head Generation

DAGSM: Disentangled Avatar Generation with GS-enhanced Mesh

DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering

Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method

AdaDrive: Self-Adaptive Slow-Fast System for Language-Grounded Autonomous Driving

VLDrive: Vision-Augmented Lightweight MLLMs for Efficient Language-grounded Autonomous Driving

Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering

GlassWizard: Harvesting Diffusion Priors for Glass Surface Detection

LaneDiffusion: Improving Centerline Graph Learning via Prior Injected BEV Feature Generation

FedDiv: Collaborative Noise Filtering for Federated Learning with Noisy Labels

Variance-Insensitive and Target-Preserving Mask Refinement for Interactive Image Segmentation

Removing Interference and Recovering Content Imaginatively for Visible Watermark Removal

Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training

Open-Vocabulary Segmentation with Semantic-Assisted Calibration

VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction