Jiale Cao

Papers: 6

Total Citations: 137

Papers (6)

SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Glad: A Streaming Scene Generator for Autonomous Driving

Wavelet and Prototype Augmented Query-based Transformer for Pixel-level Surface Defect Detection

SSLFusion: Scale and Space Aligned Latent Fusion Model for Multimodal 3D Object Detection

CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation