Linchao Zhu

43
Papers
76
Total Citations

Papers (43)

Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval

CVPR 2024
45
citations

VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing

ICLR 2025, arXiv
31
citations

From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment

ICCV 2025
0
citations

HUST: High-Fidelity Unbiased Skin Tone Estimation via Texture Quantization

ICCV 2025
0
citations

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

AAAI 2025
0
citations

DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval

AAAI 2024, arXiv
0
citations

Stitching Segments and Sentences towards Generalization in Video-Text Pre-training

AAAI 2024
0
citations

CapHuman: Capture Your Moments in Parallel Universes

CVPR 2024
0
citations

Few-Shot Object Recognition From Machine-Labeled Web Images

CVPR 2017, arXiv
0
citations

Bidirectional Multirate Reconstruction for Temporal Modeling in Videos

CVPR 2017, arXiv
0
citations

Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation

CVPR 2019
0
citations

Inflated Episodic Memory With Region Self-Attention for Long-Tailed Visual Recognition

CVPR 2020
0
citations

Semantic Correspondence as an Optimal Transport Problem

CVPR 2020
0
citations

Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration

CVPR 2020
0
citations

ActBERT: Learning Global-Local Video-Text Representations

CVPR 2020
0
citations

Gated Channel Transformation for Visual Recognition

CVPR 2020, arXiv
0
citations

Faster Meta Update Strategy for Noise-Robust Deep Learning

CVPR 2021, arXiv
0
citations

T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval

CVPR 2021, arXiv
0
citations

OpenMix: Reviving Known Knowledge for Discovering Novel Visual Categories in an Open World

CVPR 2021, arXiv
0
citations

Unified Transformer Tracker for Object Tracking

CVPR 2022, arXiv
0
citations

SEEG: Semantic Energized Co-Speech Gesture Generation

CVPR 2022
0
citations

A Simple Episodic Linear Probe Improves Visual Recognition in the Wild

CVPR 2022
0
citations

Compositional Temporal Grounding With Structured Variational Cross-Graph Correspondence Learning

CVPR 2022, arXiv
0
citations

Complex Video Action Reasoning via Learnable Markov Logic Network

CVPR 2022
0
citations

Efficient Multimodal Fusion via Interactive Prompting

CVPR 2023, arXiv
0
citations

PointListNet: Deep Learning on 3D Point Lists

CVPR 2023
0
citations

MIST: Multi-Modal Iterative Spatial-Temporal Transformer for Long-Form Video Question Answering

CVPR 2023, arXiv
0
citations

Auto-ReID: Searching for a Part-Aware ConvNet for Person Re-Identification

ICCV 2019
0
citations

Dual Attention Matching for Audio-Visual Event Localization

ICCV 2019
0
citations

Entangled Transformer for Image Captioning

ICCV 2019
0
citations

H3R: Hybrid Multi-view Correspondence for Generalizable 3D Reconstruction

ICCV 2025
0
citations

Universal-Prototype Enhancing for Few-Shot Object Detection

ICCV 2021, arXiv
0
citations

A Multi-Mode Modulator for Multi-Domain Few-Shot Classification

ICCV 2021
0
citations

Vector-Decomposed Disentanglement for Domain-Invariant Object Detection

ICCV 2021, arXiv
0
citations

Adaptive Hierarchical Graph Reasoning With Semantic Coherence for Video-and-Language Inference

ICCV 2021, arXiv
0
citations

MAAL: Multimodality-Aware Autoencoder-Based Affordance Learning for 3D Articulated Objects

ICCV 2023
0
citations

SF-Net: Single-Frame Supervision for Temporal Action Localization

ECCV 2020
0
citations

Motion-Excited Sampler: Video Adversarial Attack with Sparked Prior

ECCV 2020
0
citations

Learning to Transfer Learn: Reinforcement Learning-Based Selection for Adaptive Transfer Learning

ECCV 2020
0
citations

Interactive Prototype Learning for Egocentric Action Recognition

ICCV 2021
0
citations

MC-Bench: A Benchmark for Multi-Context Visual Grounding in the Era of MLLMs

ICCV 2025
0
citations

Connective Cognition Network for Directional Visual Commonsense Reasoning

NeurIPS 2019
0
citations

Fine-Grained Semantically Aligned Vision-Language Pre-Training

NeurIPS 2022
0
citations