Yizhou Yu

Papers: 52

Total Citations: 67

Papers (52)

SparX: A Sparse Cross-Layer Connection Mechanism for Hierarchical Vision Mamba and Transformer Networks

SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation

OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation

Autoregressive Sequence Modeling for 3D Medical Image Representation

Vision Function Layer in Multimodal LLMs

Visual Saliency Based on Multiscale Deep Features

Deep Contrast Learning for Salient Object Detection

Borrowing Treasures From the Wealthy: Deep Transfer Learning Through Selective Joint Fine-Tuning

Instance-Level Salient Object Segmentation

Multi-Evidence Filtering and Fusion for Multi-Label Classification, Object Detection and Semantic Segmentation Based on Weakly Supervised Learning

Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up

Cross-Modal Relationship Inference for Grounding Referring Expressions

Multi-Source Weak Supervision for Saliency Detection

Cascaded Generative and Discriminative Learning for Microcalcification Detection in Breast Mammograms

Cross-View Correspondence Reasoning Based on Bipartite Graph Convolutional Network for Mammogram Mass Detection

Graph-Structured Referring Expression Reasoning in the Wild

I3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors

Scene-Intuitive Agent for Remote Embodied Visual Grounding

Bottom-Up Shift and Reasoning for Referring Image Segmentation

Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation

Refer-It-in-RGBD: A Bottom-Up Approach for 3D Visual Grounding in RGBD Images

Coarse-To-Fine Domain Adaptive Semantic Segmentation With Photometric Alignment and Category-Center Regularization

Attribute Surrogates Learning and Spectral Tokens Pooling in Transformers for Few-Shot Learning

Scale-Equivalent Distillation for Semi-Supervised Object Detection

Compound Domain Generalization via Meta-Knowledge Encoding

MISC210K: A Large-Scale Dataset for Multi-Instance Semantic Correspondence

Improved Distribution Matching for Dataset Condensation

Harvesting Discriminative Meta Objects With Deep CNN Features for Scene Classification

Piecewise Flat Embedding for Image Segmentation

HD-CNN: Hierarchical Deep Convolutional Neural Networks for Large Scale Visual Recognition

High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference

Dynamic Graph Attention for Referring Expression Comprehension

Motion Guided Attention for Video Salient Object Detection

Align, Attend and Locate: Chest X-Ray Diagnosis via Contrast Induced Attention Network With Limited Supervision

Multi-Scale Matching Networks for Semantic Correspondence

Preservational Learning Improves Self-Supervised Medical Image Models by Reconstructing Diverse Contexts

GraphFPN: Graph Feature Pyramid Network for Object Detection

Dual Bipartite Graph Learning: A General Approach for Domain Adaptive Object Detection

EGC: Image Generation and Classification via a Diffusion Energy-Based Model

Activate and Reject: Towards Safe Domain Generalization under Category Shift

Propagating Over Phrase Relations for One-Stage Visual Grounding

One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement

Neighborhood Collective Estimation for Noisy Label Identification and Correction

Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels

ME-PCN: Point Completion Conditioned on Mask Emptiness

OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels

Rethinking Discrete Tokens: Treating Them as Conditions for Continuous Autoregressive Image Synthesis

FedDiv: Collaborative Noise Filtering for Federated Learning with Noisy Labels

RegionGPT: Towards Region Understanding Vision Language Model

Transductive Zero-Shot Learning with Visual Structure Constraint

Mix and Reason: Reasoning over Semantic Topology with Data Mixing for Domain Generalization

CODA: Generalizing to Open and Unseen Domains with Compaction and Disambiguation