Ya Zhang

57 Papers
330 Total Citations

Papers (57)

Bottom-Up Temporal Action Localization with Mutual Regularization

ECCV 2020
209
citations

ReMamber: Referring Image Segmentation with Mamba Twister

ECCV 2024
49
citations

Audio-Visual Segmentation via Unlabeled Frame Exploitation

CVPR 2024
27
citations

Towards Universal Soccer Video Understanding

CVPR 2025
14
citations

Multi-Sentence Grounding for Long-term Instructional Video

ECCV 2024
12
citations

On Harmonizing Implicit Subpopulations

ICLR 2024
8
citations

Multi-modal Medical Diagnosis via Large-small Model Collaboration

CVPR 2025
4
citations

Learning to Instruct for Visual Instruction Tuning

NeurIPS 2025
3
citations

Fine-tuning with Reserved Majority for Noise Reduction

ICLR 2025
2
citations

Differential-informed Sample Selection Accelerates Multimodal Contrastive Learning

ICCV 2025
2
citations

Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters

ICML 2024
0
citations

Diversified Batch Selection for Training Acceleration

ICML 2024
0
citations

Part-Stacked CNN for Fine-Grained Visual Categorization

CVPR 2016
0
citations

Separating Style and Content for Generalized Style Transfer

CVPR 2018 (arXiv)
0
citations

Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition

CVPR 2019
0
citations

Dynamic Multiscale Graph Neural Networks for 3D Skeleton Based Human Motion Prediction

CVPR 2020 (arXiv)
0
citations

Iteratively-Refined Interactive 3D Medical Image Segmentation With Multi-Agent Reinforcement Learning

CVPR 2020 (arXiv)
0
citations

Collaborative Motion Prediction via Neural Motion Message Passing

CVPR 2020 (arXiv)
0
citations

A Fourier-Based Framework for Domain Generalization

CVPR 2021 (arXiv)
0
citations

LAR-SR: A Local Autoregressive Model for Image Super-Resolution

CVPR 2022
0
citations

GroupNet: Multiscale Hypergraph Neural Networks for Trajectory Prediction With Relational Reasoning

CVPR 2022 (arXiv)
0
citations

Task Decoupled Framework for Reference-Based Super-Resolution

CVPR 2022
0
citations

Distilling Vision-Language Pre-Training To Collaborate With Weakly-Supervised Temporal Action Localization

CVPR 2023 (arXiv)
0
citations

Controllable Mesh Generation Through Sparse Latent Point Diffusion Models

CVPR 2023 (arXiv)
0
citations

DR2: Diffusion-Based Robust Degradation Remover for Blind Face Restoration

CVPR 2023 (arXiv)
0
citations

Federated Domain Generalization With Generalization Adjustment

CVPR 2023
0
citations

Class-Balancing Diffusion Models

CVPR 2023 (arXiv)
0
citations

Augmenting Strong Supervision Using Web Data for Fine-Grained Categorization

ICCV 2015
0
citations

SORT: Second-Order Response Transform for Visual Recognition

ICCV 2017 (arXiv)
0
citations

Accelerate CNN via Recursive Bayesian Pruning

ICCV 2019
0
citations

CaT: Weakly Supervised Object Detection With Category Transfer

ICCV 2021 (arXiv)
0
citations

Divide and Conquer for Single-Frame Temporal Action Localization

ICCV 2021
0
citations

MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training for X-ray Diagnosis

ICCV 2023
0
citations

Joint-Relation Transformer for Multi-Person Motion Prediction

ICCV 2023 (arXiv)
0
citations

Open-vocabulary Object Segmentation with Diffusion Models

ICCV 2023 (arXiv)
0
citations

FTL: A universal framework for training low-bit DNNs via Feature Transfer

ECCV 2020
0
citations

Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction

ECCV 2022
0
citations

Registration Based Few-Shot Anomaly Detection

ECCV 2022
0
citations

Prompting Visual-Language Models for Efficient Video Understanding

ECCV 2022
0
citations

Enhanced Multimodal Representation Learning With Cross-Modal KD

CVPR 2023
0
citations

MRGen: Segmentation Data Engine For Underrepresented MRI Modalities

ICCV 2025
0
citations

RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis

NeurIPS 2025
0
citations

MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models

AAAI 2024
0
citations

Low-Rank Knowledge Decomposition for Medical Foundation Models

CVPR 2024
0
citations

Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning

CVPR 2024
0
citations

Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images

CVPR 2024
0
citations

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

ICML 2024
0
citations

Q-value Regularized Transformer for Offline Reinforcement Learning

ICML 2024
0
citations

Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization

ICML 2024
0
citations

Masking: A New Perspective of Noisy Supervision

NeurIPS 2018
0
citations

Graph Cross Networks with Vertex Infomax Pooling

NeurIPS 2020
0
citations

Collaborative Uncertainty in Multi-Agent Trajectory Forecasting

NeurIPS 2021
0
citations

AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation

NeurIPS 2023
0
citations

Combating Representation Learning Disparity with Geometric Harmonization

NeurIPS 2023
0
citations

Asynchrony-Robust Collaborative Perception via Bird's Eye View Flow

NeurIPS 2023
0
citations

Federated Learning with Bilateral Curation for Partially Class-Disjoint Data

NeurIPS 2023
0
citations

Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation

NeurIPS 2023
0
citations