Yulin Wang

24 Papers
280 Total Citations

Papers (24)

MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

CVPR 2021, arXiv
191
citations

Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis

CVPR 2024, arXiv
28
citations

XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?

CVPR 2025, arXiv
24
citations

AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation

ECCV 2024, arXiv
15
citations

GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution

NeurIPS 2025, arXiv
11
citations

RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing

NeurIPS 2025, arXiv
10
citations

IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance

ICCV 2025, arXiv
1
citations

Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment

CVPR 2025, arXiv
0
citations

EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance

CVPR 2025, arXiv
0
citations

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

CVPR 2025
0
citations

HccePose (BF): Predicting Front & Back Surfaces to Construct Ultra-Dense 2D-3D Correspondences for Pose Estimation

ICCV 2025
0
citations

LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching

ICCV 2025, arXiv
0
citations

CondenseNet V2: Sparse Feature Reactivation for Deep Networks

CVPR 2021, arXiv
0
citations

Transferable Semantic Augmentation for Domain Adaptation

CVPR 2021, arXiv
0
citations

AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition

CVPR 2022, arXiv
0
citations

Adaptive Focus for Efficient Video Recognition

ICCV 2021, arXiv
0
citations

Dynamic Perceiver for Efficient Visual Recognition

ICCV 2023, arXiv
0
citations

Adaptive Rotated Convolution for Rotated Object Detection

ICCV 2023, arXiv
0
citations

EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones

ICCV 2023, arXiv
0
citations

Deep Incubation: Training Large Models by Divide-and-Conquering

ICCV 2023, arXiv
0
citations

Borrowing Knowledge From Pre-trained Language Model: A New Data-efficient Visual Learning Paradigm

ICCV 2023
0
citations

AdaFocusV3: On Unified Spatial-Temporal Dynamic Video Recognition

ECCV 2022
0
citations

Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification

NeurIPS 2020, arXiv
0
citations

Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition

NeurIPS 2021, arXiv
0
citations