Jian Wang

32
Papers
388
Total Citations

Papers (32)

SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery

CVPR 2024
236
citations

RobustSAM: Segment Anything Robustly on Degraded Images

CVPR 2024
35
citations

Cooper: Coordinating Specialized Agents towards a Complex Dialogue Goal

AAAI 2024 (arXiv)
27
citations

KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems

ICML 2025
26
citations

DSL-FIQA: Assessing Facial Image Quality via Dual-Set Degradation Learning and Landmark-Guided Transformer

CVPR 2024
19
citations

Robust Communicative Multi-Agent Reinforcement Learning with Active Defense

AAAI 2024 (arXiv)
8
citations

Training-Free Text-Guided Image Editing with Visual Autoregressive Model

ICCV 2025 (arXiv)
7
citations

Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input

CVPR 2025
5
citations

POT: Prototypical Optimal Transport for Weakly Supervised Semantic Segmentation

CVPR 2025
5
citations

Delving Deep into Engagement Prediction of Short Videos

ECCV 2024
5
citations

EcoMatcher: Efficient Clustering Oriented Matcher for Detector-free Image Matching

ECCV 2024
4
citations

Discrete Curvature Graph Information Bottleneck

AAAI 2025
3
citations

SceneMI: Motion In-betweening for Modeling Human-Scene Interaction

ICCV 2025
3
citations

Bring Your Rear Cameras for Egocentric 3D Human Pose Estimation

ICCV 2025
2
citations

Style Quantization for Data-Efficient GAN Training

CVPR 2025
2
citations

FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video

CVPR 2025
1
citation

MS$^3$D: A RG Flow-Based Regularization for GAN Training with Limited Data

ICML 2024
0
citations

SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling

CVPR 2025
0
citations

Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation

ICCV 2025
0
citations

T2Bs: Text-to-Character Blendshapes via Video Generation

ICCV 2025
0
citations

TextMaster: A Unified Framework for Realistic Text Editing via Glyph-Style Dual-Control

ICCV 2025
0
citations

RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation

ICCV 2025
0
citations

Class Token as Proxy: Optimal Transport-assisted Proxy Learning for Weakly Supervised Semantic Segmentation

ICCV 2025
0
citations

Similar Modality Enhancement and Action Consistency Learning for Weakly Supervised Temporal Action Localization

AAAI 2025
0
citations

Federated Recommendation with Explicitly Encoding Item Bias

AAAI 2025
0
citations

3D Human Pose Perception from Egocentric Stereo Videos

CVPR 2024
0
citations

Towards Better Vision-Inspired Vision-Language Models

CVPR 2024
0
citations

EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams

CVPR 2024
0
citations

REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning

CVPR 2024
0
citations

Exponential Spectral Pursuit: An Effective Initialization Method for Sparse Phase Retrieval

ICML 2024
0
citations

Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers

ICML 2024
0
citations

KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception

CVPR 2025
0
citations