Jian Wang
32
Papers
388
Total Citations
Papers (32)
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery
CVPR 2024
236
citations
RobustSAM: Segment Anything Robustly on Degraded Images
CVPR 2024
35
citations
Cooper: Coordinating Specialized Agents towards a Complex Dialogue Goal
AAAI 2024 (arXiv)
27
citations
KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems
ICML 2025
26
citations
DSL-FIQA: Assessing Facial Image Quality via Dual-Set Degradation Learning and Landmark-Guided Transformer
CVPR 2024
19
citations
Robust Communicative Multi-Agent Reinforcement Learning with Active Defense
AAAI 2024 (arXiv)
8
citations
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
ICCV 2025 (arXiv)
7
citations
Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input
CVPR 2025
5
citations
POT: Prototypical Optimal Transport for Weakly Supervised Semantic Segmentation
CVPR 2025
5
citations
Delving Deep into Engagement Prediction of Short Videos
ECCV 2024
5
citations
EcoMatcher: Efficient Clustering Oriented Matcher for Detector-free Image Matching
ECCV 2024
4
citations
Discrete Curvature Graph Information Bottleneck
AAAI 2025
3
citations
SceneMI: Motion In-betweening for Modeling Human-Scene Interaction
ICCV 2025
3
citations
Bring Your Rear Cameras for Egocentric 3D Human Pose Estimation
ICCV 2025
2
citations
Style Quantization for Data-Efficient GAN Training
CVPR 2025
2
citations
FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video
CVPR 2025
1
citation
MS$^3$D: A RG Flow-Based Regularization for GAN Training with Limited Data
ICML 2024
0
citations
SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling
CVPR 2025
0
citations
Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation
ICCV 2025
0
citations
T2Bs: Text-to-Character Blendshapes via Video Generation
ICCV 2025
0
citations
TextMaster: A Unified Framework for Realistic Text Editing via Glyph-Style Dual-Control
ICCV 2025
0
citations
RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation
ICCV 2025
0
citations
Class Token as Proxy: Optimal Transport-assisted Proxy Learning for Weakly Supervised Semantic Segmentation
ICCV 2025
0
citations
Similar Modality Enhancement and Action Consistency Learning for Weakly Supervised Temporal Action Localization
AAAI 2025
0
citations
Federated Recommendation with Explicitly Encoding Item Bias
AAAI 2025
0
citations
3D Human Pose Perception from Egocentric Stereo Videos
CVPR 2024
0
citations
Towards Better Vision-Inspired Vision-Language Models
CVPR 2024
0
citations
EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams
CVPR 2024
0
citations
REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning
CVPR 2024
0
citations
Exponential Spectral Pursuit: An Effective Initialization Method for Sparse Phase Retrieval
ICML 2024
0
citations
Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers
ICML 2024
0
citations
KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception
CVPR 2025
0
citations