Bin Wang
26 Papers
601 Total Citations
Papers (26)
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
CVPR 2024
365
citations
ToolACE: Winning the Points of LLM Function Calling
ICLR 2025
114
citations
LEGION: Learning to Ground and Explain for Synthetic Image Detection
ICCV 2025
32
citations
Generate Subgoal Images before Act: Unlocking the Chain-of-Thought Reasoning in Diffusion Model for Robot Manipulation with Multimodal Prompts
CVPR 2024
25
citations
Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark
ICLR 2024
18
citations
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching
CVPR 2025
13
citations
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
NeurIPS 2025
8
citations
CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoor Object Detection from Multi-view Images
CVPR 2024
7
citations
Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding
CVPR 2025
6
citations
ROSE: Remove Objects with Side Effects in Videos
NeurIPS 2025
4
citations
A New Dataset and Framework for Real-World Blurred Images Super-Resolution
ECCV 2024
3
citations
Walk Wisely on Graph: Knowledge Graph Reasoning with Dual Agents via Efficient Guidance-Exploration
AAAI 2025
2
citations
LLM4RSR: Large Language Models as Data Correctors for Robust Sequential Recommendation
AAAI 2025
2
citations
Towards Ship License Plate Recognition in the Wild: A Large Benchmark and Strong Baseline
AAAI 2025
1
citations
Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology
AAAI 2025
1
citations
Distributed Bilevel Optimization with Communication Compression
ICML 2024
0
citations
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
CVPR 2025
0
citations
Chimera: Improving Generalist Model with Domain-Specific Experts
ICCV 2025
0
citations
Beyond Single Images: Retrieval Self-Augmented Unsupervised Camouflaged Object Detection
ICCV 2025
0
citations
Hybrid Layout Control for Diffusion Transformer: Fewer Annotations, Superior Aesthetics
ICCV 2025
0
citations
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
ICCV 2025
0
citations
Spatiotemporal-aware Trend-Seasonality Decomposition Network for Traffic Flow Forecasting
AAAI 2025
0
citations
Reverse Distribution Based Video Moment Retrieval for Effective Bias Elimination
AAAI 2025
0
citations
IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities
AAAI 2025
0
citations
W2P: Switching from Weak Supervision to Partial Supervision for Semantic Segmentation
AAAI 2024
0
citations
Shift the Lens: Environment-Aware Unsupervised Camouflaged Object Detection
CVPR 2025
0
citations