Bin Wang

26
Papers
601
Total Citations

Papers (26)

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

CVPR 2024
365
citations

ToolACE: Winning the Points of LLM Function Calling

ICLR 2025
114
citations

LEGION: Learning to Ground and Explain for Synthetic Image Detection

ICCV 2025
32
citations

Generate Subgoal Images before Act: Unlocking the Chain-of-Thought Reasoning in Diffusion Model for Robot Manipulation with Multimodal Prompts

CVPR 2024
25
citations

Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark

ICLR 2024
18
citations

Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching

CVPR 2025
13
citations

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

NeurIPS 2025
8
citations

CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoor Object Detection from Multi-view Images

CVPR 2024
7
citations

Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding

CVPR 2025
6
citations

ROSE: Remove Objects with Side Effects in Videos

NeurIPS 2025
4
citations

A New Dataset and Framework for Real-World Blurred Images Super-Resolution

ECCV 2024
3
citations

Walk Wisely on Graph: Knowledge Graph Reasoning with Dual Agents via Efficient Guidance-Exploration

AAAI 2025
2
citations

LLM4RSR: Large Language Models as Data Correctors for Robust Sequential Recommendation

AAAI 2025
2
citations

Towards Ship License Plate Recognition in the Wild: A Large Benchmark and Strong Baseline

AAAI 2025
1
citation

Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology

AAAI 2025
1
citation

Distributed Bilevel Optimization with Communication Compression

ICML 2024
0
citations

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations

CVPR 2025
0
citations

Chimera: Improving Generalist Model with Domain-Specific Experts

ICCV 2025
0
citations

Beyond Single Images: Retrieval Self-Augmented Unsupervised Camouflaged Object Detection

ICCV 2025
0
citations

Hybrid Layout Control for Diffusion Transformer: Fewer Annotations, Superior Aesthetics

ICCV 2025
0
citations

OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

ICCV 2025
0
citations

Spatiotemporal-aware Trend-Seasonality Decomposition Network for Traffic Flow Forecasting

AAAI 2025
0
citations

Reverse Distribution Based Video Moment Retrieval for Effective Bias Elimination

AAAI 2025
0
citations

IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities

AAAI 2025
0
citations

W2P: Switching from Weak Supervision to Partial Supervision for Semantic Segmentation

AAAI 2024
0
citations

Shift the Lens: Environment-Aware Unsupervised Camouflaged Object Detection

CVPR 2025
0
citations