Bin Wang
47
Papers
599
Total Citations
Papers (47)
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
CVPR 2024
365
citations
ToolACE: Winning the Points of LLM Function Calling
ICLR 2025
114
citations
LEGION: Learning to Ground and Explain for Synthetic Image Detection
ICCV 2025
32
citations
Generate Subgoal Images before Act: Unlocking the Chain-of-Thought Reasoning in Diffusion Model for Robot Manipulation with Multimodal Prompts
CVPR 2024
25
citations
Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark
ICLR 2024
18
citations
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching
CVPR 2025
11
citations
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
NeurIPS 2025
8
citations
CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoor Object Detection from Multi-view Images
CVPR 2024
7
citations
Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding
CVPR 2025
6
citations
ROSE: Remove Objects with Side Effects in Videos
NeurIPS 2025
4
citations
A New Dataset and Framework for Real-World Blurred Images Super-Resolution
ECCV 2024
3
citations
Walk Wisely on Graph: Knowledge Graph Reasoning with Dual Agents via Efficient Guidance-Exploration
AAAI 2025
2
citations
LLM4RSR: Large Language Models as Data Correctors for Robust Sequential Recommendation
AAAI 2025
2
citations
Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology
AAAI 2025
1
citations
Towards Ship License Plate Recognition in the Wild: A Large Benchmark and Strong Baseline
AAAI 2025
1
citations
Automatic Thumbnail Generation Based on Visual Representativeness and Foreground Recognizability
ICCV 2015
0
citations
Multi-Stage Multi-Recursive-Input Fully Convolutional Networks for Neuronal Boundary Detection
ICCV 2017
0
citations
LEA2: A Lightweight Ensemble Adversarial Attack via Non-overlapping Vulnerable Frequency Regions
ICCV 2023
0
citations
Few-Shot Physically-Aware Articulated Mesh Generation via Hierarchical Deformation
ICCV 2023
0
citations
Reconstructed Convolution Module Based Look-Up Tables for Efficient Image Super-Resolution
ICCV 2023
0
citations
Fan-Beam Binarization Difference Projection (FB-BDP): A Novel Local Object Descriptor for Fine-Grained Leaf Image Retrieval
ICCV 2023
0
citations
V3Det: Vast Vocabulary Visual Detection Dataset
ICCV 2023
0
citations
A Novel Line Integral Transform for 2D Affine-Invariant Shape Retrieval
ECCV 2020
0
citations
Robust Network Architecture Search via Feature Distortion Restraining
ECCV 2022
0
citations
GeoAug: Data Augmentation for Few-Shot NeRF with Geometry Constraints
ECCV 2022
0
citations
Filter Pruning via Feature Discrimination in Deep Neural Networks
ECCV 2022
0
citations
Beyond Single Images: Retrieval Self-Augmented Unsupervised Camouflaged Object Detection
ICCV 2025
0
citations
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
CVPR 2025
0
citations
Chimera: Improving Generalist Model with Domain-Specific Experts
ICCV 2025
0
citations
Shift the Lens: Environment-Aware Unsupervised Camouflaged Object Detection
CVPR 2025
0
citations
Hybrid Layout Control for Diffusion Transformer: Fewer Annotations, Superior Aesthetics
ICCV 2025
0
citations
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
ICCV 2025
0
citations
Spatiotemporal-aware Trend-Seasonality Decomposition Network for Traffic Flow Forecasting
AAAI 2025
0
citations
Reverse Distribution Based Video Moment Retrieval for Effective Bias Elimination
AAAI 2025
0
citations
IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities
AAAI 2025
0
citations
W2P: Switching from Weak Supervision to Partial Supervision for Semantic Segmentation
AAAI 2024
0
citations
Distributed Bilevel Optimization with Communication Compression
ICML 2024
0
citations
Can Walking and Measuring Along Chord Bunches Better Describe Leaf Shapes?
CVPR 2017
0
citations
Graph Structured Network for Image-Text Matching
CVPR 2020
0
citations
Self-Supervised Video Representation Learning by Context and Motion Decoupling
CVPR 2021
0
citations
BCOT: A Markerless High-Precision 3D Object Tracking Benchmark
CVPR 2022
0
citations
Graph Geometry Interaction Learning
NeurIPS 2020
0
citations
Model-Based Reinforcement Learning via Imagination with Derived Memory
NeurIPS 2021
0
citations
DOMINO: Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning
NeurIPS 2022
0
citations
Theoretically Guaranteed Bidirectional Data Rectification for Robust Sequential Recommendation
NeurIPS 2023
0
citations
Design from Policies: Conservative Test-Time Adaptation for Offline Policy Optimization
NeurIPS 2023
0
citations
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought
NeurIPS 2023
0
citations