Ming-Ming Cheng

Papers: 86

Total Citations: 633

Papers (86)

Deep Hough Transform for Semantic Line Detection

Highly Efficient Salient Object Detection with 100K Parameters

DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation

Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation

VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning

TeMO: Towards Text-Driven 3D Stylization for Multi-Object Meshes

Fine-Grained Knowledge Selection and Restoration for Non-exemplar Class Incremental Learning

From Words to Worth: Newborn Article Impact Prediction with LLM

Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning

DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data

TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs

Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction

KAC: Kolmogorov-Arnold Classifier for Continual Learning

Towards RAW Object Detection in Diverse Conditions

Re-Aligning Language to Visual Objects with an Agentic Workflow

DISTA-Net: Dynamic Closely-Spaced Infrared Small Target Unmixing

Deeply Supervised Salient Object Detection With Short Connections

GMS: Grid-based Motion Statistics for Fast, Ultra-Robust Feature Correspondence

Revisiting Video Saliency: A Large-Scale Benchmark and a New Model

Crowd Counting With Deep Negative Correlation Learning

RegularFace: Deep Face Recognition via Exclusive Regularization

Multi-Level Context Ultra-Aggregation for Stereo Matching

A Simple Pooling-Based Design for Real-Time Salient Object Detection

Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection

An Iterative and Cooperative Top-Down and Bottom-Up Inference Network for Salient Object Detection

S4Net: Single Stage Salient-Instance Segmentation

Shifting More Attention to Video Salient Object Detection

IP102: A Large-Scale Benchmark Dataset for Insect Pest Recognition

Camouflaged Object Detection

Taking a Deeper Look at Co-Salient Object Detection

Rethinking Computer-Aided Tuberculosis Diagnosis

Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

VecRoad: Point-Based Iterative Graph Exploration for Road Graphs Extraction

Interactive Image Segmentation With First Click Attention

Improving Convolutional Networks With Self-Calibrated Convolutions

DOTS: Decoupling Operation and Topology in Differentiable Architecture Search

Temporal Modulation Network for Controllable Space-Time Video Super-Resolution

Representative Batch Normalization With Feature Calibration

Global2Local: Efficient Structure Search for Video Action Segmentation

FocusCut: Diving Into a Focus View in Interactive Segmentation

Representation Compensation Networks for Continual Semantic Segmentation

Towards an End-to-End Framework for Flow-Guided Video Inpainting

Localization Distillation for Dense Object Detection

Multi-Space Neural Radiance Fields

Endpoints Weight Fusion for Class Incremental Semantic Segmentation

Looking Through the Glass: Neural Surface Reconstruction Against High Specular Reflections

AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation

Structure-Measure: A New Way to Evaluate Foreground Maps

Zero-Shot Emotion Recognition via Affective Structural Embedding

Integral Object Mining via Online Attention Accumulation

Scoot: A Perceptual Metric for Facial Sketches

DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation

Optimizing the F-Measure for Threshold-Free Salient Object Detection

Image Inpainting With Learnable Bidirectional Attention Maps

Joint Acne Image Grading and Counting via Label Distribution Learning

Personalized Image Semantic Segmentation

iNAS: Integral NAS for Device-Aware Salient Object Detection

Masked Autoencoders are Efficient Class Incremental Learners

Masked Diffusion Transformer is a Strong Image Synthesizer

SLAN: Self-Locator Aided Network for Vision-Language Understanding

Large Selective Kernel Network for Remote Sensing Object Detection

SRFormer: Permuted Self-Attention for Single Image Super-Resolution

Gradient-Induced Co-Saliency Detection

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder

Restore Globally, Refine Locally: A Mask-Guided Scheme to Accelerate Super-Resolution Networks

Long-Tailed Class Incremental Learning

EGNet: Edge Guidance Network for Salient Object Detection

GET: Unlocking the Multi-modal Potential of CLIP for Generalized Category Discovery

RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark

TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction

Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing

Revisiting Efficient Semantic Segmentation: Learning Offsets for Better Spatial and Class Feature Alignment

Advancing Textual Prompt Learning with Anchored Attributes

AR-1-to-3: Single Image to Consistent 3D Object via Next-View Prediction

Knowledge Graph Enhanced Generative Multi-modal Models for Class-Incremental Learning

CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation

CrossKD: Cross-Head Knowledge Distillation for Object Detection

Traffic Scene Parsing through the TSP6K Dataset

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

Generative Multi-modal Models are Good Class Incremental Learners

Object Region Mining With Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach

Richer Convolutional Features for Edge Detection

Self-Erasing Network for Integral Object Attention

Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video

ICNet: Intra-saliency Correlation Network for Co-Saliency Detection

SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation