Jing Zhang

Papers: 109

Total Citations: 1,784

Affiliations: 2

Affiliations

Hefei University of Technology

Gent University-imec

Papers (109)

T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion

SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation

A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness Constraint

CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos

XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?

SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection

Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering

Question Calibration and Multi-Hop Modeling for Temporal Question Answering

IMPUS: Image Morphing with Perceptually-Uniform Sampling Using Diffusion Models

Decomposing Semantic Shifts for Composed Image Retrieval

LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images

RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing

CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward

Probability Density Geodesics in Image Diffusion Latent Space

MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights

Adversarial Exploitation of Data Diversity Improves Visual Localization

MOCID: Motion Context and Displacement Information Learning for Moving Infrared Small Target Detection

Highly Imperceptible Black-Box Graph Injection Attacks with Reinforcement Learning

Multi-Modality Affinity Inference for Weakly Supervised 3D Semantic Segmentation

Data-Free Generalized Zero-Shot Learning

Adversarial Purification with the Manifold Hypothesis

Quantum-Inspired Neural Network with Runge-Kutta Method

LaViP: Language-Grounded Visual Prompting

ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models

SVGDreamer: Text Guided SVG Generation with Diffusion Model

UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

OxyGenerator: Reconstructing Global Ocean Deoxygenation Over a Century with Deep Learning

Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming

Joint Geometrical and Statistical Alignment for Visual Domain Adaptation

Fast Haze Removal for Nighttime Image Using Maximum Reflectance Prior

Importance Weighted Adversarial Nets for Partial Domain Adaptation

Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective

MirrorGAN: Learning Text-To-Image Generation by Redescription

Few-Shot Learning via Saliency-Guided Hallucination of Samples

ShieldNets: Defending Against Adversarial Attacks Using Probabilistic Adversarial Robustness

UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders

Deep Degradation Prior for Low-Quality Image Classification

Weakly-Supervised Salient Object Detection via Scribble Annotations

Simultaneously Localize, Segment and Rank the Camouflaged Objects

Weakly Supervised Video Salient Object Detection

Uncertainty-Aware Joint Salient Object and Camouflaged Object Detection

3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds

DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers

GMFlow: Learning Optical Flow via Global Matching

Recurrent Glimpse-Based Decoder for Detection With Transformer

Learning Affordance Grounding From Exocentric Images

ISNet: Shape Matters for Infrared Small Target Detection

RU-Net: Regularized Unrolling Network for Scene Graph Generation

FIBA: Frequency-Injection Based Backdoor Attack in Medical Image Analysis

Dynamic Focus-Aware Positional Queries for Semantic Segmentation

Explicit Boundary Guided Semi-Push-Pull Contrastive Learning for Supervised Anomaly Detection

Leverage Interactive Affinity for Affordance Learning

Modeling the Distributional Uncertainty for Salient Object Detection Models

CLAMP: Prompt-Based Contrastive Learning for Connecting Language and Animal Pose

DeepSolo: Let Transformer Decoder With Explicit Points Solo for Text Spotting

Decoupling Learning and Remembering: A Bilevel Memory Framework With Knowledge Projection for Task-Incremental Learning

Referring Image Matting

Deep Multiple-Attribute-Perceived Network for Real-World Texture Recognition

Out-of-Boundary View Synthesis Towards Full-Frame Video Stabilization

RGB-D Saliency Detection via Cascaded Mutual Information Minimization

LPFF: A Portrait Dataset for Face Generators Across Large Poses

Domain Specified Optimization for Deployment Authorization

Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning

RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation

Multimodal Variational Auto-encoder based Audio-Visual Segmentation

ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution

Model Calibration in Dense Classification with Adaptive Label Perturbation

Learning Noise-Aware Encoder-Decoder from Noisy Labels by Alternating Back-Propagation for Saliency Detection

MeshMAE: Masked Autoencoders for 3D Mesh Data Analysis

Towards Data-Efficient Detection Transformers

ReAct: Temporal Action Detection with Relational Queries

FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs

VSA: Learning Varied-Size Window Attention in Vision Transformers

PolyphonicFormer: Unified Query Learning for Depth-Aware Video Panoptic Segmentation

Improving RGB-D Point Cloud Registration by Learning Multi-Scale Local Linear Transformation

RegionCL: Exploring Contrastive Region Pairs for Self-Supervised Representation Learning

BMD: A General Class-Balanced Multicentric Dynamic Prototype Strategy for Source-Free Domain Adaptation

Audio-Visual Segmentation

Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics

JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes

P2C: Self-Supervised Point Cloud Completion from Single Partial Clouds

CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction

SAIST: Segment Any Infrared Small Target Model Guided by Contrastive Language-Image Pretraining

Empowering LLMs to Understand and Generate Complex Vector Graphics

Brain-Inspired Spiking Neural Networks for Energy-Efficient Object Detection

Identifying and Mitigating Position Bias of Multi-image Vision-Language Models

GARF: Learning Generalizable 3D Reassembly for Real-World Fractures

Harnessing Massive Satellite Imagery with Efficient Masked Image Modeling

Synergistic Prompting for Robust Visual Recognition with Missing Modalities

What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?

Rethink Sparse Signals for Pose-guided Text-to-image Generation

ATCTrack: Aligning Target-Context Cues with Dynamic Target States for Robust Vision-Language Tracking

Patch-level Sounding Object Tracking for Audio-Visual Question Answering

Multi-axis Prompt and Multi-dimension Fusion Network for All-in-one Weather-degraded Image Restoration

UAWTrack: Universal 3D Single Object Tracking in Adverse Weather

Semi-supervised Infrared Small Target Detection with Thermodynamic-Inspired Uneven Perturbation and Confidence Adaptation

Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation

Learn, Imagine and Create: Text-to-Image Generation from Prior Knowledge

Auto Learning Attention

Learning Generative Vision Transformer with Energy-Based Latent Space for Saliency Prediction

ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias

Watermarking for Out-of-distribution Detection

Exploring Figure-Ground Assignment Mechanism in Perceptual Organization

APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking

SCL-WC: Cross-Slide Contrastive Learning for Weakly-Supervised Whole-Slide Image Classification

ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation

Constrained Policy Optimization with Explicit Behavior Density For Offline Reinforcement Learning

SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model

DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models