Jing Zhang

Papers: 43

Total Citations: 1,784

Affiliations: 2

Affiliations

Hefei University of Technology

Gent University-imec

Papers (43)

T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion

SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation

A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness Constraint

CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos

SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection

XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?

Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering

Question Calibration and Multi-Hop Modeling for Temporal Question Answering

IMPUS: Image Morphing with Perceptually-Uniform Sampling Using Diffusion Models

Decomposing Semantic Shifts for Composed Image Retrieval

RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing

LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images

CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward

Probability Density Geodesics in Image Diffusion Latent Space

MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights

Adversarial Exploitation of Data Diversity Improves Visual Localization

MOCID: Motion Context and Displacement Information Learning for Moving Infrared Small Target Detection

Highly Imperceptible Black-Box Graph Injection Attacks with Reinforcement Learning

Multi-Modality Affinity Inference for Weakly Supervised 3D Semantic Segmentation

Data-Free Generalized Zero-Shot Learning

Adversarial Purification with the Manifold Hypothesis

Quantum-Inspired Neural Network with Runge-Kutta Method

LaViP: Language-Grounded Visual Prompting

ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models

SVGDreamer: Text Guided SVG Generation with Diffusion Model

UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

OxyGenerator: Reconstructing Global Ocean Deoxygenation Over a Century with Deep Learning

Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming

CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction

SAIST: Segment Any Infrared Small Target Model Guided by Contrastive Language-Image Pretraining

Empowering LLMs to Understand and Generate Complex Vector Graphics

Brain-Inspired Spiking Neural Networks for Energy-Efficient Object Detection

Identifying and Mitigating Position Bias of Multi-image Vision-Language Models

GARF: Learning Generalizable 3D Reassembly for Real-World Fractures

Harnessing Massive Satellite Imagery with Efficient Masked Image Modeling

Synergistic Prompting for Robust Visual Recognition with Missing Modalities

What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?

Rethink Sparse Signals for Pose-guided Text-to-image Generation

ATCTrack: Aligning Target-Context Cues with Dynamic Target States for Robust Vision-Language Tracking

Patch-level Sounding Object Tracking for Audio-Visual Question Answering

Multi-axis Prompt and Multi-dimension Fusion Network for All-in-one Weather-degraded Image Restoration

UAWTrack: Universal 3D Single Object Tracking in Adverse Weather

Semi-supervised Infrared Small Target Detection with Thermodynamic-Inspired Uneven Perturbation and Confidence Adaptation