Qi Zhang

Papers: 74

Total Citations: 279

Papers (74)

FINER: Flexible Spectral-bias Tuning in Implicit NEural Representation by Variable-periodic Activation Functions

Frequency Spectrum Is More Effective for Multimodal Representation and Fusion: A Multimodal Spectrum Rumor Detector

TSD-SR: One-Step Diffusion with Target Score Distillation for Real-World Image Super-Resolution

MindTuner: Cross-Subject Visual Decoding with Visual Fingerprint and Semantic Correction

Mani-GS: Gaussian Splatting Manipulation with Triangular Mesh

OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?

ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

Amplifier: Bringing Attention to Neglected Low-Energy Components in Time Series Forecasting

Autonomous Goal Detection and Cessation in Reinforcement Learning: A Case Study on Source Term Estimation

CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models

Beyond Interpretability: The Gains of Feature Monosemanticity on Model Robustness

Large-Scale Non-convex Stochastic Constrained Distributionally Robust Optimization

A Learning Error Analysis for Structured Prediction with Approximate Inference

EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving

Mitigating Ambiguities in 3D Classification with Gaussian Splatting

Boosting Vision Semantic Density with Anatomy Normality Modeling for Medical Vision-language Pre-training

Position-Aware Guided Point Cloud Completion with CLIP Model

Text Diffusion with Reinforced Conditioning

Wills Aligner: Multi-Subject Collaborative Brain Visual Decoding

View Transformation Robustness for Multi-View 3D Object Reconstruction with Reconstruction Error-Guided View Selection

Ray-Space Projection Model for Light Field Camera

Context-Aware Attention Network for Image-Text Retrieval

Cross-View Cross-Scene Multi-View Crowd Counting

FENeRF: Face Editing in Neural Radiance Fields

Hallucinated Neural Radiance Fields in the Wild

Deblur-NeRF: Neural Radiance Fields From Blurry Images

Fine-Grained Face Swapping via Regional GAN Inversion

Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields

DINER: Disorder-Invariant Implicit Neural Representation

Wide-Angle Rectification via Content-Aware Conformal Mapping

Local Implicit Ray Function for Generalizable Radiance Field Representation

Inverting the Imaging Process by Learning an Implicit Camera Model

UV Volumes for Real-Time Rendering of Editable Free-View Human Performance

VL-Match: Enhancing Vision-Language Pretraining with Token-Level and Instance-Level Matching

SLAN: Self-Locator Aided Network for Vision-Language Understanding

Calibration-Free Multi-View Crowd Counting

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation

Neural Color Operators for Sequential Image Retouching

Unifying Event Detection and Captioning as Sequence Generation via Pre-training

HDR-NeRF: High Dynamic Range Neural Radiance Fields

Generative Hard Example Augmentation for Semantic Point Cloud Segmentation

SU-RGS: Relightable 3D Gaussian Splatting from Sparse Views under Unconstrained Illuminations

BokehDiff: Neural Lens Blur with One-Step Diffusion

SEMPO: Lightweight Foundation Models for Time Series Forecasting

Alleviating Shifted Distribution in Human Preference Alignment through Meta-Learning

COSEE: Consistency-Oriented Signal-Based Early Exiting via Calibrated Sample Weighting Mechanism

A Pre-convolved Representation for Plug-and-Play Neural Illumination Fields

Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection

LLMEval: A Preliminary Study on How to Evaluate Large Language Models

GS-IR: 3D Gaussian Splatting for Inverse Rendering

HumanNorm: Learning Normal Diffusion Model for High-quality and Realistic 3D Human Generation

HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion

Look Ahead or Look Around? A Theoretical Comparison Between Autoregressive and Masked Pretraining

MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization

${\rm E}(3)$-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning

Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

4D Light Field Superpixel and Segmentation

Dynamic Feature Learning for Partial Face Recognition

Wide-Area Crowd Counting via Ground-Plane Density Maps and Multi-View Fusion CNNs

Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control

Succinct and Robust Multi-Agent Communication With Temporal Message Control

Spectral Temporal Graph Neural Network for Multivariate Time-series Forecasting

Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models

A Neural Corpus Indexer for Document Retrieval

How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders

A Comprehensive Study on Text-attributed Graphs: Benchmarking and Rethinking

Model-enhanced Vector Index

Identifiable Contrastive Learning with Automatic Feature Importance Discovery

MG-ViT: A Multi-Granularity Method for Compact and Efficient Vision Transformers

FourierGNN: Rethinking Multivariate Time Series Forecasting from a Pure Graph Perspective

Frequency-domain MLPs are More Effective Learners in Time Series Forecasting

$\ell_{1,p}$-Norm Regularization: Error Bounds and Convergence Rate Analysis of First-Order Methods