Jing Wang

Papers: 31

Total Citations: 107

Papers (31)

WISA: World simulator assistant for physics-aware text-to-video generation

Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement

SURER: Structure-Adaptive Unified Graph Neural Network for Multi-View Clustering

Online Video Understanding: OVBench and VideoChat-Online

SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering

What Has Been Overlooked in Contrastive Source-Free Domain Adaptation: Leveraging Source-Informed Latent Augmentation within Neighborhood Context

Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation

StreamForest: Efficient Online Video Understanding with Persistent Event Memory

PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution

CoDTS: Enhancing Sparsely Supervised Collaborative Perception with a Dual Teacher-Student Framework

Exploring Active Learning in Meta-Learning: Enhancing Context Set Labeling

AdaptCMVC: Robust Adaption to Incremental Views in Continual Multi-view Clustering

An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models

SAMPLE: Semantic Alignment through Temporal-Adaptive Multimodal Prompt Learning for Event-Based Open-Vocabulary Action Recognition

MCAM: Multimodal Causal Analysis Model for Ego-Vehicle-Level Driving Video Understanding

Redefining <Creative> in Dictionary: Towards an Enhanced Semantic Understanding of Creative Generation

Prior-aware Dynamic Temporal Modeling Framework for Sequential 3D Hand Pose Estimation

WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models

Learning with Adaptive Resource Allocation

Handling Heterogeneous Curvatures in Bandit LQR Control

Walk and Learn: Facial Attribute Representation Learning From Egocentric Video and Contextual Data

Learning To Filter: Siamese Relation Network for Robust Tracking

Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship

Scene Text Retrieval via Joint Text Detection and Similarity Learning

Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation

From Two to One: A New Scene Text Recognizer With Visual Language Modeling Network

AlphaVC: High-Performance and Efficient Learned Video Compression

Content-Oriented Learned Image Compression

Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation

Detecting Tampered Scene Text in the Wild

Provable Variable Selection for Streaming Features