Yang Liu
112
Papers
1,240
Total Citations
2
Affiliations
Affiliations
School of Computer Science and Technology
Harbin Institute of Technology
Papers (112)
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
ICLR 2024
309
citations
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
CVPR 2024
237
citations
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
ICLR 2025
115
citations
Space Group Constrained Crystal Generation
ICLR 2024
60
citations
Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
CVPR 2024
49
citations
Fantastic Animals and Where to Find Them: Segment Any Marine Animal with Dual SAM
CVPR 2024
32
citations
Exploring Enhanced Contextual Information for Video-Level Object Tracking
AAAI 2025
27
citations
DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
AAAI 2025
27
citations
Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
AAAI 2024
26
citations
Unmasking and Improving Data Credibility: A Study with Datasets for Training Harmless Language Models
ICLR 2024
26
citations
Perception-Guided Jailbreak Against Text-to-Image Models
AAAI 2025
26
citations
Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning
ICML 2025
22
citations
An Upload-Efficient Scheme for Transferring Knowledge From a Server-Side Pre-trained Generator to Clients in Heterogeneous Federated Learning
CVPR 2024
20
citations
FedFixer: Mitigating Heterogeneous Label Noise in Federated Learning
AAAI 2024
19
citations
SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
NeurIPS 2025
18
citations
FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio
CVPR 2024
18
citations
De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts
CVPR 2024
17
citations
ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration
ICLR 2025
13
citations
Performative Federated Learning: A Solution to Model-Dependent and Heterogeneous Distribution Shifts
AAAI 2024
12
citations
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
NeurIPS 2025
12
citations
ZeroFlow: Scalable Scene Flow via Distillation
ICLR 2024
12
citations
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
CVPR 2025
11
citations
When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning
CVPR 2025
10
citations
Post-hoc bias scoring is optimal for fair classification
ICLR 2024
10
citations
SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments
CVPR 2025
9
citations
Active Object Detection with Knowledge Aggregation and Distillation from Large Models
CVPR 2024
9
citations
Cross-modal Causal Relation Alignment for Video Question Grounding
CVPR 2025
7
citations
Contrastive Private Data Synthesis via Weighted Multi-PLM Fusion
ICML 2025
7
citations
ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer
CVPR 2025
7
citations
Dynamic Graph Learning with Static Relations for Credit Risk Assessment
AAAI 2025
6
citations
Asymmetric Visual Semantic Embedding Framework for Efficient Vision-Language Alignment
AAAI 2025
6
citations
Multifaceted User Modeling in Recommendation: A Federated Foundation Models Approach
AAAI 2025
6
citations
Novel Class Discovery in Chest X-rays via Paired Images and Text
AAAI 2024
5
citations
Robust Evaluation Measures for Evaluating Social Biases in Masked Language Models
AAAI 2024
5
citations
Adversarial Robust Memory-Based Continual Learner
ICCV 2025
5
citations
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization
CVPR 2025
4
citations
Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation
CVPR 2025
4
citations
VideoLLaMB: Long Streaming Video Understanding with Recurrent Memory Bridges
ICCV 2025
3
citations
CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models
CVPR 2025
3
citations
Learning Dynamic Similarity by Bidirectional Hierarchical Sliding Semantic Probe for Efficient Text Video Retrieval
AAAI 2025
3
citations
Logic-Q: Improving Deep Reinforcement Learning-based Quantitative Trading via Program Sketch-based Tuning
AAAI 2025
3
citations
WeakMCN: Multi-task Collaborative Network for Weakly Supervised Referring Expression Comprehension and Segmentation
CVPR 2025
3
citations
Hybrid Concept Bottleneck Models
CVPR 2025
2
citations
FinMMR: Make Financial Numerical Reasoning More Multimodal, Comprehensive, and Challenging
ICCV 2025
2
citations
AdsQA: Towards Advertisement Video Understanding
ICCV 2025
2
citations
Equilibrium Policy Generalization: A Reinforcement Learning Framework for Cross-Graph Zero-Shot Generalization in Pursuit-Evasion Games
NeurIPS 2025
2
citations
Towards Minimizing Feature Drift in Model Merging: Layer-wise Task Vector Fusion for Adaptive Knowledge Integration
NeurIPS 2025
2
citations
InversionGNN: A Dual Path Network for Multi-Property Molecular Optimization
ICLR 2025
1
citations
Exploring Structural Degradation in Dense Representations for Self-supervised Learning
NeurIPS 2025
1
citations
Fair Participation via Sequential Policies
AAAI 2024
1
citations
Learning Counterfactual Outcomes Under Rank Preservation
NeurIPS 2025
1
citations
HSI: A Holistic Style Injector for Arbitrary Style Transfer
CVPR 2025
1
citations
Jointly Modeling Spatio-Temporal Features of Tactile Signals for Action Classification
AAAI 2024
1
citations
UniSim: A Unified Simulator for Time-Coarsened Dynamics of Biomolecules
ICML 2025
1
citations
MM-OPERA: Benchmarking Open-ended Association Reasoning for Large Vision-Language Models
NeurIPS 2025
0
citations
Human and AI Perceptual Differences in Image Classification Errors
AAAI 2025
0
citations
Mesh Interpolation Graph Network for Dynamic and Spatially Irregular Global Weather Forecasting
NeurIPS 2025
0
citations
FastPERT: Towards Fast Microservice Application Latency Prediction via Structural Inductive Bias over PERT Networks
AAAI 2025
0
citations
Can Large Language Models Derive High-Level Cognition from Low-Level and Fragmented Foundational Information?
AAAI 2025
0
citations
S^3cMath: Spontaneous Step-Level Self-Correction Makes Large Language Models Better Mathematical Reasoners
AAAI 2025
0
citations
Unified Open-World Segmentation with Multi-Modal Prompts
ICCV 2025
0
citations
FedTGP: Trainable Global Prototypes with Adaptive-Margin-Enhanced Contrastive Learning for
AAAI 2024
0
citations
LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents
ICCV 2025
0
citations
Comprehensive Visual Grounding for Video Description
AAAI 2024
0
citations
FedMut: Generalized Federated Learning via Stochastic Mutation
AAAI 2024
0
citations
Knowledge Graph Error Detection with Contrastive Confidence Adaption
AAAI 2024
0
citations
How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game
ICCV 2025
0
citations
Multi-scenario Overlapping Text Segmentation with Depth Awareness
ICCV 2025
0
citations
End-to-End Driving with Online Trajectory Evaluation via BEV World Model
ICCV 2025
0
citations
Semantic-Guided Novel Category Discovery
AAAI 2024
0
citations
DiffTell: A High-Quality Dataset for Describing Image Manipulation Changes
ICCV 2025
0
citations
Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration
ICCV 2025
0
citations
DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data
CVPR 2024
0
citations
Diff-BGM: A Diffusion Model for Video Background Music Generation
CVPR 2024
0
citations
OED: Towards One-stage End-to-End Dynamic Scene Graph Generation
CVPR 2024
0
citations
DisTime: Distribution-based Time Representation for Video Large Language Models
ICCV 2025
0
citations
Aligning Information Capacity Between Vision and Language via Dense-to-Sparse Feature Distillation for Image-Text Matching
ICCV 2025
0
citations
Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection
CVPR 2024
0
citations
CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection
CVPR 2024
0
citations
A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network
CVPR 2024
0
citations
Hierarchical Event Memory for Accurate and Low-latency Online Video Temporal Grounding
ICCV 2025
0
citations
GFPack++: Attention-Driven Gradient Fields for Optimizing 2D Irregular Packing
ICCV 2025
0
citations
TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring
ICCV 2025
0
citations
CoDi-2: In-Context Interleaved and Interactive Any-to-Any Generation
CVPR 2024
0
citations
EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models
CVPR 2024
0
citations
Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering
ICCV 2025
0
citations
AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning
ICCV 2025
0
citations
ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
ICCV 2025
0
citations
Learning Visual Proxy for Compositional Zero-Shot Learning
ICCV 2025
0
citations
ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object
CVPR 2025
0
citations
Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method
CVPR 2025
0
citations
Zero-Shot Cyclic Peptide Design via Composable Geometric Constraints
ICML 2025
0
citations
DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering
CVPR 2025
0
citations
Position: Towards Unified Alignment Between Agents, Humans, and Environment
ICML 2024
0
citations
Performative Prediction with Bandit Feedback: Learning through Reparameterization
ICML 2024
0
citations
Multi-View Clustering by Inter-cluster Connectivity Guided Reward
ICML 2024
0
citations
Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning
ICML 2024
0
citations
Graph Distillation with Eigenbasis Matching
ICML 2024
0
citations
Semantic-Aware Human Object Interaction Image Generation
ICML 2024
0
citations
MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization
ICML 2024
0
citations
Neural Jump-Diffusion Temporal Point Processes
ICML 2024
0
citations
Generative Active Learning for Long-tailed Instance Segmentation
ICML 2024
0
citations
Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection
ICML 2024
0
citations
Equivariant Diffusion for Crystal Structure Prediction
ICML 2024
0
citations
DOF-Separation for 3D Manipulation in XR: Understanding Finger-Wrist Separation to Simultaneously Translate and Rotate Objects
ISMAR 2025
0
citations
Improving Neural Logic Machines via Failure Reflection
ICML 2024
0
citations
DoGA: Enhancing Grounded Object Detection via Grouped Pre-Training with Attributes
AAAI 2025
0
citations
Cross-Subject Cognitive Load Recognition in VR Using Multimodal Fusion with EEG and Eye-Tracking
ISMAR 2025
0
citations
Generative Video Diffusion for Unseen Novel Semantic Video Moment Retrieval
AAAI 2025
0
citations
From Coarse to Fine: A Matching and Alignment Framework for Unsupervised Cross-View Geo-Localization
AAAI 2025
0
citations
MambaPro: Multi-Modal Object Re-identification with Mamba Aggregation and Synergistic Prompt
AAAI 2025
0
citations
PlanLLM: Video Procedure Planning with Refinable Large Language Models
AAAI 2025
0
citations