Zhang
279
Papers
8,842
Total Citations
Papers (279)
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
ECCV 2024
3,368
citations
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
ECCV 2024arXiv
473
citations
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
ICLR 2025arXiv
351
citations
Evaluating Text-to-Visual Generation with Image-to-Text Generation
ECCV 2024arXiv
347
citations
Segment and Recognize Anything at Any Granularity
ECCV 2024arXiv
226
citations
CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians
ECCV 2024arXiv
180
citations
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
ICLR 2025arXiv
156
citations
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
NeurIPS 2025arXiv
118
citations
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
ECCV 2024arXiv
114
citations
IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection
ECCV 2024arXiv
110
citations
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
ICLR 2025arXiv
101
citations
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
ICLR 2025arXiv
101
citations
MoBA: Mixture of Block Attention for Long-Context LLMs
NeurIPS 2025arXiv
94
citations
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
NeurIPS 2025arXiv
91
citations
CoR-GS: Sparse-View 3D Gaussian Splatting via Co-Regularization
ECCV 2024arXiv
90
citations
SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models
ICLR 2025arXiv
90
citations
PSALM: Pixelwise Segmentation with Large Multi-modal Model
ECCV 2024arXiv
82
citations
WebDancer: Towards Autonomous Information Seeking Agency
NeurIPS 2025arXiv
81
citations
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
ICLR 2025arXiv
75
citations
MMTEB: Massive Multilingual Text Embedding Benchmark
ICLR 2025arXiv
74
citations
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
NeurIPS 2025arXiv
71
citations
MagicPIG: LSH Sampling for Efficient LLM Generation
ICLR 2025arXiv
62
citations
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
NeurIPS 2025arXiv
57
citations
Self-Improvement in Language Models: The Sharpening Mechanism
ICLR 2025arXiv
55
citations
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
ICLR 2025arXiv
53
citations
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics
NeurIPS 2025arXiv
51
citations
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
ICLR 2025arXiv
48
citations
RangeLDM: Fast Realistic LiDAR Point Cloud Generation
ECCV 2024arXiv
44
citations
Catastrophic Failure of LLM Unlearning via Quantization
ICLR 2025arXiv
43
citations
To Code or Not To Code? Exploring Impact of Code in Pre-training
ICLR 2025arXiv
40
citations
Stream Query Denoising for Vectorized HD-Map Construction
ECCV 2024arXiv
40
citations
On the Role of Attention Heads in Large Language Model Safety
ICLR 2025arXiv
40
citations
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
ICLR 2025arXiv
39
citations
Agentic RL Scaling Law: Spontaneous Code Execution for Mathematical Problem Solving
NeurIPS 2025
38
citations
Reconstructive Visual Instruction Tuning
ICLR 2025arXiv
34
citations
Generalizable Human Gaussians for Sparse View Synthesis
ECCV 2024arXiv
34
citations
LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
ICLR 2025arXiv
34
citations
KungfuBot: Physics-Based Humanoid Whole-Body Control for Learning Highly-Dynamic Skills
NeurIPS 2025arXiv
31
citations
EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
NeurIPS 2025arXiv
31
citations
PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance
ICLR 2025arXiv
31
citations
Soft Prompt Generation for Domain Generalization
ECCV 2024arXiv
30
citations
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
NeurIPS 2025arXiv
30
citations
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
NeurIPS 2025arXiv
29
citations
GOFA: A Generative One-For-All Model for Joint Graph Language Modeling
ICLR 2025arXiv
28
citations
When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs
NeurIPS 2025arXiv
28
citations
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model
ICLR 2025arXiv
27
citations
Fast-in-Slow: A Dual-System VLA Model Unifying Fast Manipulation within Slow Reasoning
NeurIPS 2025
27
citations
I-MedSAM: Implicit Medical Image Segmentation with Segment Anything
ECCV 2024arXiv
26
citations
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
NeurIPS 2025arXiv
25
citations
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Videos Generation
NeurIPS 2025arXiv
25
citations
Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection
ECCV 2024arXiv
24
citations
Energy-Weighted Flow Matching for Offline Reinforcement Learning
ICLR 2025arXiv
24
citations
MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions
ICLR 2025
23
citations
Language Imbalance Driven Rewarding for Multilingual Self-improving
ICLR 2025arXiv
23
citations
SWE-bench Goes Live!
NeurIPS 2025arXiv
22
citations
Towards General-Purpose Model-Free Reinforcement Learning
ICLR 2025arXiv
22
citations
Audio Large Language Models Can Be Descriptive Speech Quality Evaluators
ICLR 2025arXiv
22
citations
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
NeurIPS 2025arXiv
21
citations
One-Shot Diffusion Mimicker for Handwritten Text Generation
ECCV 2024arXiv
21
citations
An Incremental Unified Framework for Small Defect Inspection
ECCV 2024arXiv
21
citations
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
ICLR 2025arXiv
21
citations
GI-GS: Global Illumination Decomposition on Gaussian Splatting for Inverse Rendering
ICLR 2025arXiv
21
citations
TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets
NeurIPS 2025arXiv
19
citations
Any2Point: Empowering Any-modality Transformers for Efficient 3D Understanding
ECCV 2024
19
citations
SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data
NeurIPS 2025arXiv
19
citations
GameArena: Evaluating LLM Reasoning through Live Computer Games
ICLR 2025arXiv
19
citations
Implicit Concept Removal of Diffusion Models
ECCV 2024arXiv
18
citations
Self-Evolved Reward Learning for LLMs
ICLR 2025arXiv
18
citations
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
NeurIPS 2025arXiv
18
citations
UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions
NeurIPS 2025arXiv
18
citations
Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal
ECCV 2024arXiv
17
citations
TP2O: Creative Text Pair-to-Object Generation using Balance Swap-Sampling
ECCV 2024arXiv
16
citations
MetaOOD: Automatic Selection of OOD Detection Models
ICLR 2025arXiv
16
citations
LeVo: High-Quality Song Generation with Multi-Preference Alignment
NeurIPS 2025arXiv
15
citations
RoboScape: Physics-informed Embodied World Model
NeurIPS 2025arXiv
15
citations
Spiking Vision Transformer with Saccadic Attention
ICLR 2025arXiv
15
citations
ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning
NeurIPS 2025arXiv
14
citations
GRIDS: Grouped Multiple-Degradation Restoration with Image Degradation Similarity
ECCV 2024arXiv
14
citations
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
NeurIPS 2025arXiv
14
citations
MoVideo: Motion-Aware Video Generation with Diffusion Models
ECCV 2024arXiv
14
citations
Quantized Spike-driven Transformer
ICLR 2025arXiv
14
citations
NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering
NeurIPS 2025arXiv
14
citations
ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data
NeurIPS 2025arXiv
13
citations
MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks
NeurIPS 2025arXiv
13
citations
UFM: A Simple Path towards Unified Dense Correspondence with Flow
NeurIPS 2025arXiv
13
citations
Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis
ICLR 2025arXiv
13
citations
Learning Video Context as Interleaved Multimodal Sequences
ECCV 2024arXiv
12
citations
Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
NeurIPS 2025arXiv
12
citations
SINDER: Repairing the Singular Defects of DINOv2
ECCV 2024arXiv
12
citations
Stable Segment Anything Model
ICLR 2025arXiv
12
citations
SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
ICLR 2025arXiv
12
citations
Mixture of Efficient Diffusion Experts Through Automatic Interval and Sub-Network Selection
ECCV 2024arXiv
12
citations
CLAP: Isolating Content from Style through Contrastive Learning with Augmented Prompts
ECCV 2024arXiv
11
citations
GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution
NeurIPS 2025arXiv
11
citations
LaGeM: A Large Geometry Model for 3D Representation Learning and Diffusion
ICLR 2025arXiv
11
citations
Monocular Occupancy Prediction for Scalable Indoor Scenes
ECCV 2024arXiv
11
citations
OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework
ECCV 2024arXiv
11
citations
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
NeurIPS 2025arXiv
11
citations
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents
NeurIPS 2025arXiv
10
citations
MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment
ECCV 2024arXiv
10
citations
Anyprefer: An Agentic Framework for Preference Data Synthesis
ICLR 2025arXiv
10
citations
Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits
ICLR 2025arXiv
10
citations
Few-shot NeRF by Adaptive Rendering Loss Regularization
ECCV 2024arXiv
10
citations
Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model
ICLR 2025arXiv
10
citations
RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics
NeurIPS 2025arXiv
10
citations
NOVUM: Neural Object Volumes for Robust Object Classification
ECCV 2024
10
citations
BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems
NeurIPS 2025arXiv
9
citations
GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning
ECCV 2024arXiv
9
citations
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
ECCV 2024arXiv
9
citations
Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model
ECCV 2024arXiv
9
citations
AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks
NeurIPS 2025arXiv
9
citations
Test-time Adaptation for Cross-modal Retrieval with Query Shift
ICLR 2025arXiv
9
citations
SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement
ICLR 2025arXiv
9
citations
Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents
ICLR 2025arXiv
9
citations
HaDeMiF: Hallucination Detection and Mitigation in Large Language Models
ICLR 2025
9
citations
Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering
ICLR 2025arXiv
9
citations
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Generation
NeurIPS 2025
8
citations
Causally Motivated Sycophancy Mitigation for Large Language Models
ICLR 2025
8
citations
Parameterized Quasi-Physical Simulators for Dexterous Manipulations Transfer
ECCV 2024arXiv
8
citations
Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model
ECCV 2024arXiv
8
citations
PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control
ECCV 2024arXiv
8
citations
CPathAgent: An Agent-based Foundation Model for Interpretable High-Resolution Pathology Image Analysis Mimicking Pathologists' Diagnostic Logic
NeurIPS 2025arXiv
8
citations
OneTrack: Demystifying the Conflict Between Detection and Tracking in End-to-End 3D Trackers
ECCV 2024
7
citations
What Has Been Overlooked in Contrastive Source-Free Domain Adaptation: Leveraging Source-Informed Latent Augmentation within Neighborhood Context
ICLR 2025arXiv
7
citations
Learning Cross-hand Policies of High-DOF Reaching and Grasping
ECCV 2024arXiv
7
citations
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
ICLR 2025arXiv
7
citations
Poison-splat: Computation Cost Attack on 3D Gaussian Splatting
ICLR 2025arXiv
7
citations
IPDreamer: Appearance-Controllable 3D Object Generation with Complex Image Prompts
ICLR 2025arXiv
7
citations
Dissolving Is Amplifying: Towards Fine-Grained Anomaly Detection
ECCV 2024arXiv
7
citations
EA-VTR: Event-Aware Video-Text Retrieval
ECCV 2024arXiv
7
citations
Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts
ECCV 2024arXiv
6
citations
Interleaving One-Class and Weakly-Supervised Models with Adaptive Thresholding for Unsupervised Video Anomaly Detection
ECCV 2024arXiv
6
citations
Integrative Decoding: Improving Factuality via Implicit Self-consistency
ICLR 2025arXiv
6
citations
LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang
ECCV 2024
6
citations
MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines
ICLR 2025
6
citations
GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning
ICLR 2025arXiv
6
citations
DECOLLAGE: 3D Detailization by Controllable, Localized, and Learned Geometry Enhancement
ECCV 2024arXiv
6
citations
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
ICLR 2025arXiv
6
citations
Occlusion-Aware Seamless Segmentation
ECCV 2024arXiv
6
citations
ELICIT: LLM Augmentation Via External In-context Capability
ICLR 2025arXiv
6
citations
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems
NeurIPS 2025arXiv
6
citations
Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation
ECCV 2024arXiv
5
citations
DeblurDiff: Real-World Image Deblurring with Generative Diffusion Models
NeurIPS 2025
5
citations
DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction
ICLR 2025arXiv
5
citations
SysBench: Can LLMs Follow System Message?
ICLR 2025
5
citations
SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision
ICLR 2025arXiv
5
citations
Hessian-Free Online Certified Unlearning
ICLR 2025arXiv
5
citations
When Selection Meets Intervention: Additional Complexities in Causal Discovery
ICLR 2025arXiv
5
citations
Learning Graph Invariance by Harnessing Spuriosity
ICLR 2025
5
citations
On the Value of Cross-Modal Misalignment in Multimodal Representation Learning
NeurIPS 2025arXiv
5
citations
Correspondence-Free SE(3) Point Cloud Registration in RKHS via Unsupervised Equivariant Learning
ECCV 2024arXiv
5
citations
Thinking in Character: Advancing Role-Playing Agents with Role-Aware Reasoning
NeurIPS 2025arXiv
5
citations
RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video
NeurIPS 2025arXiv
5
citations
Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning
ECCV 2024arXiv
5
citations
Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution
ECCV 2024arXiv
5
citations
RaFE: Generative Radiance Fields Restoration
ECCV 2024arXiv
5
citations
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
ICLR 2025arXiv
5
citations
Estimation and Inference in Distributional Reinforcement Learning
NeurIPS 2025arXiv
4
citations
Dynamic Risk Assessments for Offensive Cybersecurity Agents
NeurIPS 2025arXiv
4
citations
SymmetricDiffusers: Learning Discrete Diffusion on Finite Symmetric Groups
ICLR 2025arXiv
4
citations
Test-time Model Adaptation for Image Reconstruction Using Self-supervised Adaptive Layers
ECCV 2024
4
citations
Noisy Test-Time Adaptation in Vision-Language Models
ICLR 2025arXiv
4
citations
CellVerse: Do Large Language Models Really Understand Cell Biology?
NeurIPS 2025arXiv
4
citations
Event-Based Motion Magnification
ECCV 2024arXiv
4
citations
AlgoTune: Can Language Models Speed Up General-Purpose Numerical Programs?
NeurIPS 2025arXiv
4
citations
Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning
NeurIPS 2025arXiv
4
citations
Hot-pluggable Federated Learning: Bridging General and Personalized FL via Dynamic Selection
ICLR 2025
4
citations
Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study
NeurIPS 2025arXiv
3
citations
CrossAD: Time Series Anomaly Detection with Cross-scale Associations and Cross-window Modeling
NeurIPS 2025arXiv
3
citations
Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM
NeurIPS 2025arXiv
3
citations
RLZero: Direct Policy Inference from Language Without In-Domain Supervision
NeurIPS 2025arXiv
3
citations
MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation
NeurIPS 2025arXiv
3
citations
Memory Mosaics at scale
NeurIPS 2025arXiv
3
citations
STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization
NeurIPS 2025arXiv
3
citations
VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption
NeurIPS 2025arXiv
3
citations
TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer
NeurIPS 2025arXiv
3
citations
Optimal Control for Transformer Architectures: Enhancing Generalization, Robustness and Efficiency
NeurIPS 2025arXiv
3
citations
Beyond Modality Collapse: Representation Blending for Multimodal Dataset Distillation
NeurIPS 2025arXiv
3
citations
Homomorphism Expressivity of Spectral Invariant Graph Neural Networks
ICLR 2025arXiv
3
citations
Deep Feature Surgery: Towards Accurate and Efficient Multi-Exit Networks
ECCV 2024arXiv
3
citations
Next Semantic Scale Prediction via Hierarchical Diffusion Language Models
NeurIPS 2025
3
citations
Attention! Your Vision Language Model Could Be Maliciously Manipulated
NeurIPS 2025arXiv
3
citations
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space
NeurIPS 2025arXiv
3
citations
Neural-Driven Image Editing
NeurIPS 2025arXiv
2
citations
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
NeurIPS 2025arXiv
2
citations
MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs
NeurIPS 2025arXiv
2
citations
RFMamba: Frequency-Aware State Space Model for RF-Based Human-Centric Perception
ICLR 2025
2
citations
An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning
ICLR 2025arXiv
2
citations
Test-time Adaptation for Image Compression with Distribution Regularization
ICLR 2025arXiv
2
citations
Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM
NeurIPS 2025
2
citations
Interference Among First-Price Pacing Equilibria: A Bias and Variance Analysis
ICLR 2025arXiv
2
citations
Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents
NeurIPS 2025arXiv
2
citations
BenchmarkCards: Standardized Documentation for Large Language Model Benchmarks
NeurIPS 2025arXiv
2
citations
Alignment of Large Language Models with Constrained Learning
NeurIPS 2025arXiv
2
citations
OAT: Object-Level Attention Transformer for Gaze Scanpath Prediction
ECCV 2024arXiv
2
citations
Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
ICLR 2025arXiv
2
citations
S'MoRE: Structural Mixture of Residual Experts for Parameter-Efficient LLM Fine-tuning
NeurIPS 2025arXiv
2
citations
See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction
NeurIPS 2025arXiv
2
citations
A Conditional Independence Test in the Presence of Discretization
ICLR 2025arXiv
2
citations
PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs
NeurIPS 2025arXiv
2
citations
Generative Graph Pattern Machine
NeurIPS 2025arXiv
2
citations
UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation
ICLR 2025arXiv
2
citations
RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
NeurIPS 2025arXiv
2
citations
MonoTTA: Fully Test-Time Adaptation for Monocular 3D Object Detection
ECCV 2024
2
citations
One Filters All: A Generalist Filter For State Estimation
NeurIPS 2025arXiv
2
citations
A Statistical Approach for Controlled Training Data Detection
ICLR 2025
2
citations
Causal Graph Transformer for Treatment Effect Estimation Under Unknown Interference
ICLR 2025
2
citations
Prioritizing Perception-Guided Self-Supervision: A New Paradigm for Causal Modeling in End-to-End Autonomous Driving
NeurIPS 2025arXiv
1
citations
Curious Causality-Seeking Agents Learn Meta Causal World
NeurIPS 2025arXiv
1
citations
Towards Provable Emergence of In-Context Reinforcement Learning
NeurIPS 2025arXiv
1
citations
Two-Stage Learning of Stabilizing Neural Controllers via Zubov Sampling and Iterative Domain Expansion
NeurIPS 2025arXiv
1
citations
Handling Label Noise via Instance-Level Difficulty Modeling and Dynamic Optimization
NeurIPS 2025arXiv
1
citations
OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps
NeurIPS 2025arXiv
1
citations
Bootstrap Off-policy with World Model
NeurIPS 2025arXiv
1
citations
UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression
NeurIPS 2025arXiv
1
citations
VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models
NeurIPS 2025arXiv
1
citations
Toward Efficient Inference Attacks: Shadow Model Sharing via Mixture-of-Experts
NeurIPS 2025arXiv
1
citations
ShortListing Model: A Streamlined Simplex Diffusion for Discrete Variable Generation
NeurIPS 2025
1
citations
Personalized Bayesian Federated Learning with Wasserstein Barycenter Aggregation
NeurIPS 2025arXiv
1
citations
Embracing Trustworthy Brain-Agent Collaboration as Paradigm Extension for Intelligent Assistive Technologies
NeurIPS 2025arXiv
1
citations
Online Segment Any 3D Thing as Instance Tracking
NeurIPS 2025arXiv
1
citations
Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models
NeurIPS 2025arXiv
1
citations
Fast Data Attribution for Text-to-Image Models
NeurIPS 2025arXiv
1
citations
Can LLMs Reason Over Non-Text Modalities in a Training-Free Manner? A Case Study with In-Context Representation Learning
NeurIPS 2025arXiv
1
citations
Adaptive Stochastic Coefficients for Accelerating Diffusion Sampling
NeurIPS 2025arXiv
1
citations
Minimax Optimal Two-Stage Algorithm For Moment Estimation Under Covariate Shift
ICLR 2025arXiv
1
citations
DecoyDB: A Dataset for Graph Contrastive Learning in Protein-Ligand Binding Affinity Prediction
NeurIPS 2025arXiv
1
citations
Controlled LLM Decoding via Discrete Auto-regressive Biasing
ICLR 2025arXiv
1
citations
PolyhedronNet: Representation Learning for Polyhedra with Surface-attributed Graph
ICLR 2025arXiv
1
citations
Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection
ICLR 2025arXiv
1
citations
When narrower is better: the narrow width limit of Bayesian parallel branching neural networks
ICLR 2025arXiv
1
citations
BaSIC: BayesNet Structure Learning for Computational Scalable Neural Image Compression
ECCV 2024
1
citations
MGCFNN: A Neural MultiGrid Solver with Novel Fourier Neural Network for High Wave Number Helmholtz Equations
ICLR 2025
1
citations
Debiasing Federated Learning with Correlated Client Participation
ICLR 2025arXiv
1
citations
GeoILP: A Synthetic Dataset to Guide Large-Scale Rule Induction
ICLR 2025
1
citations
Release the Powers of Prompt Tuning: Cross-Modality Prompt Transfer
ICLR 2025
1
citations
Dependency-aware Differentiable Neural Architecture Search
ECCV 2024
1
citations
A Robust Method to Discover Causal or Anticausal Relation
ICLR 2025
1
citations
Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining
NeurIPS 2025arXiv
1
citations
Bi-Level Decision-Focused Causal Learning for Large-Scale Marketing Optimization: Bridging Observational and Experimental Data
NeurIPS 2025arXiv
0
citations
Model-Guided Dual-Role Alignment for High-Fidelity Open-Domain Video-to-Audio Generation
NeurIPS 2025arXiv
0
citations
The Primacy of Magnitude in Low-Rank Adaptation
NeurIPS 2025arXiv
0
citations
NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval
NeurIPS 2025arXiv
0
citations
FedGPS: Statistical Rectification Against Data Heterogeneity in Federated Learning
NeurIPS 2025arXiv
0
citations
Switchable Token-Specific Codebook Quantization For Face Image Compression
NeurIPS 2025arXiv
0
citations
Off-policy Reinforcement Learning with Model-based Exploration Augmentation
NeurIPS 2025arXiv
0
citations
UniRestore3D: A Scalable Framework For General Shape Restoration
ICLR 2025
0
citations
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
NeurIPS 2025arXiv
0
citations
Semantic Temporal Abstraction via Vision-Language Model Guidance for Efficient Reinforcement Learning
ICLR 2025
0
citations
PID-controlled Langevin Dynamics for Faster Sampling on Generative Models
NeurIPS 2025arXiv
0
citations
KINDLE: Knowledge-Guided Distillation for Prior-Free Gene Regulatory Network Inference
NeurIPS 2025arXiv
0
citations
ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation
NeurIPS 2025arXiv
0
citations
DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches
NeurIPS 2025arXiv
0
citations
Probing Neural Combinatorial Optimization Models
NeurIPS 2025arXiv
0
citations
Stop DDoS Attacking the Research Community with AI-Generated Survey Papers
NeurIPS 2025arXiv
0
citations
Novel Class Discovery for Point Cloud Segmentation via Joint Learning of Causal Representation and Reasoning
NeurIPS 2025arXiv
0
citations
Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment
NeurIPS 2025arXiv
0
citations
Each Complexity Deserves a Pruning Policy
NeurIPS 2025arXiv
0
citations
Order-Level Attention Similarity Across Language Models: A Latent Commonality
NeurIPS 2025arXiv
0
citations
StyleGuard: Preventing Text-to-Image-Model-based Style Mimicry Attacks by Style Perturbations
NeurIPS 2025arXiv
0
citations
On the Stability of Graph Convolutional Neural Networks: A Probabilistic Perspective
NeurIPS 2025arXiv
0
citations
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
ICLR 2025arXiv
0
citations
Flexible Realignment of Language Models
NeurIPS 2025arXiv
0
citations
AANet: Virtual Screening under Structural Uncertainty via Alignment and Aggregation
NeurIPS 2025arXiv
0
citations
RAG-IGBench: Innovative Evaluation for RAG-based Interleaved Generation in Open-domain Question Answering
NeurIPS 2025arXiv
0
citations
EGGS: Exchangeable 2D/3D Gaussian Splatting for Geometry-Appearance Balanced Novel View Synthesis
NeurIPS 2025arXiv
0
citations
FRBNet: Revisiting Low-Light Vision through Frequency-Domain Radial Basis Network
NeurIPS 2025arXiv
0
citations
OmniFC: Rethinking Federated Clustering via Lossless and Secure Distance Reconstruction
NeurIPS 2025arXiv
0
citations
Multimodal 3D Genome Pre-training
NeurIPS 2025arXiv
0
citations
MuSLR: Multimodal Symbolic Logical Reasoning
NeurIPS 2025arXiv
0
citations
Faithful Group Shapley Value
NeurIPS 2025arXiv
0
citations
F-Adapter: Frequency-Adaptive Parameter-Efficient Fine-Tuning in Scientific Machine Learning
NeurIPS 2025arXiv
0
citations
Variational Task Vector Composition
NeurIPS 2025arXiv
0
citations
mmWalk: Towards Multi-modal Multi-view Walking Assistance
NeurIPS 2025arXiv
0
citations
CGBench: Benchmarking Language Model Scientific Reasoning for Clinical Genetics Research
NeurIPS 2025arXiv
0
citations
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
NeurIPS 2025arXiv
0
citations
ScatterAD: Temporal-Topological Scattering Mechanism for Time Series Anomaly Detection
NeurIPS 2025arXiv
0
citations
Rethinking Hebbian Principle: Low-Dimensional Structural Projection for Unsupervised Learning
NeurIPS 2025arXiv
0
citations
Dynamic Gaussian Splatting from Defocused and Motion-blurred Monocular Videos
NeurIPS 2025arXiv
0
citations