Most Cited 2025 "differentiable tree search" Papers
EchoShot: Multi-Shot Portrait Video Generation
Jiahao Wang, Hualian Sheng, Sijia Cai et al.
Effective and Efficient Time-Varying Counterfactual Prediction with State-Space Models
Haotian Wang, Haoxuan Li, Hao Zou et al.
Towards Real Unsupervised Anomaly Detection Via Confident Meta-Learning
Muhammad Aqeel, Shakiba Sharifi, Marco Cristani et al.
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization
Hongrui Jia, Chaoya Jiang, Haiyang Xu et al.
Doubly Robust Conformalized Survival Analysis with Right-Censored Data
Matteo Sesia, Vladimir Svetnik
BRAID: Input-driven Nonlinear Dynamical Modeling of Neural-Behavioral Data
Parsa Vahidi, Omid G. Sani, Maryam Shanechi
MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models
Hengzhi Li, Megan Tjandrasuwita, Yi R. (May) Fung et al.
Understanding Fairness Surrogate Functions in Algorithmic Fairness
Yong Liu, (Andrew) Zhanke Zhou, Zhicong Li et al.
CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment
Qinfeng Li, Tianyue Luo, Xuhong Zhang et al.
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
Harshit Varma, Dheeraj Nagaraj, Karthikeyan Shanmugam
Alignment-Free RGB-T Salient Object Detection: A Large-Scale Dataset and Progressive Correlation Network
Kunpeng Wang, Keke Chen, Chenglong Li et al.
Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
Kaihang Pan, Yang Wu, Wendong Bu et al.
DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
Hyogon Ryu, NaHyeon Park, Hyunjung Shim
SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance
Peishan Cong, Ziyi Wang, Yuexin Ma et al.
FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion
Haosen Yang, Adrian Bulat, Isma Hadji et al.
SMITE: Segment Me In TimE
Amirhossein Alimohammadi, Sauradip Nag, Saeid Asgari et al.
Federated Continual Instruction Tuning
Haiyang Guo, Fanhu Zeng, Fei Zhu et al.
Learning Safety Constraints for Large Language Models
Xin Chen, Yarden As, Andreas Krause
Anywhere: A Multi-Agent Framework for User-Guided, Reliable, and Diverse Foreground-Conditioned Image Generation
Xie Tianyidan, Rui Ma, Qian Wang et al.
Beyond Human Data: Aligning Multimodal Large Language Models by Iterative Self-Evolution
Wentao Tan, Qiong Cao, Yibing Zhan et al.
Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models
Wenzhuo Tang, Haitao Mao, Danial Dervovic et al.
Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation
Adil Kaan Akan, Yucel Yemez
What Do Latent Action Models Actually Learn?
Chuheng Zhang, Tim Pearce, Pushi Zhang et al.
Sculpting Features from Noise: Reward-Guided Hierarchical Diffusion for Task-Optimal Feature Transformation
Nanxu Gong, Zijun Li, Sixun Dong et al.
TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation
Yabiao Wang, Shuo Wang, Jiangning Zhang et al.
Impossible Videos
Zechen Bai, Hai Ci, Mike Zheng Shou
InterAct: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation
Sirui Xu, Dongting Li, Yucheng Zhang et al.
Privacy amplification by random allocation
Moshe Shenfeld, Vitaly Feldman
MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI
Huanjin Yao, Jiaxing Huang, Yawen Qiu et al.
Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment
Yaling Shen, Zhixiong Zhuang, Kun Yuan et al.
STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding
Aaryan Garg, Akash Kumar, Yogesh S. Rawat
Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data
David Heurtel-Depeiges, Anian Ruoss, Joel Veness et al.
Arbitrary Reading Order Scene Text Spotter with Local Semantics Guidance
Jiahao Lyu, Wei Wang, Dongbao Yang et al.
SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models
Hung Nguyen, Quang Qui-Vinh Nguyen, Khoi Nguyen et al.
GenesisTex2: Stable, Consistent and High-Quality Text-to-Texture Generation
Jiawei Lu, YingPeng Zhang, Zengjun Zhao et al.
GraphLand: Evaluating Graph Machine Learning Models on Diverse Industrial Data
Gleb Bazhenov, Oleg Platonov, Liudmila Prokhorenkova
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Ulyana Piterbarg, Lerrel Pinto, Rob Fergus
Motion-aware Contrastive Learning for Temporal Panoptic Scene Graph Generation
Thong Thanh Nguyen, Xiaobao Wu, Yi Bin et al.
AtomSurf: Surface Representation for Learning on Protein Structures
Vincent Mallet, Yangyang Miao, Souhaib Attaiki et al.
Scaling Laws for Task-Optimized Models of the Primate Visual Ventral Stream
Abdulkadir Gokce, Martin Schrimpf
PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields
Sean Wu, Shamik Basu, Tim Broedermann et al.
Neighborhood Self-Dissimilarity Attention for Medical Image Segmentation
Junren Chen, Rui Chen, Wei Wang et al.
Loss Functions and Operators Generated by f-Divergences
Vincent Roulet, Tianlin Liu, Nino Vieillard et al.
ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer
Jiayi Gao, Zijin Yin, Changcheng Hua et al.
Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length
Zihan Yu, Jingtao Ding, Yong Li et al.
FreSh: Frequency Shifting for Accelerated Neural Representation Learning
Adam Kania, Marko Mihajlovic, Sergey Prokudin et al.
Validating LLM-as-a-Judge Systems under Rating Indeterminacy
Luke Guerdan, Solon Barocas, Kenneth Holstein et al.
System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts
Xiaoqiang Wang, Suyuchen Wang, Yun Zhu et al.
Ultra-Resolution Adaptation with Ease
Ruonan Yu, Songhua Liu, Zhenxiong Tan et al.
Segment Any 3D Object with Language
Seungjun Lee, Yuyang Zhao, Gim H Lee
Robust Multimodal Survival Prediction with Conditional Latent Differentiation Variational AutoEncoder
Junjie Zhou, Jiao Tang, Yingli Zuo et al.
Spatial Understanding from Videos: Structured Prompts Meet Simulation Data
Haoyu Zhang, Meng Liu, Zaijing Li et al.
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng, Haoyu Zhang, Meng Liu et al.
SEMU: Singular Value Decomposition for Efficient Machine Unlearning
Marcin Sendera, Łukasz Struski, Kamil Książek et al.
Uncertainty Modeling in Graph Neural Networks via Stochastic Differential Equations
Richard Bergna, Sergio Calvo Ordoñez, Felix Opolka et al.
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.
PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations
Namgyu Kang, Jaemin Oh, Youngjoon Hong et al.
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
Bolin Lai, Felix Juefei-Xu, Miao Liu et al.
COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training
Sanghwan Kim, Rui Xiao, Iuliana Georgescu et al.
MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent
Xinyao Liao, Xianfang Zeng, Liao Wang et al.
Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation
Shuling Zhao, Fa-Ting Hong, Xiaoshui Huang et al.
Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
Zhengping Jiang, Anqi Liu, Ben Van Durme
StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams
Yang Li, Jinglu Wang, Lei Chu et al.
ManiVideo: Generating Hand-Object Manipulation Video with Dexterous and Generalizable Grasping
Youxin Pang, Ruizhi Shao, Jiajun Zhang et al.
COLUMBUS: Evaluating COgnitive Lateral Understanding Through Multiple-Choice reBUSes
Koen Kraaijveld, Yifan Jiang, Kaixin Ma et al.
Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization
Chenbei Lu, Laixi Shi, Zaiwei Chen et al.
Second Order Bounds for Contextual Bandits with Function Approximation
Aldo Pacchiano
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
Liang Chen, Sinan Tan, Zefan Cai et al.
Doubly Contrastive Learning for Source-Free Domain Adaptive Person Search
Yizhen Jia, Rong Quan, Yue Feng et al.
ESE: Espresso Sentence Embeddings
Xianming Li, Zongxi Li, Jing Li et al.
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
Max Wilcoxson, Qiyang Li, Kevin Frans et al.
AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling
Alexander Capstick, Rahul G. Krishnan, Payam Barnaghi
TruthPrInt: Mitigating Large Vision-Language Models Object Hallucination Via Latent Truthful-Guided Pre-Intervention
Jinhao Duan, Fei Kong, Hao Cheng et al.
Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning
Yanbiao Ma, Wei Dai, Wenke Huang et al.
KinMo: Kinematic-aware Human Motion Understanding and Generation
Pengfei Zhang, Pinxin Liu, Pablo Garrido et al.
M3amba: Memory Mamba is All You Need for Whole Slide Image Classification
Tingting Zheng, Kui Jiang, Yi Xiao et al.
A multiscale analysis of mean-field transformers in the moderate interaction regime
Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models
Rohit Gandikota, Zongze Wu, Richard Zhang et al.
Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs
Zeyi Huang, Yuyang Ji, Xiaofang Wang et al.
DuMo: Dual Encoder Modulation Network for Precise Concept Erasure
Feng Han, Kai Chen, Chao Gong et al.
Emergence and Evolution of Interpretable Concepts in Diffusion Models
Berk Tinaz, Zalan Fabian, Mahdi Soltanolkotabi
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism
Zhixiong Nan, Xianghong Li, Tao Xiang et al.
Pursuing Feature Separation based on Neural Collapse for Out-of-Distribution Detection
Yingwen Wu, Ruiji Yu, Xinwen Cheng et al.
A General Adaptive Dual-level Weighting Mechanism for Remote Sensing Pansharpening
Jie Huang, Haorui Chen, Jiaxuan Ren et al.
Position: We Need An Algorithmic Understanding of Generative AI
Oliver Eberle, Thomas McGee, Hamza Giaffar et al.
PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms
Yilong Li, Jingyu Liu, Hao Zhang et al.
HQGS: High-Quality Novel View Synthesis with Gaussian Splatting in Degraded Scenes
Xin Lin, Shi Luo, Xiaojun Shan et al.
Perception in Reflection
Yana Wei, Liang Zhao, Kangheng Lin et al.
PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation
Pablo Lemos, Sammy Sharief, Nikolay Malkin et al.
Out of Length Text Recognition with Sub-String Matching
Yongkun Du, Zhineng Chen, Caiyan Jia et al.
AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning
Xuecheng Wu, Heli Sun, Yifan Wang et al.
DISCO: learning to DISCover an evolution Operator for multi-physics-agnostic prediction
Rudy Morel, Jiequn Han, Edouard Oyallon
GS-LIVM: Real-Time Photo-Realistic LiDAR-Inertial-Visual Mapping with Gaussian Splatting
Yusen Xie, Zhenmin Huang, Jin Wu et al.
Generating Freeform Endoskeletal Robots
Muhan Li, Lingji Kong, Sam Kriegman
Triples as the Key: Structuring Makes Decomposition and Verification Easier in LLM-based TableQA
Zhen Yang, Ziwei Du, Minghan Zhang et al.
SVGBuilder: Component-Based Colored SVG Generation with Text-Guided Autoregressive Transformers
Zehao Chen, Rong Pan
RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
Fanhu Zeng, Haiyang Guo, Fei Zhu et al.
Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy
Jie Ren, Zhenwei Dai, Xianfeng Tang et al.
Training-Free Constrained Generation With Stable Diffusion Models
Stefano Zampini, Jacob K Christopher, Luca Oneto et al.
Evaluating Neuron Explanations: A Unified Framework with Sanity Checks
Tuomas Oikarinen, Ge Yan, Lily Weng
CausalRivers - Scaling up benchmarking of causal discovery for real-world time-series
Gideon Stein, Maha Shadaydeh, Jan Blunk et al.
Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development
Daoyuan Chen, Haibin Wang, Yilun Huang et al.
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer
Pierre-David Letourneau, Manish Singh, Hsin-Pai Cheng et al.
Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation
Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
Yuying Ge, Yizhuo Li, Yixiao Ge et al.
Enhancing Privacy-Utility Trade-offs to Mitigate Memorization in Diffusion Models
Chen Chen, Daochang Liu, Mubarak Shah et al.
Multi-Perspective Data Augmentation for Few-shot Object Detection
Anh-Khoa Nguyen Vu, Quoc Truong Truong, Vinh-Tiep Nguyen et al.
Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing
Yudong Liu, Jingwei Sun, Yueqian Lin et al.
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.
ARIG: Autoregressive Interactive Head Generation for Real-time Conversations
Ying Guo, Xi Liu, Cheng Zhen et al.
Robustness Auditing for Linear Regression: To Singularity and Beyond
Ittai Rubinstein, Samuel Hopkins
CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension
Rui Li, Zeyu Zhang, Xiaohe Bo et al.
NOVA: A Benchmark for Rare Anomaly Localization and Clinical Reasoning in Brain MRI
Cosmin Bercea, Jun Li, Philipp Raffler et al.
Scene Map-based Prompt Tuning for Navigation Instruction Generation
Sheng Fan, Rui Liu, Wenguan Wang et al.
Generalizable Sensor-Based Activity Recognition via Categorical Concept Invariant Learning
Di Xiong, Shuoyuan Wang, Lei Zhang et al.
Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity
Artavazd Maranjyan, Alexander Tyurin, Peter Richtarik
Value-Guided Search for Efficient Chain-of-Thought Reasoning
Kaiwen Wang, Jin Zhou, Jonathan Chang et al.
Learning Complex Heterogeneous Multimodal Fake News via Social Latent Network Inference
Mingxin Li, Yuchen Zhang, Haowei Xu et al.
Noisy Label Calibration for Multi-View Classification
Shilin Xu, Yuan Sun, Xingfeng Li et al.
ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints
Divij Handa, Pavel Dolin, Shrinidhi Kumbhar et al.
HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene
Jianing Chen, Zehao Li, Yujun Cai et al.
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
Michal Nauman, Marek Cygan, Carmelo Sferrazza et al.
NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks in Open Domains
Wonje Choi, Jinwoo Park, Sanghyun Ahn et al.
Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval
Yue Wu, Zhaobo Qi, Yiling Wu et al.
Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution
Zhanyi Sun, Shuran Song
CTSyn: A Foundation Model for Cross Tabular Data Generation
Xiaofeng Lin, Chenheng Xu, Matthew Yang et al.
Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos
Rundong Luo, Matthew Wallingford, Ali Farhadi et al.
DAMO: Decoding by Accumulating Activations Momentum for Mitigating Hallucinations in Vision-Language Models
Kaishen Wang, Hengrui Gu, Meijun Gao et al.
Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models
Bingdong Li, Zixiang Di, Yongfan Lu et al.
Detecting Visual Information Manipulation Attacks in Augmented Reality: A Multimodal Semantic Reasoning Approach
Yanming Xiu, Maria Gorlatova
Adaptive Calibration: A Unified Conversion Framework of Spiking Neural Networks
Ziqing Wang, Yuetong Fang, Jiahang Cao et al.
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents
Yaxin Luo, Zhaoyi Li, Jiacheng Liu et al.
Directional Gradient Projection for Robust Fine-Tuning of Foundation Models
Chengyue Huang, Junjiao Tian, Brisa Maneechotesuwan et al.
Solving Robust Markov Decision Processes: Generic, Reliable, Efficient
Tobias Meggendorfer, Maximilian Weininger, Patrick Wienhöft
Robust and Conjugate Spatio-Temporal Gaussian Processes
William Laplante, Matias Altamirano, Andrew Duncan et al.
CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning
Jiangpeng He, Zhihao Duan, Fengqing Zhu
LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields
Zhengqin Li, Dilin Wang, Ka Chen et al.
Fine-structure Preserved Real-world Image Super-resolution via Transfer VAE Training
Qiaosi Yi, Shuai Li, Rongyuan Wu et al.
Lifelong Safety Alignment for Language Models
Haoyu Wang, Yifei Zhao, Zeyu Qin et al.
EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark
Ming Li, Jike Zhong, Tianle Chen et al.
Beyond Verifiable Rewards: Scaling Reinforcement Learning in Language Models to Unverifiable Data
Yunhao Tang, Sid Wang, Lovish Madaan et al.
Integrative Decoding: Improving Factuality via Implicit Self-consistency
Yi Cheng, Xiao Liang, Yeyun Gong et al.
Federated Class-Incremental Learning: A Hybrid Approach Using Latent Exemplars and Data-Free Techniques to Address Local and Global Forgetting
Milad Khademi Nori, Il-Min Kim, Guanghui Wang
Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion
Zexin He, Tengfei Wang, Xin Huang et al.
Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement
Qiyuan Dai, Hanzhuo Huang, Yu Wu et al.
Straight-Line Diffusion Model for Efficient 3D Molecular Generation
Yuyan Ni, Shikun Feng, Haohan Chi et al.
Contextual AD Narration with Interleaved Multimodal Sequence
Hanlin Wang, Zhan Tong, Kecheng Zheng et al.
Generative RLHF-V: Learning Principles from Multi-modal Human Preference
Jiayi Zhou, Jiaming Ji, Boyuan Chen et al.
Circuit Transformer: A Transformer That Preserves Logical Equivalence
Xihan Li, Xing Li, Lei Chen et al.
MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement
Jaehyun Nam, Jinsung Yoon, Jiefeng Chen et al.
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images
Yuze He, Yanning Zhou, Wang Zhao et al.
PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling
Junchao Gong, Siwei Tu, Weidong Yang et al.
CrossOver: 3D Scene Cross-Modal Alignment
Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys et al.
Don't Just Chase “Highlighted Tokens” in MLLMs: Revisiting Visual Holistic Context Retention
Xin Zou, Di Lu, Yizhou Wang et al.
AnoLLM: Large Language Models for Tabular Anomaly Detection
Che-Ping Tsai, Ganyu Teng, Phillip Wallis et al.
A Simple yet Effective Layout Token in Large Language Models for Document Understanding
Zhaoqing Zhu, Chuwei Luo, Zirui Shao et al.
GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks
Haoqiang Kang, Enna Sachdeva, Piyush Gupta et al.
TR-PTS: Task-Relevant Parameter and Token Selection for Efficient Tuning
Siqi Luo, Haoran Yang, Yi Xin et al.
SkySense V2: A Unified Foundation Model for Multi-modal Remote Sensing
Yingying Zhang, Lixiang Ru, Kang Wu et al.
Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification
Yanghao Wang, Long Chen
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
Xinghui Li, Qichao Sun, Pengze Zhang et al.
SnapMoGen: Human Motion Generation from Expressive Texts
Chuan Guo, Inwoo Hwang, Jian Wang et al.
Mitra: Mixed Synthetic Priors for Enhancing Tabular Foundation Models
Xiyuan Zhang, Danielle Maddix Robinson, Junming Yin et al.
BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models
Xingyu Zheng, Xianglong Liu, Haotong Qin et al.
Neural Context Flows for Meta-Learning of Dynamical Systems
Roussel Desmond Nzoyem, David Barton, Tom Deakin
ConTextTab: A Semantics-Aware Tabular In-Context Learner
Marco Spinaci, Marek Polewczyk, Maximilian Schambach et al.
OSCAR: One-Step Diffusion Codec Across Multiple Bit-rates
Jinpei Guo, Yifei Ji, Zheng Chen et al.
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Shengjun Zhang, Jinzhao Li, Xin Fei et al.
Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics
Lee Chae-Yeon, Oh Hyun-Bin, Han EunGi et al.
Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference
Nadav Timor, Jonathan Mamou, Daniel Korat et al.
Dynamic-Width Speculative Beam Decoding for LLM Inference
Zongyue Qin, Zifan He, Neha Prakriya et al.
A Comprehensive Evaluation on Event Reasoning of Large Language Models
Zhengwei Tao, Zhi Jin, Yifan Zhang et al.
Adaptive Draft-Verification for Efficient Large Language Model Decoding
Xukun Liu, Bowen Lei, Ruqi Zhang et al.
ChatHuman: Chatting about 3D Humans with Tools
Jing Lin, Yao Feng, Weiyang Liu et al.
ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting
Chengyou Jia, Changliang Xia, Zhuohang Dang et al.
PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing
Feng Tian, Yixuan Li, Yichao Yan et al.
OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
Jinyang Li, En Yu, Sijia Chen et al.
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents
Nandan Thakur, Jimmy Lin, Samuel Havens et al.
Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning
Jian Liu, Jing Xu, Song Guo et al.
DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing
Chenxi Xie, Minghan Li, Shuai Li et al.
Driving by the Rules: A Benchmark for Integrating Traffic Sign Regulations into Vectorized HD Map
Xinyuan Chang, Maixuan Xue, Xinran Liu et al.
IPDreamer: Appearance-Controllable 3D Object Generation with Complex Image Prompts
Bohan Zeng, Shanglin Li, Yutang Feng et al.
SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering
Xiaopeng Li, Shasha Li, Shezheng Song et al.
Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs
Shuo Li, Tao Ji, Xiaoran Fan et al.
SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes
Yuji Wang, Haoran Xu, Yong Liu et al.
Extrapolated Urban View Synthesis Benchmark
Xiangyu Han, Zhen Jia, Boyi Li et al.
DataRater: Meta-Learned Dataset Curation
Dan Andrei Calian, Greg Farquhar, Iurii Kemaev et al.
Fast Think-on-Graph: Wider, Deeper and Faster Reasoning of Large Language Model on Knowledge Graph
Xujian Liang, Zhaoquan Gu
Layerwise Recurrent Router for Mixture-of-Experts
Zihan Qiu, Zeyu Huang, Shuang Cheng et al.
Dynamic Typography: Bringing Text to Life via Video Diffusion Prior
Zichen Liu, Yihao Meng, Hao Ouyang et al.
Zero-Shot Novel View and Depth Synthesis with Multi-View Geometric Diffusion
Vitor Guizilini, Muhammad Zubair Irshad, Dian Chen et al.
LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living
Dominick Reilly, Rajatsubhra Chakraborty, Arkaprava Sinha et al.
Hyperbolic Dataset Distillation
Wenyuan Li, Guang Li, Keisuke Maeda et al.
Universal generalization guarantees for Wasserstein distributionally robust models
Tam Le, Jerome Malick
Distilling Structured Rationale from Large Language Models to Small Language Models for Abstractive Summarization
Linyong Wang, Lianwei Wu, Shaoqi Song et al.
FlowR: Flowing from Sparse to Dense 3D Reconstructions
Tobias Fischer, Samuel Rota Bulò, Yung-Hsu Yang et al.
Efficient Active Imitation Learning with Random Network Distillation
Emilien Biré, Anthony Kobanda, Ludovic Denoyer et al.
GaussRender: Learning 3D Occupancy with Gaussian Rendering
Loick Chambon, Eloi Zablocki, Alexandre Boulch et al.
Extrapolation by Association: Length Generalization Transfer In Transformers
Ziyang Cai, Nayoung Lee, Avi Schwarzschild et al.
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
Peihao Wang, Ruisi Cai, Yuehao Wang et al.