Most Cited 2025 "greedy impurity-reduction" Papers
22,274 papers found • Page 12 of 112
Conference
LeanAgent: Lifelong Learning for Formal Theorem Proving
Adarsh Kumarappan, Mohit Tiwari, Peiyang Song et al.
Accelerating RL for LLM Reasoning with Optimal Advantage Regression
Kianté Brantley, Mingyu Chen, Zhaolin Gao et al.
METASCENES: Towards Automated Replica Creation for Real-world 3D Scans
Huangyue Yu, Baoxiong Jia, Yixin Chen et al.
CoRA: Collaborative Information Perception by Large Language Model’s Weights for Recommendation
Yuting Liu, Jinghao Zhang, Yizhou Dang et al.
Imputation for prediction: beware of diminishing returns.
Marine Le Morvan, Gael Varoquaux
Stable Segment Anything Model
Qi Fan, Xin Tao, Lei Ke et al.
GeoBEV: Learning Geometric BEV Representation for Multi-view 3D Object Detection
Jinqing Zhang, Yanan Zhang, Yunlong Qi et al.
MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation
Yukang Lin, Hokit Fung, Jianjin Xu et al.
Yuan: Yielding Unblemished Aesthetics Through a Unified Network for Visual Imperfections Removal in Generated Images
Zhenyu Yu, Chee Seng Chan
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
Junwei Luo, Yingying Zhang, Xue Yang et al.
VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation
Saksham Singh Kushwaha, Yapeng Tian
LUDVIG: Learning-Free Uplifting of 2D Visual Features to Gaussian Splatting Scenes
Juliette Marrie, Romain Menegaux, Michael Arbel et al.
LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
Tianyi Zhang, Anshumali Shrivastava
Galileo: Learning Global & Local Features of Many Remote Sensing Modalities
Gabriel Tseng, Anthony Fuller, Marlena Reil et al.
Puppeteer: Rig and Animate Your 3D Models
Chaoyue Song, Xiu Li, Fan Yang et al.
SymmCompletion: High-Fidelity and High-Consistency Point Cloud Completion with Symmetry Guidance
Hongyu Yan, Zijun Li, Kunming Luo et al.
PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing
Peng Li, Wangguandong Zheng, Yuan Liu et al.
Searching Latent Program Spaces
Matthew Macfarlane, Clem Bonnet
MAP: Multi-Human-Value Alignment Palette
Xinran Wang, Qi Le, Ammar Ahmed et al.
Generative Classifiers Avoid Shortcut Solutions
Alexander Li, Ananya Kumar, Deepak Pathak
CROW: Eliminating Backdoors from Large Language Models via Internal Consistency Regularization
Nay Myat Min, Long H. Pham, Yige Li et al.
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
Xingrui Wang, Wufei Ma, Angtian Wang et al.
Image Generation Diversity Issues and How to Tame Them
Mischa Dombrowski, Weitong Zhang, Hadrien Reynaud et al.
OLiDM: Object-aware LiDAR Diffusion Models for Autonomous Driving
Tianyi Yan, Junbo Yin, Xianpeng Lang et al.
Vision Transformers Don't Need Trained Registers
Nicholas Jiang, Amil Dravid, Alexei Efros et al.
Linguini: A benchmark for language-agnostic linguistic reasoning
Eduardo Sánchez, Belen Alastruey, Christophe Ropers et al.
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
Hao Zhong, Muzhi Zhu, Zongze Du et al.
Weighted-Reward Preference Optimization for Implicit Model Fusion
Ziyi Yang, Fanqi Wan, Longguang Zhong et al.
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
Tsu-Jui Fu, Yusu Qian, Chen Chen et al.
Focus on Local: Finding Reliable Discriminative Regions for Visual Place Recognition
Changwei Wang, Shunpeng Chen, Yukun Song et al.
CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring
Benjamin Arnav, Pablo Bernabeu-Perez, Nathan Helm-Burger et al.
Formation of Representations in Neural Networks
Liu Ziyin, Isaac Chuang, Tomer Galanti et al.
Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning
Tim Lenz, Peter Neidlinger, Marta Ligero et al.
MDP3: A Training-free Approach for List-wise Frame Selection in Video-LLMs
Hui Sun, Shiyin Lu, Huanyu Wang et al.
Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers
Andrew Luo, Jacob Yeung, Rushikesh Zawar et al.
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses
Yatian Pang, Bin Zhu, Bin Lin et al.
Defeasible Visual Entailment: Benchmark, Evaluator, and Reward-Driven Optimization
Yue Zhang, Liqiang Jing, Vibhav Gogate
Patch-wise Structural Loss for Time Series Forecasting
Dilfira Kudrat, Zongxia Xie, Yanru Sun et al.
Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing
Ruiyi Wang, Yushuo Zheng, Zicheng Zhang et al.
UniMuMo: Unified Text, Music, and Motion Generation
Han Yang, Kun Su, Yutong Zhang et al.
Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models
Amir Mohammad Karimi Mamaghan, Samuele Papa, Karl H. Johansson et al.
Coreset Selection via Reducible Loss in Continual Learning
Ruilin Tong, Yuhang Liu, Javen Qinfeng Shi et al.
InstantSplamp: Fast and Generalizable Stenography Framework for Generative Gaussian Splatting
Chenxin Li, Hengyu Liu, Zhiwen Fan et al.
Accelerating Large Language Model Reasoning via Speculative Search
Zhihai Wang, Jie Wang, Jilai Pan et al.
Generative Planning with 3D-Vision Language Pre-training for End-to-End Autonomous Driving
Tengpeng Li, Hanli Wang, Xianfei Li et al.
DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation
Zhixuan Liang, Yao Mu, Yixiao Wang et al.
SmartEraser: Remove Anything from Images using Masked-Region Guidance
Longtao Jiang, Zhendong Wang, Jianmin Bao et al.
RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
Mingfei Han, Liang Ma, Kamila Zhumakhanova et al.
The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise
Yuanhao Ban, Ruochen Wang, Tianyi Zhou et al.
Learning Molecular Representation in a Cell
Gang Liu, Srijit Seal, John Arevalo et al.
ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning
Zhenyang Liu, Yikai Wang, Sixiao Zheng et al.
CyberPal.AI: Empowering LLMs with Expert-Driven Cybersecurity Instructions
Matan Levi, Yair Allouche, Daniel Ohayon et al.
Mobile Video Diffusion
Haitam Ben Yahia, Denis Korzhenkov, Ioannis Lelekas et al.
Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment
Weixiang Zhao, Xingyu Sui, Yulin Hu et al.
Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning
Tianci Liu, Ruirui Li, Yunzhe Qi et al.
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model
Chaochen Gao, Xing W, Qi Fu et al.
Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection
Jiawen Zhu, YEW-SOON ONG, Chunhua Shen et al.
Rethinking Training for De-biasing Text-to-Image Generation: Unlocking the Potential of Stable Diffusion
Eunji Kim, Siwon Kim, Minjun Park et al.
An OpenMind for 3D Medical Vision Self-supervised Learning
Tassilo Wald, Constantin Ulrich, Jonathan Suprijadi et al.
STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing from Text-to-Image Diffusion Models
Koushik Srivatsan, Fahad Shamshad, Muzammal Naseer et al.
Rethinking Spiking Neural Networks from an Ensemble Learning Perspective
Yongqi Ding, Lin Zuo, Mengmeng Jing et al.
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
Ziyan Guo, Zeyu HU, Na Zhao et al.
Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning
Minheng Ni, Zhengyuan Yang, Linjie Li et al.
SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing
Seokhyeon Hong, Chaelin Kim, Serin Yoon et al.
ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing
Yulin Pan, Xiangteng He, Chaojie Mao et al.
Mixture of Attentions For Speculative Decoding
Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras et al.
PaTH Attention: Position Encoding via Accumulating Householder Transformations
Songlin Yang, Yikang Shen, Kaiyue Wen et al.
RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models
Yijing Lin, Mengqi Huang, Shuhan Zhuang et al.
Uncertainty-guided Perturbation for Image Super-Resolution Diffusion Model
Leheng Zhang, Weiyi You, Kexuan Shi et al.
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
Makoto Shing, Kou Misaki, Han Bao et al.
Generalized Principal-Agent Problem with a Learning Agent
Tao Lin, Yiling Chen
Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors
Haiyu Wu, Jaskirat Singh, Sicong Tian et al.
Fully-inductive Node Classification on Arbitrary Graphs
Jianan Zhao, Zhaocheng Zhu, Mikhail Galkin et al.
NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models
Sung-Yeon Park, Can Cui, Yunsheng Ma et al.
Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations
Shaocong Ma, Heng Huang
SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent Collaboration
Jipeng Cen, Jiaxin Liu, Zhixu Li et al.
Rethinking Diffusion Posterior Sampling: From Conditional Score Estimator to Maximizing a Posterior
Tongda Xu, Xiyan Cai, Xinjie Zhang et al.
Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction
Weirong Chen, Ganlin Zhang, Felix Wimbauer et al.
GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting
Junzhe Jiang, Chun Gu, Yurui Chen et al.
LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos
Chin-Yang Lin, Cheng Sun, Fu-En Yang et al.
TabDPT: Scaling Tabular Foundation Models on Real Data
Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh et al.
Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs
Jeongseok Hyun, Sukjun Hwang, Su Ho Han et al.
Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters
Kevin Li, Sachin Goyal, João D Semedo et al.
Consistent Flow Distillation for Text-to-3D Generation
runjie yan, Yinbo Chen, Xiaolong Wang
Post-hoc Reward Calibration: A Case Study on Length Bias
Zeyu Huang, Zihan Qiu, zili wang et al.
GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs
Advik Basani, Xiao Zhang
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
Yibo Wang, Tiansheng Huang, Li Shen et al.
RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data
Maxwell Xu, Jaya Narain, Gregory Darnell et al.
CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models
David Dai, Peilin Chen, Malinda Lu et al.
Ambient Diffusion Omni: Training Good Models with Bad Data
Giannis Daras, Adrian Rodriguez-Munoz, Adam Klivans et al.
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design
Masatoshi Uehara, su, Yulai Zhao et al.
Lie Algebra Canonicalization: Equivariant Neural Operators under arbitrary Lie Groups
Zakhar Shumaylov, Peter Zaika, James Rowbottom et al.
Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts
Minh Le, Chau Nguyen, Huy Nguyen et al.
$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs
Vlad Sobal, Mark Ibrahim, Randall Balestriero et al.
The Illusion of Unlearning: The Unstable Nature of Machine Unlearning in Text-to-Image Diffusion Models
Naveen George, Karthik Nandan Dasaraju, Rutheesh Reddy Chittepu et al.
Backdoor Cleaning without External Guidance in MLLM Fine-tuning
Xuankun Rong, Wenke Huang, Jian Liang et al.
Prototype-Based Image Prompting for Weakly Supervised Histopathological Image Segmentation
Qingchen Tang, Lei Fan, Maurice Pagnucco et al.
Mr. DETR: Instructive Multi-Route Training for Detection Transformers
Chang-Bin Zhang, Yujie Zhong, Kai Han
MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting
Sangwoon Kwak, Joonsoo Kim, Jun Young Jeong et al.
How to Synthesize Text Data without Model Collapse?
Xuekai Zhu, Daixuan Cheng, Hengli Li et al.
Repulsive Latent Score Distillation for Solving Inverse Problems
Nicolas Zilberstein, Morteza Mardani, Santiago Segarra
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors
Fengshuo Bai, Runze Liu, Yali Du et al.
EgoLM: Multi-Modal Language Model of Egocentric Motions
Fangzhou Hong, Vladimir Guzov, Hyo Jin Kim et al.
Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient
Jan Ludziejewski, Maciej Pióro, Jakub Krajewski et al.
HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation
Kun Liu, Qi Liu, Xinchen Liu et al.
DRiVE: Diffusion-based Rigging Empowers Generation of Versatile and Expressive Characters
Mingze Sun, Junting Dong, Junhao Chen et al.
A Unifying Framework for Representation Learning
Shaden Alshammari, John Hershey, Axel Feldmann et al.
SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
Ling Yang, Zhaochen Yu, Tianjun Zhang et al.
AKiRa: Augmentation Kit on Rays for Optical Video Generation
Xi Wang, Robin Courant, Marc Christie et al.
AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning
Yuanfei Wang, Xiaojie Zhang, Ruihai Wu et al.
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLMs
Xinyu Fang, Zhijian Chen, Kai Lan et al.
VisionArena: 230k Real World User-VLM Conversations with Preference Labels
Christopher Chou, Lisa Dunlap, Wei-Lin Chiang et al.
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator
Kaiwen Zheng, Yongxin Chen, Huayu Chen et al.
Completion as Enhancement: A Degradation-Aware Selective Image Guided Network for Depth Completion
Zhiqiang Yan, Zhengxue Wang, Kun Wang et al.
Fine-Grained Evaluation of Large Vision-Language Models in Autonomous Driving
Yue Li, Meng Tian, Zhenyu Lin et al.
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
Guanzheng Chen, Xin Li, Michael Qizhe Shieh et al.
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
Wei Chen, Lin Li, Yongqi Yang et al.
Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation
Qi Lv, Hao Li, Xiang Deng et al.
Bag of Tricks for Inference-time Computation of LLM Reasoning
Fan LIU, Wen-Shuo Chao, Naiqiang Tan et al.
RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning
Yuanhuiyi Lyu, Xu Zheng, Lutao Jiang et al.
Zero-Shot Monocular Scene Flow Estimation in the Wild
Yiqing Liang, Abhishek Badki, Hang Su et al.
CL-Attack: Textual Backdoor Attacks via Cross-Lingual Triggers
Jingyi Zheng, Tianyi Hu, Tianshuo Cong et al.
CLIMB-ReID: A Hybrid CLIP-Mamba Framework for Person Re-Identification
Chenyang Yu, Xuehu Liu, Jiawen Zhu et al.
Multi-Teacher Knowledge Distillation with Reinforcement Learning for Visual Recognition
Chuanguang Yang, XinQiang Yu, Han Yang et al.
SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix
Peng Dai, Feitong Tan, Qiangeng Xu et al.
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
Xiaojun Jia, Sensen Gao, Simeng Qin et al.
Debiased All-in-one Image Restoration with Task Uncertainty Regularization
Gang Wu, Junjun Jiang, Yijun Wang et al.
Learning Transformer-based World Models with Contrastive Predictive Coding
Maxime Burchi, Radu Timofte
DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations
Ziqiao Peng, Yanbo Fan, Haoyu Wu et al.
Imagine360: Immersive 360 Video Generation from Perspective Anchor
Jing Tan, Shuai Yang, Tong Wu et al.
MotionPro: A Precise Motion Controller for Image-to-Video Generation
Zhongwei Zhang, Fuchen Long, Zhaofan Qiu et al.
Detail Matters: Mamba-Inspired Joint Unfolding Network for Snapshot Spectral Compressive Imaging
Mengjie Qin, Yuchao Feng, Zongliang Wu et al.
Active Evaluation Acquisition for Efficient LLM Benchmarking
Yang Li, Jie Ma, Miguel Ballesteros et al.
KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction
Jang-Hyun Kim, Jinuk Kim, Sangwoo Kwon et al.
AWRaCLe: All-Weather Image Restoration Using Visual In-Context Learning
Sudarshan Rajagopalan, Vishal M. Patel
NAVIX: Scaling MiniGrid Environments with JAX
Eduardo Pignatelli, Jarek Liesen, Robert Lange et al.
Trusted Multi-View Classification via Evolutionary Multi-View Fusion
Xinyan Liang, Pinhan Fu, Yuhua Qian et al.
StarGen: A Spatiotemporal Autoregression Framework with Video Diffusion Model for Scalable and Controllable Scene Generation
Shangjin Zhai, Zhichao Ye, Jialin Liu et al.
Nested Learning: The Illusion of Deep Learning Architectures
Ali Behrouz, Meisam Razaviyayn, Peilin Zhong et al.
Flow matching achieves almost minimax optimal convergence
Kenji Fukumizu, Taiji Suzuki, Noboru Isobe et al.
In Search of Adam’s Secret Sauce
Antonio Orvieto, Robert Gower
CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference
Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi
Solving New Tasks by Adapting Internet Video Knowledge
Calvin Luo, Zilai Zeng, Yilun Du et al.
The Power of Context: How Multimodality Improves Image Super-Resolution
Kangfu Mei, Vishal M. Patel, Mojtaba Sahraee-Ardakan et al.
Scaling Vision Pre-Training to 4K Resolution
Baifeng Shi, Boyi Li, Han Cai et al.
EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild
Yumeng Liu, Xiaoxiao Long, Zemin Yang et al.
MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities
Bizhu Wu, Jinheng Xie, Keming Shen et al.
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
Aiwei Liu, Sheng Guan, Yiming Liu et al.
MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model
Chenjie Cao, Chaohui Yu, Shang Liu et al.
Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation
Yuheng Shi, Minjing Dong, Chang Xu
DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response
Junjue Wang, Weihao Xuan, Heli Qi et al.
TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting
Jianchuan Chen, Jingchuan Hu, Gaige Wang et al.
The Computational Complexity of Circuit Discovery for Inner Interpretability
Federico Adolfi, Martina G. Vilas, Todd Wareham
SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning
Zhewei Dai, Shilei Zeng, Haotian Liu et al.
FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers
Renshan Zhang, Rui Shao, Gongwei Chen et al.
Deep Kernel Relative Test for Machine-generated Text Detection
Yiliao Song, Zhenqiao Yuan, Shuhai Zhang et al.
Skill Expansion and Composition in Parameter Space
Tenglong Liu, Jianxiong Li, Yinan Zheng et al.
Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling
Yitian Chen, Jingfan Xia, Siyu Shao et al.
DG-Mamba: Robust and Efficient Dynamic Graph Structure Learning with Selective State Space Models
Haonan Yuan, Qingyun Sun, Zhaonan Wang et al.
Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts
Yun Wang, Longguang Wang, Chenghao Zhang et al.
Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection
Yue Zhou, Xinan He, Kaiqing Lin et al.
Glad: A Streaming Scene Generator for Autonomous Driving
Bin Xie, Yingfei Liu, Tiancai Wang et al.
Neural Sampling from Boltzmann Densities: Fisher-Rao Curves in the Wasserstein Geometry
Jannis Chemseddine, Christian Wald, Richard Duong et al.
Conformal Thresholded Intervals for Efficient Regression
Rui Luo, Zhixin Zhou
Envisioning Class Entity Reasoning by Large Language Models for Few-shot Learning
Mushui Liu, Fangtai Wu, Bozheng Li et al.
SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation
Duc-Hai Pham, Tung Do, Phong Nguyen et al.
OmniCount: Multi-label Object Counting with Semantic-Geometric Priors
Anindya Mondal, Sauradip Nag, Xiatian Zhu et al.
Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation
Eliot Xing, Vernon Luk, Jean Oh
Does Your Vision-Language Model Get Lost in the Long Video Sampling Dilemma?
Tianyuan Qu, Longxiang Tang, Bohao PENG et al.
BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization
Xueyang Zhou, Guiyao Tie, Guowen Zhang et al.
Battling the Non-stationarity in Time Series Forecasting via Test-time Adaptation
HyunGi Kim, Siwon Kim, Jisoo Mok et al.
CoRe: Benchmarking LLMs’ Code Reasoning Capabilities through Static Analysis Tasks
Danning Xie, Mingwei Zheng, Xuwei Liu et al.
UniMLVG: Unified Framework for Multi-view Long Video Generation with Comprehensive Control Capabilities for Autonomous Driving
Rui Chen, Zehuan Wu, Yichen Liu et al.
Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models
Can Demircan, Tankred Saanum, Akshay Jagadish et al.
Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaptation
Anqi Li, Feng Li, Yuxi Liu et al.
SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion
Trong-Tung Nguyen, Quang Nguyen, Khoi Nguyen et al.
BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations
Weixi Feng, Chao Liu, Sifei Liu et al.
ViSpeak: Visual Instruction Feedback in Streaming Videos
Shenghao Fu, Qize Yang, Yuan-Ming Li et al.
LBM: Latent Bridge Matching for Fast Image-to-Image Translation
Clément Chadebec, Onur Tasar, Sanjeev Sreetharan et al.
Learning Few-Step Diffusion Models by Trajectory Distribution Matching
Yihong Luo, Tianyang Hu, Jiacheng Sun et al.
Differentiable Optimization of Similarity Scores Between Models and Brains
Nathan Cloos, Moufan Li, Markus Siegel et al.
LoLCATs: On Low-Rank Linearizing of Large Language Models
Michael Zhang, Simran Arora, Rahul Chalamala et al.
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Guorui Zheng, Xidong Wang, Juhao Liang et al.
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
Mohan Xu, Kai Li, Guo Chen et al.
EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis
Sheng Miao, Jiaxin Huang, Dongfeng Bai et al.
Proxy Denoising for Source-Free Domain Adaptation
Song Tang, Wenxin Su, Yan Gan et al.
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models
Enis Simsar, Thomas Hofmann, Federico Tombari et al.
KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments
Junyoung Park, Dalton Jones, Matthew Morse et al.
Context Steering: Controllable Personalization at Inference Time
Zhiyang He, Sashrika Pandey, Mariah Schrum et al.
ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models
Junzhe Chen, Tianshu Zhang, Shiyu Huang et al.
ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting
Shaofei Cai, Zihao Wang, Kewei Lian et al.
Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Dimitris Oikonomou, Nicolas Loizou
Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models
Zheng Hu, Zhe Li, Ziyun Jiao et al.
Don’t Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models
Sohyun An, Ruochen Wang, Tianyi Zhou et al.
Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages
Zui Chen, Tianqiao Liu, Tongqing et al.
Flash-VStream: Efficient Real-Time Understanding for Long Video Streams
Haoji Zhang, Yiqin Wang, Yansong Tang et al.
Knowledge Localization: Mission Not Accomplished? Enter Query Localization!
Yuheng Chen, Pengfei Cao, Yubo Chen et al.
Node Identifiers: Compact, Discrete Representations for Efficient Graph Learning
Yuankai Luo, Hongkang Li, Qijiong Liu et al.
LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation models
Ziqi Lu, Heng Yang, Danfei Xu et al.
Residual Stream Analysis with Multi-Layer SAEs
Tim Lawson, Lucy Farnik, Conor Houghton et al.