Most Cited NeurIPS "adversarial agent" Papers
V2X-Radar: A Multi-modal Dataset with 4D Radar for Cooperative Perception
Lei Yang, Xinyu Zhang, Jun Li et al.
Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge
Boyu Gou, Zanming Huang, Yuting Ning et al.
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
Songhua Liu, Zhenxiong Tan, Xinchao Wang
HyperGraphRAG: Retrieval-Augmented Generation via Hypergraph-Structured Knowledge Representation
Haoran Luo, Haihong E, Guanting Chen et al.
Efficiently Scaling LLM Reasoning Programs with Certaindex
Yichao Fu, Junda Chen, Siqi Zhu et al.
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction
Siyu Jiao, Gengwei Zhang, Yinlong Qian et al.
TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets
Yuzhe Yang, Yifei Zhang, Minghao Wu et al.
SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data
Wenkai Fang, Shunyu Liu, Yang Zhou et al.
REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites
Div Garg, Diego Caples, Andis Draguns et al.
Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains
Wenhui Tan, Jiaze Li, Jianzhong Ju et al.
On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity
Quentin Bertrand, Anne Gagneux, Mathurin Massias et al.
Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models
Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy et al.
Mechanism Design for LLM Fine-tuning with Multiple Reward Models
Haoran Sun, Yurong Chen, Siwei Wang et al.
Self-Adapting Language Models
Adam Zweiger, Jyo Pari, Han Guo et al.
Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Shizhe Diao, Yu Yang, Yonggan Fu et al.
Diversity-Aware Policy Optimization for Large Language Model Reasoning
Jian Yao, Ran Cheng, Xingyu Wu et al.
UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions
Xue Zhucun, Jiangning Zhang, Teng Hu et al.
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
Yicheng Xiao, Lin Song, Yukang Chen et al.
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
Hao Liang, Zhiquan Luo
Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search
Yuichi Inoue, Kou Misaki, Yuki Imajuku et al.
MoonCast: High-Quality Zero-Shot Podcast Generation
Zeqian Ju, Dongchao Yang, Shen Kai et al.
Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)
Zhenjie Yang, Xiaosong Jia, Qifeng Li et al.
SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
Yang Liu, Ming Ma, Xiaomin Yu et al.
Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs
Qizhe Zhang, Mengzhen Liu, Lichen Li et al.
SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning
Wanjia Zhao, Mert Yuksekgonul, Shirley Wu et al.
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
Thomas Kuntz, Agatha Duzan, Hao Zhao et al.
VLM-R³: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
Chaoya Jiang, Yongrui Heng, Wei Ye et al.
Merging on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
Anke Tang, Enneng Yang, Li Shen et al.
HoliTom: Holistic Token Merging for Fast Video Large Language Models
Kele Shao, Keda Tao, Can Qin et al.
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
Uladzislau Sobal, Wancong Zhang, Kyunghyun Cho et al.
Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding
Weiyu Guo, Ziyang Chen, Shaoguang Wang et al.
Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding
Xiaoyi Zhang, Zhaoyang Jia, Zongyu Guo et al.
Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations
Li Hao, He Cao, Bin Feng et al.
MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
Xiaohu Huang, Jingjing Wu, Qunyi Xie et al.
GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments
Enjun Du, Xunkai Li, Tian Jin et al.
Learning 3D Persistent Embodied World Models
Siyuan Zhou, Yilun Du, Yuncong Yang et al.
Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods
Oussama Zekri, Nicolas Boulle
Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families
Felipe Maia Polo, Seamus Somerstep, Leshem Choshen et al.
GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images
Xiang Lan, Feng Wu, Kai He et al.
VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model
Zuwei Long, Yunhang Shen, Chaoyou Fu et al.
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Yue Liu, Shengfang Zhai, Mingzhe Du et al.
xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories
Maurice Kraus, Felix Divo, Devendra Singh Dhami et al.
SensorLM: Learning the Language of Wearable Sensors
Yuwei Zhang, Kumar Ayush, Siyuan Qiao et al.
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
Leqi Shen, Guoqiang Gong, Tao He et al.
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch
Zimu Lu, Yunqiao Yang, Houxing Ren et al.
VeriThinker: Learning to Verify Makes Reasoning Model Efficient
Zigeng Chen, Xinyin Ma, Gongfan Fang et al.
Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models
Matvei Popov, Peter Robicheaux, Anish Madan et al.
BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning
Jianyang Gu, Sam Stevens, Elizabeth Campolongo et al.
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Xinyu Yang, Yuwei An, Hongyi Liu et al.
ThinkSound: Chain-of-Thought Reasoning in Multimodal LLMs for Audio Generation and Editing
Huadai Liu, Kaicheng Luo, Jialei Wang et al.
Exploring the Limits of Vision-Language-Action Manipulation in Cross-task Generalization
Jiaming Zhou, Ke Ye, Jiayi Liu et al.
Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning
Kai Jiang, Zhengyan Shi, Dell Zhang et al.
Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces II: non-compact symmetric spaces
Iskander Azangulov, Andrei Smolensky, Alexander Terenin et al.
Efficient Part-level 3D Object Generation via Dual Volume Packing
Jiaxiang Tang, Ruijie Lu, Max Li et al.
Memory Injection Attacks on LLM Agents via Query-Only Interaction
Shen Dong, Shaochen Xu, Pengfei He et al.
Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models
Sagnik Mukherjee, Lifan Yuan, Dilek Hakkani-Tur et al.
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning
Yuhao Zhou, Yiheng Wang, Xuming He et al.
AutoPartGen: Autoregressive 3D Part Generation and Discovery
Minghao Chen, Jianyuan Wang, Roman Shapovalov et al.
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu, Qinghao Hu, Haocheng Xi et al.
RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers
Yan Gong, Yiren Song, Yicheng Li et al.
AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration
Andy Zhou, Kevin Wu, Francesco Pinto et al.
OpenVLThinker: Complex Vision-Language Reasoning via Iterative SFT-RL Cycles
Yihe Deng, Hritik Bansal, Fan Yin et al.
AGENTIF: Benchmarking Large Language Models Instruction Following Ability in Agentic Scenarios
Yunjia Qi, Hao Peng, Xiaozhi Wang et al.
MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation
Ning Li, Xiangmou Qu, Jiamu Zhou et al.
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou et al.
Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning
Jaehun Jung, Seungju Han, Ximing Lu et al.
RoboScape: Physics-informed Embodied World Model
Yu Shang, Xin Zhang, Yinzhou Tang et al.
Is Artificial Intelligence Generated Image Detection a Solved Problem?
Ziqiang Li, Jiazhen Yan, Ziwen He et al.
AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling
Zhining Zhang, Chuanyang Jin, Mung Yao Jia et al.
Training-Free Efficient Video Generation via Dynamic Token Carving
Yuechen Zhang, Jinbo Xing, Bin Xia et al.
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
Yiran Guo, Lijie Xu, Jie Liu et al.
SAFE: Multitask Failure Detection for Vision-Language-Action Models
Qiao Gu, Yuanliang Ju, Shengxiang Sun et al.
LeVo: High-Quality Song Generation with Multi-Preference Alignment
Shun Lei, Yaoxun Xu, Zhiwei Lin et al.
AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench
Edan Toledo, Karen Hambardzumyan, Martin Josifoski et al.
Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable
Ruoxin Chen, Junwei Xi, Zhiyuan Yan et al.
Horizon Reduction Makes RL Scalable
Seohong Park, Kevin Frans, Deepinder Mann et al.
The VLLM Safety Paradox: Dual Ease in Jailbreak Attack and Defense
Yangyang Guo, Fangkai Jiao, Liqiang Nie et al.
DINO-Foresight: Looking into the Future with DINO
Efstathios Karypidis, Ioannis Kakogeorgiou, Spyridon Gidaris et al.
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
Xiaoyuan Liu, Tian Liang, Zhiwei He et al.
AgentAuditor: Human-level Safety and Security Evaluation for LLM Agents
Hanjun Luo, Shenyu Dai, Chiming Ni et al.
ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning
Shulin Huang, Linyi Yang, Yan Song et al.
Pseudo-Labeling for Kernel Ridge Regression under Covariate Shift
Kaizheng Wang
Joint Velocity-Growth Flow Matching for Single-Cell Dynamics Modeling
Dongyi Wang, Yuanwei Jiang, Zhenyi Zhang et al.
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Jiatao Gu, Tianrong Chen, David Berthelot et al.
Universal Cross-Tokenizer Distillation via Approximate Likelihood Matching
Benjamin Minixhofer, Ivan Vulić, Edoardo Maria Ponti
Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints
Utkarsh Utkarsh, Pengfei Cai, Alan Edelman et al.
Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models?
Hyeong Kyu Choi, Jerry Zhu, Sharon Li
Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
Bo Wang, Qinyuan Cheng, Runyu Peng et al.
QiMeng-CodeV-R1: Reasoning-Enhanced Verilog Generation
Yaoyu Zhu, Di Huang, Hanqi Lyu et al.
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks
Hongyuan Tao, Ying Zhang, Zhenhao Tang et al.
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao, Chenlu Ye, Quanquan Gu et al.
Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning
Yana Wei, Liang Zhao, Jianjian Sun et al.
ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference
Xiang Liu, Zhenheng Tang, Peijie Dong et al.
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning
Xiao Liang, Zhong-Zhi Li, Yeyun Gong et al.
OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics
Vineeth Dorna, Anmol Mekala, Wenlong Zhao et al.
OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers
Ziqiao Peng, Jiwen Liu, Haoxian Zhang et al.
4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos
Zhen Xu, Zhengqin Li, Zhao Dong et al.
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
Hao Tang, Chen-Wei Xie, Haiyang Wang et al.
Inference-Time Hyper-Scaling with KV Cache Compression
Adrian Łańcucki, Konrad Staniszewski, Piotr Nawrot et al.
REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints
Di Wu, Liu Liu, Zhou Linli et al.
NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering
Zhihao Huang, Xi Qiu, Yukuo Ma et al.
Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning
Jiyuan Shi, Xinzhe Liu, Dewei Wang et al.
Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging
Jinluan Yang, Dingnan Jin, Anke Tang et al.
MindJourney: Test-Time Scaling with World Models for Spatial Reasoning
Yuncong Yang, Jiageng Liu, Zheyuan Zhang et al.
Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren, Eshaan Nichani, Denny Wu et al.
PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference
Jiarui Fang, Jinzhe Pan, Aoyu Li et al.
Conformal Prediction for Causal Effects of Continuous Treatments
Maresa Schröder, Dennis Frauen, Jonas Schweisthal et al.
Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code
Augusto B. Corrêa, André G. Pereira, Jendrik Seipp
ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data
Xiaoyang Liu, Kangjie Bao, Jiashuo Zhang et al.
Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization
Tao Zhang, Cheng Da, Kun Ding et al.
MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants
Zeyu Zhang, Quanyu Dai, Luyu Chen et al.
ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning
Tonghe Zhang, Chao Yu, Sichang Su et al.
Detecting High-Stakes Interactions with Activation Probes
Alex McKenzie, Urja Pawar, Phil Blandfort et al.
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Zongxia Li, Xiyang Wu, Guangyao Shi et al.
Let LRMs Break Free from Overthinking via Self-Braking Tuning
Haoran Zhao, Yuchen Yan, Yongliang Shen et al.
UFM: A Simple Path towards Unified Dense Correspondence with Flow
Yuchen Zhang, Nikhil Keetha, Chenwei Lyu et al.
EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code
Yuhao Qing, Boyu Zhu, Mingzhe Du et al.
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang
Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards
Charles Arnal, Gaëtan Narozniak, Vivien Cabannes et al.
VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception
Ziang Yan, Yinan He, Xinhao Li et al.
Force Prompting: Video Generation Models Can Learn And Generalize Physics-based Control Signals
Nate Gillman, Charles Herrmann, Michael Freeman et al.
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation is Wasteful
Martin Marek, Sanae Lotfi, Aditya Somasundaram et al.
Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations
Ji-An Li, Huadong Xiong, Robert Wilson et al.
MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks
Yinghao Zhu, Ziyi He, Haoran Hu et al.
Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties
Gouki Minegishi, Hiroki Furuta, Takeshi Kojima et al.
Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding
Yixiong Fang, Ziran Yang, Zhaorun Chen et al.
Vision Transformers Don't Need Trained Registers
Nicholas Jiang, Amil Dravid, Alexei Efros et al.
KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction
Jang-Hyun Kim, Jinuk Kim, Sangwoo Kwon et al.
Ambient Diffusion Omni: Training Good Models with Bad Data
Giannis Daras, Adrian Rodriguez-Munoz, Adam Klivans et al.
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
Xiaojun Jia, Sensen Gao, Simeng Qin et al.
Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation
Jiyuan Wang, Chunyu Lin, Cheng Guan et al.
Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning
Minheng Ni, Zhengyuan Yang, Linjie Li et al.
Puppeteer: Rig and Animate Your 3D Models
Chaoyue Song, Xiu Li, Fan Yang et al.
PaTH Attention: Position Encoding via Accumulating Householder Transformations
Songlin Yang, Yikang Shen, Kaiyue Wen et al.
Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment
Weixiang Zhao, Xingyu Sui, Yulin Hu et al.
GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs
Advik Basani, Xiao Zhang
In Search of Adam’s Secret Sauce
Antonio Orvieto, Robert Gower
Backdoor Cleaning without External Guidance in MLLM Fine-tuning
Xuankun Rong, Wenke Huang, Jian Liang et al.
Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems
Shangbin Feng, Zifeng Wang, Palash Goyal et al.
InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
Minsoo Kim, Kyuhong Shim, Jungwook Choi et al.
Accelerating RL for LLM Reasoning with Optimal Advantage Regression
Kianté Brantley, Mingyu Chen, Zhaolin Gao et al.
Nested Learning: The Illusion of Deep Learning Architectures
Ali Behrouz, Meisam Razaviyayn, Peilin Zhong et al.
BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model
Adibvafa Fallahpour, Andrew Magnuson, Purav Gupta et al.
Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
Pengxiang Li, Shilin Yan, Jiayin Cai et al.
Linguini: A benchmark for language-agnostic linguistic reasoning
Eduardo Sánchez, Belen Alastruey, Christophe Ropers et al.
Searching Latent Program Spaces
Matthew Macfarlane, Clem Bonnet
Establishing Best Practices in Building Rigorous Agentic Benchmarks
Yuxuan Zhu, Tengjun Jin, Yada Pruksachatkun et al.
CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring
Benjamin Arnav, Pablo Bernabeu-Perez, Nathan Helm-Burger et al.
Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward
Yanming Wan, Jiaxing Wu, Marwa Abdulhai et al.
Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
Kongcheng Zhang, Qi Yao, Shunyu Liu et al.
TabDPT: Scaling Tabular Foundation Models on Real Data
Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh et al.
Imagine360: Immersive 360 Video Generation from Perspective Anchor
Jing Tan, Shuai Yang, Tong Wu et al.
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary Charles, Gabriel Teston, Lucio Dery et al.
Bag of Tricks for Inference-time Computation of LLM Reasoning
Fan Liu, Wen-Shuo Chao, Naiqiang Tan et al.
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
Yibo Wang, Tiansheng Huang, Li Shen et al.
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
Hao Zhong, Muzhi Zhu, Zongze Du et al.
NAVIX: Scaling MiniGrid Environments with JAX
Eduardo Pignatelli, Jarek Liesen, Robert Lange et al.
ReSim: Reliable World Simulation for Autonomous Driving
Jiazhi Yang, Kashyap Chitta, Shenyuan Gao et al.
CoRe: Benchmarking LLMs’ Code Reasoning Capabilities through Static Analysis Tasks
Danning Xie, Mingwei Zheng, Xuwei Liu et al.
Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting
Nan Wang, Lixing Xiao, Yuantao Chen et al.
SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks
Hwiwon Lee, Ziqi Zhang, Hanxiao Lu et al.
GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution
Fengxiang Wang, Mingshuo Chen, Yueying Li et al.
Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
Liwei Jiang, Yuanjun Chai, Margaret Li et al.
TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster
Kanghui Ning, Zijie Pan, Yu Liu et al.
Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper
Xinyue Zhu, Binghao Huang, Yunzhu Li
KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning
Wei Sun, Wen Yang, Pu Jian et al.
AI-Researcher: Autonomous Scientific Innovation
Jiabin Tang, Lianghao Xia, Zhonghang Li et al.
PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
Chen Wang, Chuhao Chen, Yiming Huang et al.
KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments
Junyoung Park, Dalton Jones, Matthew Morse et al.
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Minki Kang, Jongwon Jeong, Seanie Lee et al.
Large language models can learn and generalize steganographic chain-of-thought under process supervision
Robert McCarthy, Joey Skaf, Luis Ibanez-Lissen et al.
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
Chaofan Lin, Jiaming Tang, Shuo Yang et al.
DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response
Junjue Wang, Weihao Xuan, Heli Qi et al.
Multi-step Visual Reasoning with Visual Tokens Scaling and Verification
Tianyi Bai, Zengjie Hu, Fupeng Sun et al.
Don’t Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models
Sohyun An, Ruochen Wang, Tianyi Zhou et al.
Hyperbolic Fine-Tuning for Large Language Models
Menglin Yang, Ram Samarth B B, Aosong Feng et al.
Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection
Yue Zhou, Xinan He, Kaiqing Lin et al.
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
Jiarui Yao, Yifan Hao, Hanning Zhang et al.
Preference Optimization on Pareto Sets: On a Theory of Multi-Objective Optimization
Abhishek Roy, Geelon So, Yian Ma
This Time is Different: An Observability Perspective on Time Series Foundation Models
Ben Cohen, Emaad Khwaja, Youssef Doubli et al.
Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling
Yitian Chen, Jingfan Xia, Siyu Shao et al.
BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization
Xueyang Zhou, Guiyao Tie, Guowen Zhang et al.
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
Ziyi Wu, Anil Kag, Ivan Skorokhodov et al.
Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens
Xixian Yong, Xiao Zhou, Yingying Zhang et al.
RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics
Jie Zhang, Cezara Petrui, Kristina Nikolić et al.
Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations
Yiyou Sun, Yu Gai, Lijie Chen et al.
Rewind-to-Delete: Certified Machine Unlearning for Nonconvex Functions
Siqiao Mu, Diego Klabjan
Bayesian Concept Bottleneck Models with LLM Priors
Jean Feng, Avni Kothari, Lucas Zier et al.
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Sean McLeish, John Kirchenbauer, David Miller et al.
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Jae-Won Chung, Jeff J. Ma, Ruofan Wu et al.
Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning
Qitao Tan, Jun Liu, Zheng Zhan et al.
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents
Hao Li, Xiaogeng Liu, Chiu Chun et al.
MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents
Lukas Aichberger, Alasdair Paren, Guohao Li et al.
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
Wenhao Wang, Yi Yang
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
Nikhil Kandpal, Brian Lester, Colin Raffel et al.
Exploring the limits of strong membership inference attacks on large language models
Jamie Hayes, I Shumailov, Christopher A. Choquette-Choo et al.
Data-Driven Performance Guarantees for Classical and Learned Optimizers
Rajiv Sambharya, Bartolomeo Stellato
Unleashing Hour-Scale Video Training for Long Video-Language Understanding
Jingyang Lin, Jialian Wu, Ximeng Sun et al.
ConfTuner: Training Large Language Models to Express Their Confidence Verbally
Yibo Li, Miao Xiong, Jiaying Wu et al.