Most Cited 2025 "weight negation" Papers
22,274 papers found • Page 9 of 112
Conference
Feat2GS: Probing Visual Foundation Models with Gaussian Splatting
Yue Chen, Xingyu Chen, Anpei Chen et al.
CAD-Recode: Reverse Engineering CAD Code from Point Clouds
Danila Rukhovich, Elona Dupont, Dimitrios Mallis et al.
Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks
Lehan Wang, Haonan Wang, Honglong Yang et al.
CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation
Gaojie Lin, Jianwen Jiang, Chao Liang et al.
AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models
Kim Sung-Bin, Oh Hyun-Bin, Lee Jung-Mok et al.
Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters
Yuan Wang, Ouxiang Li, Tingting Mu et al.
A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning
Chen-Yu Liu, Chao-Han Huck Yang, Hsi-Sheng Goan et al.
AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment
Yan Li, Yifei Xing, Xiangyuan Lan et al.
Grokking at the Edge of Numerical Stability
Lucas Prieto, Melih Barsbey, Pedro Mediano et al.
Accelerating neural network training: An analysis of the AlgoPerf competition
Priya Kasimbeg, Frank Schneider, Runa Eschenhagen et al.
USP: Unified Self-Supervised Pretraining for Image Generation and Understanding
Xiangxiang Chu, Renda Li, Yong Wang
From Risk to Uncertainty: Generating Predictive Uncertainty Measures via Bayesian Estimation
Nikita Kotelevskii, Vladimir Kondratyev, Martin Takáč et al.
No Preference Left Behind: Group Distributional Preference Optimization
Binwei Yao, Zefan Cai, Yun-Shiuan Chuang et al.
Design Principle Transfer in Neural Architecture Search via Large Language Models
Xun Zhou, Xingyu Wu, Liang Feng et al.
GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images
Xiang Lan, Feng Wu, Kai He et al.
Causal Concept Graph Models: Beyond Causal Opacity in Deep Learning
Gabriele Dominici, Pietro Barbiero, Mateo Espinosa Zarlenga et al.
DarkBench: Benchmarking Dark Patterns in Large Language Models
Esben Kran, Hieu Minh Nguyen, Akash Kundu et al.
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Jiaming Zhou, Teli Ma, Kun-Yu Lin et al.
3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views
Xiaobiao Du, Yida Wang, Haiyang Sun et al.
MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
Xiaohu Huang, Jingjing Wu, Qunyi Xie et al.
VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model
Zuwei Long, Yunhang Shen, Chaoyou Fu et al.
DeLLMa: Decision Making Under Uncertainty with Large Language Models
Ollie Liu, Deqing Fu, Dani Yogatama et al.
Personalized Federated Collaborative Filtering: A Variational AutoEncoder Approach
Zhiwei Li, Guodong Long, Tianyi Zhou et al.
Detecting Out-of-Distribution Through the Lens of Neural Collapse
Litian Liu, Yao Qin
Optimal Transport for Time Series Imputation
Hao Wang, zhengnan li, Haoxuan Li et al.
Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation
Zhuoman Liu, Weicai Ye, Yan Luximon et al.
Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations
Li Hao, He CAO, Bin Feng et al.
Reasoning of Large Language Models over Knowledge Graphs with Super-Relations
Song Wang, Junhong Lin, Xiaojie Guo et al.
Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families
Felipe Maia Polo, Seamus Somerstep, Leshem Choshen et al.
RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Hanyang Zhao, Genta Winata, Anirban Das et al.
ThinkSound: Chain-of-Thought Reasoning in Multimodal LLMs for Audio Generation and Editing
Huadai Liu, Kaicheng Luo, Jialei Wang et al.
Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects
Amir Barda, Matheus Gadelha, Vladimir G. Kim et al.
Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning
Riccardo Salami, Pietro Buzzega, Matteo Mosconi et al.
Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models
Matvei Popov, Peter Robicheaux, Anish Madan et al.
GenFusion: Closing the Loop between Reconstruction and Generation via Videos
Sibo Wu, Congrong Xu, Binbin Huang et al.
SensorLM: Learning the Language of Wearable Sensors
Yuwei Zhang, Kumar Ayush, Siyuan Qiao et al.
Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation
Nicolas Dufour, Vicky Kalogeiton, David Picard et al.
Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models
Ronghuan Wu, Wanchao Su, Jing Liao
Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces II: non-compact symmetric spaces
Iskander Azangulov, Andrei Smolensky, Alexander Terenin et al.
Track-On: Transformer-based Online Point Tracking with Memory
Görkay Aydemir, Xiongyi Cai, Weidi Xie et al.
Quamba: A Post-Training Quantization Recipe for Selective State Space Models
Hung-Yueh Chiang, Chi-Chih Chang, Natalia Frumkin et al.
EnvGS: Modeling View-Dependent Appearance with Environment Gaussian
Tao Xie, Xi Chen, Zhen Xu et al.
Scalable Influence and Fact Tracing for Large Language Model Pretraining
Tyler Chang, Dheeraj Rajagopal, Tolga Bolukbasi et al.
Aioli: A Unified Optimization Framework for Language Model Data Mixing
Mayee Chen, Michael Hu, Nicholas Lourie et al.
TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching
Wenxiang Guo, Yu Zhang, Changhao Pan et al.
Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation
Sadegh Mahdavi, Muchen Li, Kaiwen Liu et al.
TULIP: Token-length Upgraded CLIP
Ivona Najdenkoska, Mohammad Mahdi Derakhshani, Yuki Asano et al.
Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction
Shiyu Zhao, Zhenting Wang, Felix Juefei-Xu et al.
Scalable Image Tokenization with Index Backpropagation Quantization
Fengyuan Shi, Zhuoyan Luo, Yixiao Ge et al.
Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition
Hongda Liu, Yunfan Liu, Min Ren et al.
A Unified Comparative Study with Generalized Conformity Scores for Multi-Output Conformal Regression
Victor Dheur, Matteo Fontana, Yorick Estievenart et al.
Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection
Shengjia Chen, Luping Ji, Weiwei Duan et al.
Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene
Jiahao Wu, Rui Peng, Zhiyan Wang et al.
DashGaussian: Optimizing 3D Gaussian Splatting in 200 Seconds
Youyu Chen, Junjun Jiang, Kui Jiang et al.
Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
Shangbin Feng, Zifeng Wang, Yike Wang et al.
MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection
Xi Jiang, Jian Li, Hanqiu Deng et al.
Does SGD really happen in tiny subspaces?
Minhak Song, Kwangjun Ahn, Chulhee Yun
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang, Vardan Papyan
MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model
Junjie Li, Yang Liu, Weiqing Liu et al.
MOS: Model Surgery for Pre-Trained Model-Based Class-Incremental Learning
Hai-Long Sun, Da-Wei Zhou, Hanbin Zhao et al.
BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute
Dujian Ding, Ankur Mallick, Shaokun Zhang et al.
Reinforce LLM Reasoning through Multi-Agent Reflection
Yurun Yuan, Tengyang Xie
MiniPLM: Knowledge Distillation for Pre-training Language Models
Yuxian Gu, Hao Zhou, Fandong Meng et al.
BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning
Jianyang Gu, Sam Stevens, Elizabeth Campolongo et al.
MLLM-as-a-Judge for Image Safety without Human Labeling
Zhenting Wang, Shuming Hu, Shiyu Zhao et al.
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
Jilan Xu, Yifei Huang, Baoqi Pei et al.
PrEditor3D: Fast and Precise 3D Shape Editing
Ziya Erkoc, Can Gümeli, Chaoyang Wang et al.
Exploring the Limits of Vision-Language-Action Manipulation in Cross-task Generalization
Jiaming Zhou, Ke Ye, Jiayi Liu et al.
xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories
Maurice Kraus, Felix Divo, Devendra Singh Dhami et al.
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
Shoubin Yu, Jaehong Yoon, Mohit Bansal
Adaptive Rectangular Convolution for Remote Sensing Pansharpening
Xueyang Wang, Zhixin Zheng, Jiandong Shao et al.
Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model
Tudor Cebere, Aurélien Bellet, Nicolas Papernot
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
Haiyan Zhao, Heng Zhao, Bo Shen et al.
Where am I? Cross-View Geo-localization with Natural Language Descriptions
Junyan Ye, Honglin Lin, Leyan Ou et al.
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch
Zimu Lu, Yunqiao Yang, Houxing Ren et al.
DreamOmni: Unified Image Generation and Editing
Bin Xia, Yuechen Zhang, Jingyao Li et al.
CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset
Xiao Wang, Fuling Wang, Yuehang Li et al.
MonoInstance: Enhancing Monocular Priors via Multi-view Instance Alignment for Neural Rendering and Reconstruction
Wenyuan Zhang, Yixiao Yang, Han Huang et al.
UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer
Haoxuan Wang, Jinlong Peng, Qingdong He et al.
VeriThinker: Learning to Verify Makes Reasoning Model Efficient
Zigeng Chen, Xinyin Ma, Gongfan Fang et al.
MagicArticulate: Make Your 3D Models Articulation-Ready
Chaoyue Song, Jianfeng Zhang, Xiu Li et al.
Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera
Yuliang Guo, Sparsh Garg, S. Mahdi H. Miangoleh et al.
MetaOOD: Automatic Selection of OOD Detection Models
Yuehan Qin, Yichi Zhang, Yi Nian et al.
Neighboring Autoregressive Modeling for Efficient Visual Generation
Yefei He, Yuanyu He, Shaoxuan He et al.
What Matters in Learning from Large-Scale Datasets for Robot Manipulation
Vaibhav Saxena, Matthew Bronars, Nadun Ranawaka Arachchige et al.
Training Neural Networks as Recognizers of Formal Languages
Alexandra Butoi, Ghazal Khalighinejad, Anej Svete et al.
DexVLG: Dexterous Vision-Language-Grasp Model at Scale
Jiawei He, Danshi Li, Xinqiang Yu et al.
Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise
Enea Monzio Compagnoni, Tianlin Liu, Rustem Islamov et al.
Memory Injection Attacks on LLM Agents via Query-Only Interaction
Shen Dong, Shaochen Xu, Pengfei He et al.
Generative Flows on Synthetic Pathway for Drug Design
Seonghwan Seo, Minsu Kim, Tony Shen et al.
UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior
I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu et al.
VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers
Yating Wang, Haoyi Zhu, Mingyu Liu et al.
Spiking Transformer with Spatial-Temporal Attention
Donghyun Lee, Yuhang Li, Youngeun Kim et al.
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
Yunhan Zhao, Xiang Zheng, Lin Luo et al.
A Many-Objective Problem Where Crossover Is Provably Indispensable
Andre Opris
MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization
Kangyu Zhu, Peng Xia, Yun Li et al.
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
Leqi Shen, Guoqiang Gong, Tao He et al.
PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection
Botao Ren, Xue Yang, Yi Yu et al.
ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark
Ronghao Dang, Yuqian Yuan, Wenqi Zhang et al.
How Much is a Noisy Image Worth? Data Scaling Laws for Ambient Diffusion.
Giannis Daras, Yeshwanth Cherapanamjeri, Constantinos C Daskalakis
Adaptive Length Image Tokenization via Recurrent Allocation
Shivam Duggal, Phillip Isola, Antonio Torralba et al.
Structured Packing in LLM Training Improves Long Context Utilization
Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur et al.
Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning
Kai Jiang, Zhengyan Shi, Dell Zhang et al.
Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction
Yuanhao Cai, He Zhang, Kai Zhang et al.
Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data
Manuel Brenner, Elias Weber, Georgia Koppe et al.
Large Images Are Gaussians: High-Quality Large Image Representation with Levels of 2D Gaussian Splatting
Lingting Zhu, Guying Lin, Jinnan Chen et al.
Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity
Jiachen Jiang, Jinxin Zhou, Zhihui Zhu
CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph
Haitao Lin, Guojiang Zhao, Odin Zhang et al.
Efficient Part-level 3D Object Generation via Dual Volume Packing
Jiaxiang Tang, Ruijie Lu, Max Li et al.
Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption
Du CHEN, Tianhe Wu, Kede Ma et al.
Mimir: Improving Video Diffusion Models for Precise Text Understanding
Shuai Tan, Biao Gong, Yutong Feng et al.
APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
Xinyu Yang, Tianqi Chen, Beidi Chen
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?
Yunlong Tang, JunJia Guo, Hang Hua et al.
Controlling Large Language Models Through Concept Activation Vectors
Hanyu Zhang, Xiting Wang, Chengao Li et al.
MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation
Akio Hayakawa, Masato Ishii, Takashi Shibuya et al.
Palu: KV-Cache Compression with Low-Rank Projection
Chi-Chih Chang, Wei-Cheng Lin, Chien-Yu Lin et al.
EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding
Yuqi Wu, Wenzhao Zheng, Sicheng Zuo et al.
NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments
Xuan Yao, Junyu Gao, Changsheng Xu
Understanding and Enhancing the Transferability of Jailbreaking Attacks
Runqi Lin, Bo Han, Fengwang Li et al.
Degradation-Aware Feature Perturbation for All-in-One Image Restoration
Xiangpeng Tian, Xiangyu Liao, Xiao Liu et al.
Revisiting MAE Pre-training for 3D Medical Image Segmentation
Tassilo Wald, Constantin Ulrich, Stanislav Lukyanenko et al.
OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?
Zijian Chen, tingzhu chen, Wenjun Zhang et al.
Global-Local Tree Search in VLMs for 3D Indoor Scene Generation
Wei Deng, Mengshi Qi, Huadong Ma
Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion
Dexuan Ding, Lei Wang, Liyun Zhu et al.
ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object
Zhe Shan, Yang Liu, Lei Zhou et al.
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing
Gaoxiang Cong, Jiadong Pan, Liang Li et al.
LLMs Can Plan Only If We Tell Them
Bilgehan Sel, Ruoxi Jia, Ming Jin
Model Equality Testing: Which Model is this API Serving?
Irena Gao, Percy Liang, Carlos Guestrin
ViLLa: Video Reasoning Segmentation with Large Language Model
rongkun Zheng, Lu Qi, Xi Chen et al.
Prior-guided Hierarchical Harmonization Network for Efficient Image Dehazing
Xiongfei Su, Siyuan Li, Yuning Cui et al.
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Xinyu Yang, Yuwei An, Hongyi Liu et al.
The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling
Ruochen Zhang, Qinan Yu, Matianyu Zang et al.
Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling
Junha Hyung, Kinam Kim, Susung Hong et al.
Logical Consistency of Large Language Models in Fact-Checking
Bishwamittra Ghosh, Sarah Hasan, Naheed Anjum Arafat et al.
Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.
Can In-context Learning Really Generalize to Out-of-distribution Tasks?
Qixun Wang, Yifei Wang, Xianghua Ying et al.
RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers
Yan Gong, Yiren Song, Yicheng Li et al.
Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices
Junyan Lin, Haoran Chen, Yue Fan et al.
Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs
Lei Zhang, Yunshui Li, Jiaming Li et al.
BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning
Artem Zholus, Maksim Kuznetsov, Roman Schutski et al.
3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds
Hengshuo Chu, Xiang Deng, Qi Lv et al.
FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution
Junyang Chen, Jinshan Pan, Jiangxin Dong
Horizon Reduction Makes RL Scalable
Seohong Park, Kevin Frans, Deepinder Mann et al.
Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking
Jiawen Zhu, Huayi Tang, Xin Chen et al.
Systematic Outliers in Large Language Models
Yongqi An, Xu Zhao, Tao Yu et al.
MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation
Ning Li, Xiangmou Qu, Jiamu Zhou et al.
FOLDER: Accelerating Multi-Modal Large Language Models with Enhanced Performance
Haicheng Wang, Zhemeng Yu, Gabriele Spadaro et al.
FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection
Ke Li, Di Wang, Zhangyuan Hu et al.
GraphMoRE: Mitigating Topological Heterogeneity via Mixture of Riemannian Experts
Zihao Guo, Qingyun Sun, Haonan Yuan et al.
To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts
Yukun Huang, Sanxing Chen, Hongyi Cai et al.
A CLIP-Powered Framework for Robust and Generalizable Data Selection
Suorong Yang, Peng Ye, Wanli Ouyang et al.
Sylber: Syllabic Embedding Representation of Speech from Raw Audio
Cheol Jun Cho, Nicholas Lee, Akshat Gupta et al.
ET-SEED: EFFICIENT TRAJECTORY-LEVEL SE(3) EQUIVARIANT DIFFUSION POLICY
Chenrui Tie, Yue Chen, Ruihai Wu et al.
Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable
Ruoxin Chen, Junwei Xi, Zhiyuan Yan et al.
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations
Zhibing Li, Tong Wu, Jing Tan et al.
A Probabilistic Perspective on Unlearning and Alignment for Large Language Models
Yan Scholten, Stephan Günnemann, Leo Schwinn
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Tianjin Huang, Ziquan Zhu, Gaojie Jin et al.
DRoC: Elevating Large Language Models for Complex Vehicle Routing via Decomposed Retrieval of Constraints
Xia Jiang, Yaoxin Wu, Chenhao Zhang et al.
Spiking Vision Transformer with Saccadic Attention
Shuai Wang, Malu Zhang, Dehao Zhang et al.
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis
Jiapeng Zhu, Ceyuan Yang, Kecheng Zheng et al.
The Pitfalls of Memorization: When Memorization Hurts Generalization
Reza Bayat, Mohammad Pezeshki, Elvis Dohmatob et al.
Retrieval Augmented Time Series Forecasting
Sungwon Han, Seungeon Lee, MEEYOUNG CHA et al.
DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra
Montgomery Bohde, Mrunali Manjrekar, Runzhong Wang et al.
SOLVE: Synergy of Language-Vision and End-to-End Networks for Autonomous Driving
Xuesong Chen, Linjiang Huang, Tao Ma et al.
Few-Shot Recognition via Stage-Wise Retrieval-Augmented Finetuning
Tian Liu, Huixin Zhang, Shubham Parashar et al.
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
Jun Zhou, Jiahao Li, Zunnan Xu et al.
SAFE: Multitask Failure Detection for Vision-Language-Action Models
Qiao Gu, Yuanliang Ju, Shengxiang Sun et al.
RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
Haifeng Huang, Xinyi Chen, Yilun Chen et al.
Security Attacks on LLM-based Code Completion Tools
Wen Cheng, Ke Sun, Xinyu Zhang et al.
Scaling Properties of Diffusion Models For Perceptual Tasks
Rahul Ravishankar, Zeeshan Patel, Jathushan Rajasegaran et al.
GotenNet: Rethinking Efficient 3D Equivariant Graph Neural Networks
Sarp Aykent, Tian Xia
SiReRAG: Indexing Similar and Related Information for Multihop Reasoning
Nan Zhang, Prafulla Kumar Choubey, Alexander Fabbri et al.
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
Zhengqing Wang, Jiacheng Chen, Yasutaka Furukawa
From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications
Ajay Jaiswal, Yifan Wang, Lu Yin et al.
Any6D: Model-free 6D Pose Estimation of Novel Object
Taeyeop Lee, Bowen Wen, Minjun Kang et al.
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
Wei Suo, Lijun Zhang, Mengyang Sun et al.
AGENTIF: Benchmarking Large Language Models Instruction Following Ability in Agentic Scenarios
Yunjia Qi, Hao Peng, Xiaozhi Wang et al.
Thermalizer: Stable autoregressive neural emulation of spatiotemporal chaos
Chris Pedersen, Laure Zanna, Joan Bruna
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models
Sagnik Mukherjee, Lifan Yuan, Dilek Hakkani-Tur et al.
Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach
Jiancong Xiao, Bojian Hou, Zhanliang Wang et al.
Black-Box Detection of Language Model Watermarks
Thibaud Gloaguen, Nikola Jovanović, Robin Staab et al.
DON’T STOP ME NOW: EMBEDDING BASED SCHEDULING FOR LLMS
Rana Shahout, Eran Malach, Chunwei Liu et al.
Multi-Domain Graph Foundation Models: Robust Knowledge Transfer via Topology Alignment
Shuo Wang, Bokui Wang, Zhixiang Shen et al.
The Scene Language: Representing Scenes with Programs, Words, and Embeddings
Yunzhi Zhang, Zizhang Li, Matt Zhou et al.
RoboScape: Physics-informed Embodied World Model
Yu Shang, Xin Zhang, Yinzhou Tang et al.
AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration
Andy Zhou, Kevin Wu, Francesco Pinto et al.
Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes
Georg Manten, Cecilia Casolo, Emilio Ferrucci et al.
Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs
Sungmin Cha, Sungjun Cho, Dasol Hwang et al.
Adaptive teachers for amortized samplers
Minsu Kim, Sanghyeok Choi, Taeyoung Yun et al.
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou et al.
Training-Free Efficient Video Generation via Dynamic Token Carving
Yuechen Zhang, Jinbo Xing, bin xia et al.
UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate
Julián Tachella, Mike Davies, Laurent Jacques
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
Xiaoyuan Liu, Tian Liang, Zhiwei He et al.
RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code
Dhruv Gautam, Spandan Garg, Jinu Jang et al.
DINO-Foresight: Looking into the Future with DINO
Efstathios Karypidis, Ioannis Kakogeorgiou, Spyridon Gidaris et al.
JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration
yunlong lin, Zixu Lin, Haoyu Chen et al.
Re-Thinking Inverse Graphics With Large Language Models
Haiwen Feng, Michael J Black, Weiyang Liu et al.
Stochastic Deep Restoration Priors for Imaging Inverse Problems
Yuyang Hu, Albert Peng, Weijie Gan et al.
IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera
Jian Huang, Chengrui Dong, Xuanhua Chen et al.
Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observations
Shengeng Tang, Jiayi He, Lechao Cheng et al.