Most Cited 2025 "semantic proximity" Papers
Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering
Yibo Zhang, Lihong Wang, Changqing Zou et al.
REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments
Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman et al.
GReaTer: Gradients Over Reasoning Makes Smaller Language Models Strong Prompt Optimizers
Sarkar Snigdha Sarathi Das, Ryo Kamoi, Bo Pang et al.
Simulation-Free Hierarchical Latent Policy Planning for Proactive Dialogues
Tao He, Lizi Liao, Yixin Cao et al.
The Elicitation Game: Evaluating Capability Elicitation Techniques
Felix Hofstätter, Teun van der Weij, Jayden Teoh et al.
Fast Think-on-Graph: Wider, Deeper and Faster Reasoning of Large Language Model on Knowledge Graph
Xujian Liang, Zhaoquan Gu
Near, far: Patch-ordering enhances vision foundation models' scene understanding
Valentinos Pariza, Mohammadreza Salehi, Gertjan J Burghouts et al.
CP-Guard: Malicious Agent Detection and Defense in Collaborative Bird’s Eye View Perception
Senkang Hu, Yihang Tao, Guowen Xu et al.
Beyond Sequence: Impact of Geometric Context for RNA Property Prediction
Junjie Xu, Artem Moskalev, Tommaso Mansi et al.
Understanding and Improving Length Generalization in Recurrent Models
Ricardo Buitrago Ruiz, Albert Gu
(Almost Full) EFX for Three (and More) Types of Agents
Pratik Ghosal, Vishwa Prakash HV, Prajakta Nimbhorkar et al.
Constrained Fair and Efficient Allocations
Benjamin Cookson, Soroush Ebadian, Nisarg Shah
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
Zhengfeng Lai, Vasileios Saveris, Chen Chen et al.
FloNa: Floor Plan Guided Embodied Visual Navigation
Jiaxin Li, Weiqi Huang, Zan Wang et al.
Counterfactual Generative Modeling with Variational Causal Inference
Yulun Wu, Louis McConnell, Claudia Iriondo
KPL: Training-Free Medical Knowledge Mining of Vision-Language Models
Jiaxiang Liu, Tianxiang Hu, Jiawei Du et al.
TimeCHEAT: A Channel Harmony Strategy for Irregularly Sampled Multivariate Time Series Analysis
Jiexi Liu, Meng Cao, Songcan Chen
Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning
Chongyi Zheng, Jens Tuyls, Joanne Peng et al.
Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation
Sicong Liu, Yang Shu, Chenjuan Guo et al.
Explain Yourself, Briefly! Self-Explaining Neural Networks with Concise Sufficient Reasons
Shahaf Bassan, Ron Eliav, Shlomit Gur
Semi-Supervised Multi-View Multi-Label Learning with View-Specific Transformer and Enhanced Pseudo-Label
Quanjiang Li, Tingjin Luo, Mingdie Jiang et al.
Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent
Sayan Banerjee, Krishna Balasubramanian, Promit Ghosal
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
Ling Yang, Xinchen Zhang, Ye Tian et al.
Highly Compressed Tokenizer Can Generate Without Training
Lukas Lao Beyer, Tianhong Li, Xinlei Chen et al.
Multi-Granular Multimodal Clue Fusion for Meme Understanding
Li Zheng, Hao Fei, Ting Dai et al.
Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems
Junyi Ye, Jingyi Gu, Xinyun Zhao et al.
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models
Yutao Zhu, Zhaoheng Huang, Zhicheng Dou et al.
Intermediate Layer Classifiers for OOD generalization
Arnas Uselis, Seong Joon Oh
LLM+AL: Bridging Large Language Models and Action Languages for Complex Reasoning About Actions
Adam Ishay, Joohyung Lee
TabFlex: Scaling Tabular Learning to Millions with Linear Attention
Yuchen Zeng, Tuan Dinh, Wonjun Kang et al.
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
Yekun Chai, Haoran Sun, Huang Fang et al.
Loss Functions and Operators Generated by f-Divergences
Vincent Roulet, Tianlin Liu, Nino Vieillard et al.
Extractive Structures Learned in Pretraining Enable Generalization on Finetuned Facts
Jiahai Feng, Stuart Russell, Jacob Steinhardt
GVMGen: A General Video-to-Music Generation Model with Hierarchical Attentions
Heda Zuo, Weitao You, Junxian Wu et al.
MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation
Zhaoning Yu, Hongyang Gao
Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning
Anh Tong, Thanh Nguyen-Tang, Dongeun Lee et al.
Realistic Evaluation of Deep Partial-Label Learning Algorithms
Wei Wang, Dong-Dong Wu, Jindong Wang et al.
$\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models
Yaxin Luo, Gen Luo, Jiayi Ji et al.
Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations
Brian Zheng, Alisa Liu, Orevaoghene Ahia et al.
ADIFF: Explaining audio difference using natural language
Soham Deshmukh, Shuo Han, Rita Singh et al.
Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs
Hao Fang, Changle Zhou, Jiawei Kong et al.
QuaDiM: A Conditional Diffusion Model For Quantum State Property Estimation
Yehui Tang, Mabiao Long, Junchi Yan
A Sharper Global Convergence Analysis for Average Reward Reinforcement Learning via an Actor-Critic Approach
Swetha Ganesh, Washim Mondal, Vaneet Aggarwal
A Generalist Intracortical Motor Decoder
Joel Ye, Fabio Rizzoglio, Xuan Ma et al.
From Specificity to Generality: Revisiting Generalizable Artifacts in Detecting Face Deepfakes
Long Ma, Zhiyuan Yan, Jin Xu et al.
Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations
Decheng Liu, Zongqi Wang, Chunlei Peng et al.
MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation
Zhiwei Yang, Yucong Meng, Kexue Fu et al.
Stochastic Forward–Backward Deconvolution: Training Diffusion Models with Finite Noisy Datasets
Haoye Lu, Qifan Wu, Yaoliang Yu
Synthesizing Privacy-Preserving Text Data via Finetuning *without* Finetuning Billion-Scale LLMs
Bowen Tan, Zheng Xu, Eric Xing et al.
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
Xin Gu, Yaojie Shen, Chenxi Luo et al.
Dialogue Without Limits: Constant-Sized KV Caches for Extended Response in LLMs
Ravi Ghadia, Avinash Kumar, Gaurav Jain et al.
Adversarial Generative Flow Network for Solving Vehicle Routing Problems
Ni Zhang, Jingfeng Yang, Zhiguang Cao et al.
MSE-Adapter: A Lightweight Plugin Endowing LLMs with the Capability to Perform Multimodal Sentiment Analysis and Emotion Recognition
Yang Yang, Xunde Dong, Yupeng Qiang
MUSE: Mamba Is Efficient Multi-scale Learner for Text-video Retrieval
Haoran Tang, Meng Cao, Jinfa Huang et al.
Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning
Hui-Yue Yang, Hui Chen, Ao Wang et al.
DriveEditor: A Unified 3D Information-Guided Framework for Controllable Object Editing in Driving Scenes
Yiyuan Liang, Zhiying Yan, Liqun Chen et al.
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition
Zheyang Xiong, Jack Cai, John Cooper et al.
Be More Diverse than the Most Diverse: Optimal Mixtures of Generative Models via Mixture-UCB Bandit Algorithms
Parham Rezaei, Farzan Farnia, Cheuk Ting Li
Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
Blake Bordelon, Cengiz Pehlevan
Can Textual Gradient Work in Federated Learning?
Minghui Chen, Ruinan Jin, Wenlong Deng et al.
Value-Based Deep RL Scales Predictably
Oleh Rybkin, Michal Nauman, Preston Fu et al.
VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion
Meng Wang, Huilong Pi, Ruihui Li et al.
A Training-Free Sub-quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention
Heejun Lee, Geon Park, Youngwan Lee et al.
Feature Denoising Diffusion Model for Blind Image Quality Assessment
Xudong Li, Yan Zhang, Yunhang Shen et al.
Chaos Meets Attention: Transformers for Large-Scale Dynamical Prediction
Yi He, Yiming Yang, Xiaoyuan Cheng et al.
ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction
Ziyu Tang, Weicai Ye, Yifan Wang et al.
LODGE: Level-of-Detail Large-Scale Gaussian Splatting with Efficient Rendering
Jonas Kulhanek, Marie-Julie Rakotosaona, Fabian Manhardt et al.
Do Large Language Models Truly Understand Geometric Structures?
Xiaofeng Wang, Yiming Wang, Wenhong Zhu et al.
Synthetic Tabular Data Generation for Imbalanced Classification: The Surprising Effectiveness of an Overlap Class
Annie D'souza, Swetha M, Sunita Sarawagi
Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation
Gao Peng, Le Zhuo, Dongyang Liu et al.
Gumbel Counterfactual Generation From Language Models
Shauli Ravfogel, Anej Svete, Vésteinn Snæbjarnarson et al.
An Interpretable N-gram Perplexity Threat Model for Large Language Model Jailbreaks
Valentyn Boreiko, Alexander Panfilov, Václav Voráček et al.
ChangeDiff: A Multi-Temporal Change Detection Data Generator with Flexible Text Prompts via Diffusion Model
Qi Zang, Jiayi Yang, Shuang Wang et al.
C2F-TP: A Coarse-to-Fine Denoising Framework for Uncertainty-Aware Trajectory Prediction
Zichen Wang, Hao Miao, Senzhang Wang et al.
Multi-Modal and Multi-Attribute Generation of Single Cells with CFGen
Alessandro Palma, Till Richter, Hanyi Zhang et al.
Fast Training of Sinusoidal Neural Fields via Scaling Initialization
Taesun Yeom, Sangyoon Lee, Jaeho Lee
TEncDM: Understanding the Properties of the Diffusion Model in the Space of Language Model Encodings
Alexander Shabalin, Viacheslav Meshchaninov, Egor Chimbulatov et al.
Sortformer: A Novel Approach for Permutation-Resolved Speaker Supervision in Speech-to-Text Systems
Taejin Park, Ivan Medennikov, Kunal Dhawan et al.
HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models
Zhifeng Xie, Hao Li, Huiming Ding et al.
Leveraging Large Vision-Language Model as User Intent-Aware Encoder for Composed Image Retrieval
Zelong Sun, Dong Jing, Guoxing Yang et al.
Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models
Jingcheng Deng, Zihao Wei, Liang Pang et al.
Does learning the right latent variables necessarily improve in-context learning?
Sarthak Mittal, Eric Elmoznino, Léo Gagnon et al.
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents
Yaxin Luo, Zhaoyi Li, Jiacheng Liu et al.
Temporal Difference Flows
Jesse Farebrother, Matteo Pirotta, Andrea Tirinzoni et al.
CAPrompt: Cyclic Prompt Aggregation for Pre-Trained Model Based Class Incremental Learning
Qiwei Li, Jiahuan Zhou
STAR: Synthesis of Tailored Architectures
Armin Thomas, Rom Parnichkun, Alexander Amini et al.
Accelerated Over-Relaxation Heavy-Ball Method: Achieving Global Accelerated Convergence with Broad Generalization
Jingrong Wei, Long Chen
Incomplete Multi-view Deep Clustering with Data Imputation and Alignment
Jiyuan Liu, Xinwang Liu, Xinhang Wan et al.
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding
Henry Zheng, Hao Shi, Qihang Peng et al.
Enhancing Large Language Model Performance with Gradient-Based Parameter Selection
Haoling Li, Xin Zhang, Xiao Liu et al.
ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids
Hannes Stärk, Bowen Jing, Tomas Geffner et al.
TIGeR: Unifying Text-to-Image Generation and Retrieval with Large Multimodal Models
Leigang Qu, Haochuan Li, Tan Wang et al.
DataMan: Data Manager for Pre-training Large Language Models
Ru Peng, Kexin Yang, Yawen Zeng et al.
A Reductions Approach to Risk-Sensitive Reinforcement Learning with Optimized Certainty Equivalents
Kaiwen Wang, Dawen Liang, Nathan Kallus et al.
Differentially Private Steering for Large Language Model Alignment
Anmol Goel, Yaxi Hu, Iryna Gurevych et al.
Unlocking the Potential of Reverse Distillation for Anomaly Detection
Xinyue Liu, Jianyuan Wang, Biao Leng et al.
Disentangling and Integrating Relational and Sensory Information in Transformer Architectures
Awni Altabaa, John Lafferty
Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models
Bingdong Li, Zixiang Di, Yongfan Lu et al.
Elucidating the Design Space of Multimodal Protein Language Models
Cheng-Yen Hsieh, Xinyou Wang, Daiheng Zhang et al.
AGAV-Rater: Adapting Large Multimodal Model for AI-Generated Audio-Visual Quality Assessment
Yuqin Cao, Xiongkuo Min, Yixuan Gao et al.
HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation
Tengfei Liu, Jiapu Wang, Yongli Hu et al.
Enhancing Federated Domain Adaptation with Multi-Domain Prototype-Based Federated Fine-Tuning
Jingyuan Zhang, Yiyang Duan, Shuaicheng Niu et al.
Multi-Marginal Stochastic Flow Matching for High-Dimensional Snapshot Data at Irregular Time Points
Justin Lee, Behnaz Moradi-Jamei, Heman Shakeri
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov, Felix Steinbauer, Gjergji Kasneci
Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences
Alan Amin, Nate Gruver, Yilun Kuang et al.
Conformal Prediction Sets Can Cause Disparate Impact
Jesse Cresswell, Bhargava Kumar, Yi Sui et al.
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Jie Cheng, Ruixi Qiao, Ma Yingwei et al.
Embedding Safety into RL: A New Take on Trust Region Methods
Nikola Milosevic, Johannes Müller, Nico Scherf
MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation
Trung X. Pham, Tri Ton, Chang Yoo
Fast and Slow Streams for Online Time Series Forecasting Without Information Leakage
Ying-yee Ava Lau, Zhiwen Shao, Dit-Yan Yeung
VQTalker: Towards Multilingual Talking Avatars Through Facial Motion Tokenization
Tao Liu, Ziyang Ma, Qi Chen et al.
Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility
Martin Kuo, Jingyang Zhang, Jianyi Zhang et al.
Data Taggants: Dataset Ownership Verification Via Harmless Targeted Data Poisoning
Wassim Bouaziz, Nicolas Usunier, El-Mahdi El-Mhamdi
FaceMe: Robust Blind Face Restoration with Personal Identification
Siyu Liu, Zheng-Peng Duan, Jia OuYang et al.
Does Training with Synthetic Data Truly Protect Privacy?
Yunpeng Zhao, Jie Zhang
Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
Zhiyang Xu, Minqian Liu, Ying Shen et al.
Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding
Eric Lei, Hamed Hassani, Shirin Saeedi Bidokhti
GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation
Yangtao Chen, Zixuan Chen, Junhui Yin et al.
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Gangwei Jiang, Caigao Jiang, Zhaoyi Li et al.
Robustness of Quantum Algorithms for Nonconvex Optimization
Weiyuan Gong, Chenyi Zhang, Tongyang Li
Accurate Link Prediction for Edge-Incomplete Graphs via PU Learning
Junghun Kim, Ka Hyun Park, Hoyoung Yoon et al.
Efficiently Serving Large Multimodal Models Using EPD Disaggregation
Gursimran Singh, Xinglu Wang, Yifan Hu et al.
Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning
Yinglun Xu, Qi Zeng, Gagandeep Singh
A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops
Shi Fu, Yingjie Wang, Yuzhu Chen et al.
LIFe-GoM: Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh
Jing Wen, Alex Schwing, Shenlong Wang
Visual Attention Never Fades: Selective Progressive Attention ReCalibration for Detailed Image Captioning in Multimodal Large Language Models
Mingi Jung, Saehyung Lee, Eunji Kim et al.
Anywhere: A Multi-Agent Framework for User-Guided, Reliable, and Diverse Foreground-Conditioned Image Generation
Xie Tianyidan, Rui Ma, Qian Wang et al.
Free Hunch: Denoiser Covariance Estimation for Diffusion Models Without Extra Costs
Severi Rissanen, Markus Heinonen, Arno Solin
Looking Backward: Retrospective Backward Synthesis for Goal-Conditioned GFlowNets
Haoran He, Can Chang, Huazhe Xu et al.
Injecting Universal Jailbreak Backdoors into LLMs in Minutes
Zhuowei Chen, Qiannan Zhang, Shichao Pei
Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs
Yaniv Nikankin, Dana Arad, Yossi Gandelsman et al.
Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models
Die Chen, Zhiwen Li, Mingyuan Fan et al.
From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning
Noa Rubin, Kirsten Fischer, Javed Lindner et al.
Accessing Vision Foundation Models via ImageNet-1K
Yitian Zhang, Xu Ma, Yue Bai et al.
The Canary’s Echo: Auditing Privacy Risks of LLM-Generated Synthetic Text
Matthieu Meeus, Lukas Wutschitz, Santiago Zanella-Beguelin et al.
Bayesian Optimization via Continual Variational Last Layer Training
Paul Brunzema, Mikkel Jordahn, John Willes et al.
Episodic Novelty Through Temporal Distance
Yuhua Jiang, Qihan Liu, Yiqin Yang et al.
A Two-Stage Learning-to-Defer Approach for Multi-Task Learning
Yannis Montreuil, Shu Heng Yeo, Axel Carlier et al.
Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models
Chenhui Hu, Pengfei Cao, Yubo Chen et al.
Towards Bridging Generalization and Expressivity of Graph Neural Networks
Shouheng Li, Floris Geerts, Dongwoo Kim et al.
CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation
Jie Liu, Pan Zhou, Yingjun Du et al.
Learning Evolving Tools for Large Language Models
Guoxin Chen, Zhong Zhang, Xin Cong et al.
Fair Submodular Cover
Wenjing Chen, Shuo Xing, Samson Zhou et al.
Preference Diffusion for Recommendation
Shuo Liu, An Zhang, Guoqing Hu et al.
Incomplete Modality Disentangled Representation for Ophthalmic Disease Grading and Diagnosis
Chengzhi Liu, Zile Huang, Zhe Chen et al.
Principled Algorithms for Optimizing Generalized Metrics in Binary Classification
Anqi Mao, Mehryar Mohri, Yutao Zhong
Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models
Minh-Tung Luu, Younghwan Lee, Donghoon Lee et al.
RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection
Jingtong Yue, Zhiwei Lin, Xin Lin et al.
Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback
Sanjiban Choudhury, Paloma Sodhi
Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments
Marharyta Domnich, Julius Välja, Rasmus Moorits Veski et al.
De-mark: Watermark Removal in Large Language Models
Ruibo Chen, Yihan Wu, Junfeng Guo et al.
Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation
Jingyu Liu, Beidi Chen, Ce Zhang
Balancing the Scales: A Theoretical and Algorithmic Framework for Learning from Imbalanced Data
Corinna Cortes, Anqi Mao, Mehryar Mohri et al.
Micro-macro Wavelet-based Gaussian Splatting for 3D Reconstruction from Unconstrained Images
Yihui Li, Chengxin Lv, Hongyu Yang et al.
EgoPrivacy: What Your First-Person Camera Says About You?
Yijiang Li, Genpei Zhang, Jiacheng Cheng et al.
Offline-to-Online Hyperparameter Transfer for Stochastic Bandits
Dravyansh Sharma, Arun Suggala
Unsupervised Audio-Visual Segmentation with Modality Alignment
Swapnil Bhosale, Haosen Yang, Diptesh Kanojia et al.
Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It
Guoxuan Xia, Olivier Laurent, Gianni Franchi et al.
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
Sreyan Ghosh, Sonal Kumar, Zhifeng Kong et al.
Meta-Black-Box-Optimization through Offline Q-function Learning
Zeyuan Ma, Zhiguang Cao, Zhou Jiang et al.
Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP
Yating Yu, Congqi Cao, Yueran Zhang et al.
MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities
Kunxi Li, Tianyu Zhan, Kairui Fu et al.
OpenViewer: Openness-Aware Multi-View Learning
Shide Du, Zihan Fang, Yanchao Tan et al.
World Knowledge-Enhanced Reasoning Using Instruction-Guided Interactor in Autonomous Driving
Mingliang Zhai, Cheng Li, Zengyuan Guo et al.
Understanding High-Dimensional Bayesian Optimization
Leonard Papenmeier, Matthias Poloczek, Luigi Nardi
Decoding Game: On Minimax Optimality of Heuristic Text Generation Strategies
Sijin Chen, Omar Hagrass, Jason Klusowski
Compositional simulation-based inference for time series
Manuel Gloeckler, Shoji Toyota, Kenji Fukumizu et al.
Outsourced Diffusion Sampling: Efficient Posterior Inference in Latent Spaces of Generative Models
Siddarth Venkatraman, Mohsin Hasan, Minsu Kim et al.
Efficient stagewise pretraining via progressive subnetworks
Abhishek Panigrahi, Nikunj Saunshi, Kaifeng Lyu et al.
GraphCL: Graph-based Clustering for Semi-Supervised Medical Image Segmentation
Mengzhu Wang, Houcheng Su, Jiao Li et al.
Stable Mean Teacher for Semi-supervised Video Action Detection
Akash Kumar, Sirshapan Mitra, Yogesh Singh Rawat
The Belief State Transformer
Edward Hu, Kwangjun Ahn, Qinghua Liu et al.
Do We Need to Verify Step by Step? Rethinking Process Supervision from a Theoretical Perspective
Zeyu Jia, Alexander Rakhlin, Tengyang Xie
Fine-Tuning Attention Modules Only: Enhancing Weight Disentanglement in Task Arithmetic
Ruochen Jin, Bojian Hou, Jiancong Xiao et al.
Topograph: An Efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation
Laurin Lux, Alexander H Berger, Alexander Weers et al.
ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL
Yang Qin, Chao Chen, Zhihang Fu et al.
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension
Yaxian Wang, Henghui Ding, Shuting He et al.
SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning
Minjun Kim, Jongjin Kim, U Kang
Rethinking Pseudo-Label Guided Learning for Weakly Supervised Temporal Action Localization from the Perspective of Noise Correction
Quan Zhang, Yuxin Qi, Xi Tang et al.
Video Action Differencing
James Burgess, Xiaohan Wang, Yuhui Zhang et al.
CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
Matan Rusanovsky, Or Hirschorn, Shai Avidan
SplatFormer: Point Transformer for Robust 3D Gaussian Splatting
Yutong Chen, Marko Mihajlovic, Xiyi Chen et al.
DCBM: Data-Efficient Visual Concept Bottleneck Models
Katharina Prasse, Patrick Knab, Sascha Marton et al.
An All-Atom Generative Model for Designing Protein Complexes
Ruizhe Chen, Dongyu Xue, Xiangxin Zhou et al.
Manta: Enhancing Mamba for Few-Shot Action Recognition of Long Sub-Sequence
Wenbo Huang, Jinghui Zhang, Guang Li et al.
Gradient descent with generalized Newton’s method
Zhiqi Bu, Shiyun Xu
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
Harshit Varma, Dheeraj Nagaraj, Karthikeyan Shanmugam
Leveraging Consistent Spatio-Temporal Correspondence for Robust Visual Odometry
Zhaoxing Zhang, Junda Cheng, Gangwei Xu et al.
Neural Context Flows for Meta-Learning of Dynamical Systems
Roussel Desmond Nzoyem, David Barton, Tom Deakin
From Attention to Activation: Unraveling the Enigmas of Large Language Models
Prannay Kaul, Chengcheng Ma, Ismail Elezi et al.
Evaluating LLM Reasoning in the Operations Research Domain with ORQA
Mahdi Mostajabdaveh, Timothy Tin Long Yu, Samarendra Chandan Bindu Dash et al.
Boosting Fine-Grained Visual Anomaly Detection with Coarse-Knowledge-Aware Adversarial Learning
Qingqing Fang, Qinliang Su, Wenxi Lv et al.
How many samples are needed to train a deep neural network?
Pegah Golestaneh, Mahsa Taheri, Johannes Lederer
Position: The Future of Bayesian Prediction Is Prior-Fitted
Samuel Gabriel Müller, Arik Reuter, Noah Hollmann et al.
A transfer learning framework for weak to strong generalization
Seamus Somerstep, Felipe Maia Polo, Moulinath Banerjee et al.
DuMo: Dual Encoder Modulation Network for Precise Concept Erasure
Feng Han, Kai Chen, Chao Gong et al.
Learning Chaos In A Linear Way
Xiaoyuan Cheng, Yi He, Yiming Yang et al.
Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface
Wenyue Hua, Mengting Wan, Jagannath Vadrevu et al.
VIoTGPT: Learning to Schedule Vision Tools Towards Intelligent Video Internet of Things
Yaoyao Zhong, Mengshi Qi, Rui Wang et al.
DISCO: learning to DISCover an evolution Operator for multi-physics-agnostic prediction
Rudy Morel, Jiequn Han, Edouard Oyallon