Most Cited 2025 "frequency domain features" Papers
Autonomous Goal Detection and Cessation in Reinforcement Learning: A Case Study on Source Term Estimation
Yiwei Shi, Muning Wen, Qi Zhang et al.
StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?
Guobin Shen, Dongcheng Zhao, Aorigele Bao et al.
Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving
Alexey Nekrasov, Malcolm Burdorf, Stewart Worrall et al.
Phoneme-Level Feature Discrepancies: A Key to Detecting Sophisticated Speech Deepfakes
Kuiyuan Zhang, Zhongyun Hua, Rushi Lan et al.
SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis
Junho Kim, Hyunjun Kim, Hosu Lee et al.
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Wenqi Zhang, Hang Zhang, Xin Li et al.
ReNeg: Learning Negative Embedding with Reward Guidance
Xiaomin Li, yixuan liu, Takashi Isobe et al.
ZETA: Leveraging $Z$-order Curves for Efficient Top-$k$ Attention
Qiuhao Zeng, Jierui Huang, Peng Lu et al.
Through the Dual-Prism: A Spectral Perspective on Graph Data Augmentation for Graph Classifications
Yutong Xia, Runpeng Yu, Yuxuan Liang et al.
Cluster Based Heterogeneous Federated Foundation Model Adaptation and Fine-Tuning
Xianda Wang, Yaqi Qiao, Duo Wu et al.
Graph-Based Cross-Domain Knowledge Distillation for Cross-Dataset Text-to-Image Person Retrieval
Bingjun Luo, Jinpeng Wang, Zewen Wang et al.
Bridging Molecular Graphs and Large Language Models
Runze Wang, Mingqi Yang, Yanming Shen
Data-adaptive Differentially Private Prompt Synthesis for In-Context Learning
Fengyu Gao, Ruida Zhou, Tianhao Wang et al.
Towards Learnable Anchor for Deep Multi-View Clustering
Bocheng Wang, Chusheng Zeng, Mulin Chen et al.
CLOC: Contrastive Learning for Ordinal Classification with Multi-Margin N-pair Loss
Dileepa Pitawela, Gustavo Carneiro, Hsiang-Ting Chen
A Thorough Comparison Between Independent Cascade and Susceptible-Infected-Recovered Models
Panfeng Liu, Guoliang Qiu, Biaoshuai Tao et al.
PSMGD: Periodic Stochastic Multi-Gradient Descent for Fast Multi-Objective Optimization
Mingjing Xu, Peizhong Ju, Jia Liu et al.
Learning Interpretable Queries for Explainable Image Classification with Information Pursuit
Stefan Kolek, Aditya Chattopadhyay, Kwan Ho Ryan Chan et al.
SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation
Pengfei Chen, Lingxi Xie, xinyue huo et al.
HandOS: 3D Hand Reconstruction in One Stage
Xingyu Chen, Zhuheng Song, Xiaoke Jiang et al.
BUFFER-X: Towards Zero-Shot Point Cloud Registration in Diverse Scenes
Minkyun Seo, Hyungtae Lim, Kanghee Lee et al.
Multi-View Collaborative Learning Network for Speech Deepfake Detection
Kuiyuan Zhang, Zhongyun Hua, Rushi Lan et al.
DF-MIA: A Distribution-Free Membership Inference Attack on Fine-Tuned Large Language Models
Zhiheng Huang, Yannan Liu, Daojing He et al.
Accelerating Training with Neuron Interaction and Nowcasting Networks
Boris Knyazev, Abhinav Moudgil, Guillaume Lajoie et al.
InteractionMap: Improving Online Vectorized HDMap Construction with Interaction
Kuang Wu, Chuan Yang, Zhanbin Li
FactorGCL: A Hypergraph-Based Factor Model with Temporal Residual Contrastive Learning for Stock Returns Prediction
Yitong Duan, Weiran Wang, Jian Li
RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments
Haisheng Su, Feixiang Song, CONG MA et al.
mmFAS: Multimodal Face Anti-Spoofing Using Multi-Level Alignment and Switch-Attention Fusion
Geng Chen, Wuyuan Xie, Di Lin et al.
LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits
Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin et al.
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding
Rui Hu, Yuxuan Zhang, Lianghui Zhu et al.
EgoBridge: Domain Adaptation for Generalizable Imitation from Egocentric Human Data
Ryan Punamiya, Dhruv Patel, Patcharapong Aphiwetsa et al.
Neural Interactive Proofs
Lewis Hammond, Sam Adam-Day
Lightweight Predictive 3D Gaussian Splats
Junli Cao, Vidit Goel, Chaoyang Wang et al.
On Extending Direct Preference Optimization to Accommodate Ties
Jinghong Chen, Guangyu Yang, Weizhe Lin et al.
Strategic Classification With Externalities
Safwan Hossain, Evi Micha, Yiling Chen et al.
uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs
Yu Chen, Jiatai Huang, Yan Dai et al.
Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs
Kejia Zhang, Keda TAO, Jiasheng Tang et al.
V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents
Zhengrong Yue, Shaobin Zhuang, Kunchang Li et al.
NUDGE: Lightweight Non-Parametric Fine-Tuning of Embeddings for Retrieval
Sepanta Zeighami, Zac Wellmer, Aditya Parameswaran
DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model
Junjia Huang, Pengxiang Yan, Jinhang Cai et al.
WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception
Zhiheng Liu, Xueqing Deng, Shoufa Chen et al.
Binarized Neural Network for Multi-spectral Image Fusion
Junming Hou, Xiaoyu Chen, Ran Ran et al.
Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent
Tong Yang, Yu Huang, Yingbin Liang et al.
How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions
Tal Herman, Guy Rothblum
Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
Vladimir Boza, Vladimir Macko
Leveraging Attention to Effectively Compress Prompts for Long-Context LLMs
Yunlong Zhao, Haoran Wu, Bo Xu
FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies
Dongyue Lu, Lingdong Kong, Gim Hee Lee et al.
RaSA: Rank-Sharing Low-Rank Adaptation
Zhiwei He, Zhaopeng Tu, Xing Wang et al.
Severing Spurious Correlations with Data Pruning
Varun Mulchandani, Jung-Eun Kim
Fine-grained Spatiotemporal Grounding on Egocentric Videos
Shuo LIANG, Yiwu Zhong, Zi-Yuan Hu et al.
Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs
ChangHao Li, Yuchen Zhuang, Rushi Qiang et al.
VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification
Patrick Yubeaton, Andre Nakkab, Weihua Xiao et al.
Latent Radiance Fields with 3D-aware 2D Representations
Chaoyi Zhou, Xi Liu, Feng Luo et al.
A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking
Zixiang Zhao, Haowen Bai, Bingxin Ke et al.
CoPRA: Bridging Cross-domain Pretrained Sequence Models with Complex Structures for Protein-RNA Binding Affinity Prediction
Rong Han, Xiaohong Liu, Tong Pan et al.
LuxDiT: Lighting Estimation with Video Diffusion Transformer
Ruofan Liang, Kai He, Zan Gojcic et al.
CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion
Kai He, Chin-Hsuan Wu, Igor Gilitschenski
SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL
Yue Gong, Chuan Lei, Xiao Qin et al.
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
Chuhan ZHANG, Chaoyang Zhu, Pingcheng Dong et al.
On Speeding Up Language Model Evaluation
Jin Zhou, Christian Belardi, Ruihan Wu et al.
ECHOPulse: ECG Controlled Echocardio-gram Video Generation
Yiwei Li, Sekeun Kim, Zihao Wu et al.
RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs
Jiaxing Wu, Lin Ning, Luyang Liu et al.
Omnidirectional Multi-Object Tracking
Kai Luo, Hao Shi, Sheng Wu et al.
Bayesian WeakS-to-Strong from Text Classification to Generation
Ziyun Cui, Ziyang Zhang, Guangzhi Sun et al.
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
Runa Eschenhagen, Aaron Defazio, Tsung-Hsien Lee et al.
LiteSearch: Efficient Tree Search with Dynamic Exploration Budget for Math Reasoning
Ante Wang, Linfeng Song, Ye Tian et al.
MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory
Junyeong Park, Junmo Cho, Sungjin Ahn
Multi-identity Human Image Animation with Structural Video Diffusion
Zhenzhi Wang, Yixuan Li, yanhong zeng et al.
Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization
Jianing Wang, Yang Zhou, Xiaocheng Zhang et al.
GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation
Ning Gao, Yilun Chen, Shuai Yang et al.
Calibrating Expressions of Certainty
Peiqi Wang, Barbara Lam, Yingcheng Liu et al.
On Generalization Across Environments In Multi-Objective Reinforcement Learning
Jayden Teoh, Pradeep Varakantham, Peter Vamplew
Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers
Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris et al.
Enhancing Testing-Time Robustness for Trusted Multi-View Classification in the Wild
Wei Liu, Yufei Chen, Xiaodong Yue
Few-Shot, No Problem: Descriptive Continual Relation Extraction
Nguyen Xuan Thanh, Anh Duc Le, Quyen Tran et al.
Reinforcement learning with combinatorial actions for coupled restless bandits
Lily Xu, Bryan Wilder, Elias Khalil et al.
Causal LLM Routing: End-to-End Regret Minimization from Observational Data
Asterios Tsiourvas, Wei Sun, Georgia Perakis
GARLIC: GPT-Augmented Reinforcement Learning with Intelligent Control for Vehicle Dispatching
Xiao Han, Zijian Zhang, Xiangyu Zhao et al.
Pos3R: 6D Pose Estimation for Unseen Objects Made Easy
Weijian Deng, Dylan Campbell, Chunyi Sun et al.
Learning to Communicate Through Implicit Communication Channels
Han Wang, Binbin Chen, zhang et al.
Language Models Can Predict Their Own Behavior
Dhananjay Ashok, Jonathan May
Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection
Boyong He, Yuxiang Ji, Qianwen Ye et al.
OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad
Luyao Tang, Chaoqi Chen, Yuxuan Yuan et al.
Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting
ChengAo Shen, Wenchao Yu, Ziming Zhao et al.
Infer Human’s Intentions Before Following Natural Language Instructions
Yanming Wan, Yue Wu, Yiping Wang et al.
Learning Visual Generative Priors without Text
Shuailei Ma, Kecheng Zheng, Ying Wei et al.
STAR: Stability-Inducing Weight Perturbation for Continual Learning
Masih Eskandar, Tooba Imtiaz, Davin Hill et al.
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li, Zichun Yu, Chenyan Xiong
MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views
Antoine Guédon, Tomoki Ichikawa, Kohei Yamashita et al.
ProbeSDF: Light Field Probes For Neural Surface Reconstruction
Briac Toussaint, Diego Thomas, Jean-Sébastien Franco
HUSH: Holistic Panoramic 3D Scene Understanding using Spherical Harmonics
Jongsung Lee, HARIN PARK, Byeong-Uk Lee et al.
Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation
Seyedreza Mohseni, Seyedali Mohammadi, Deepa Tilwani et al.
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
Juan A. Rodriguez, Xiangru Jian, Siba Smarak Panigrahi et al.
Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology
Wenhao Tang, Rong Qin, Heng Fang et al.
Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
Wentao Guo, Jikai Long, Yimeng Zeng et al.
Dynamic Stereotype Theory Induced Micro-expression Recognition with Oriented Deformation
Bohao Zhang, Xuejiao Wang, Changbo Wang et al.
Memorize and Rank: Elevating Large Language Models for Clinical Diagnosis Prediction
Mingyu Derek Ma, Xiaoxuan Wang, Yijia Xiao et al.
LUCAS: Layered Universal Codec Avatars
Di Liu, Teng Deng, Giljoo Nam et al.
FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing
Yufan Ren, Zicong Jiang, Tong Zhang et al.
Rethinking Fair Representation Learning for Performance-Sensitive Tasks
Charles Jones, Fabio De Sousa Ribeiro, Mélanie Roschewitz et al.
Improving Adversarial Transferability on Vision Transformers via Forward Propagation Refinement
Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.
DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?
Tianhong Zhou, xu yin, Yingtao Zhu et al.
Accelerating 3D Molecule Generation via Jointly Geometric Optimal Transport
Haokai Hong, Wanyu LIN, KC Tan
Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis
Yunwei Ren, Jason Lee
Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study
Xingxuan Zhang, Haoran Wang, Jiansheng Li et al.
CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models
Dongfang Li, Zetian Sun, Xinshuo Hu et al.
HeGTa: Leveraging Heterogeneous Graph-enhanced Large Language Models for Few-shot Complex Table Understanding
Rihui Jin, Yu Li, Guilin Qi et al.
Detection-Friendly Nonuniformity Correction: A Union Framework for Infrared UAV Target Detection
Houzhang Fang, Xiaolin Wang, Zengyang Li et al.
CAT: Content-Adaptive Image Tokenization
Junhong Shen, Kushal Tirumala, Michihiro Yasunaga et al.
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
Bingquan Dai, Luo Li, Qihong Tang et al.
Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections
Xiaomeng Xu, Yifan Hou, Zeyi Liu et al.
From Probability to Counterfactuals: the Increasing Complexity of Satisfiability in Pearl's Causal Hierarchy
Julian Dörfler, Benito van der Zander, Markus Bläser et al.
Pushing the Limits of All-Atom Geometric Graph Neural Networks: Pre-Training, Scaling, and Zero-Shot Transfer
Zihan Pengmei, Zhengyuan Shen, Zichen Wang et al.
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Zhanzhong Pang, Fadime Sener, Angela Yao
Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification
Jiayu Jiang, Changxing Ding, Wentao Tan et al.
Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing
Zhedong Zhang, Liang Li, Chenggang Yan et al.
CITI: Enhancing Tool Utilizing Ability in Large Language Models Without Sacrificing General Performance
Yupu Hao, Pengfei Cao, Zhuoran Jin et al.
On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach
Baoshun Tong, Hanjiang Lai, Yan Pan et al.
3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement
Yihang Luo, Shangchen Zhou, Yushi Lan et al.
Preference-Oriented Supervised Fine-Tuning: Favoring Target Model over Aligned Large Language Models
Yuchen Fan, Yuzhong Hong, Qiushi Wang et al.
VLMaterial: Procedural Material Generation with Large Vision-Language Models
Beichen Li, Rundi Wu, Armando Solar-Lezama et al.
Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization
lingyun zhang, Yu Xie, Yanwei Fu et al.
SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision
Kangjie Zheng, Siyue Liang, Junwei Yang et al.
Predicting Empirical AI Research Outcomes with Language Models
Jiaxin Wen, Chenglei Si, Yueh-Han Chen et al.
Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space
Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.
OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging
Yijie Tang, Jiazhao Zhang, Yuqing Lan et al.
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
Xinghao Wang, Pengyu Wang, Bo Wang et al.
GLoRa: A Benchmark to Evaluate the Ability to Learn Long-Range Dependencies in Graphs
Dongzhuoran Zhou, Evgeny Kharlamov, Egor Kostylev
CoMBO: Conflict Mitigation via Branched Optimization for Class Incremental Segmentation
Kai Fang, Anqi Zhang, Guangyu Gao et al.
Question-Aware Gaussian Experts for Audio-Visual Question Answering
Hongyeob Kim, Inyoung Jung, Dayoon Suh et al.
Revisiting Source-Free Domain Adaptation: a New Perspective via Uncertainty Control
Gezheng Xu, Hui GUO, Li Yi et al.
Finding Local Diffusion Schrödinger Bridge using Kolmogorov-Arnold Network
Xingyu Qiu, Mengying Yang, Xinghua Ma et al.
Object-aware Sound Source Localization via Audio-Visual Scene Understanding
Sung Jin Um, Dongjin Kim, Sangmin Lee et al.
LNS2+RL: Combining Multi-agent Reinforcement Learning with Large Neighborhood Search in Multi-agent Path Finding
Yutong Wang, Tanishq Duhan, Jiaoyang Li et al.
Many-Objective Multi-Solution Transport
Ziyue Li, Tian Li, Virginia Smith et al.
PanDA: Towards Panoramic Depth Anything with Unlabeled Panoramas and Mobius Spatial Augmentation
Zidong Cao, Jinjing Zhu, Weiming Zhang et al.
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias
Rui Lu, Runzhe Wang, Kaifeng Lyu et al.
Cropper: Vision-Language Model for Image Cropping through In-Context Learning
Seung Hyun Lee, Jijun jiang, Yiran Xu et al.
FedSA: A Unified Representation Learning via Semantic Anchors for Prototype-based Federated Learning
Yanbing Zhou, Xiangmou Qu, Chenlong You et al.
Model-Free Offline Reinforcement Learning with Enhanced Robustness
Chi Zhang, Zain Ulabedeen Farhat, George Atia et al.
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Woojung Han, Yeonkyung Lee, Chanyoung Kim et al.
Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input
Jian Wang, Rishabh Dabral, Diogo Luvizon et al.
ProtCLIP: Function-Informed Protein Multi-Modal Learning
Hanjing Zhou, Mingze Yin, Wei Wu et al.
Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion
Minkyoung Cho, Yulong Cao, Jiachen Sun et al.
A Simple Approach to Unifying Diffusion-based Conditional Generation
Xirui Li, Charles Herrmann, Kelvin Chan et al.
Beyond FVD: An Enhanced Evaluation Metrics for Video Generation Distribution Quality
Ge Ya Luo, Gian M Favero, Zhi Hao Luo et al.
Specifying What You Know or Not for Multi-Label Class-Incremental Learning
Aoting Zhang, Dongbao Yang, Chang Liu et al.
Lawma: The Power of Specialization for Legal Annotation
Ricardo Dominguez-Olmedo, Vedant Nanda, Rediet Abebe et al.
Can Students Beyond the Teacher? Distilling Knowledge from Teacher’s Bias
Jianhua Zhang, Yi Gao, Ruyu Liu et al.
Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions
Ofir Gaash, Kfir Y. Levy, Yair Carmon
IterIS: Iterative Inference-Solving Alignment for LoRA Merging
Hongxu chen, Zhen Wang, Runshi Li et al.
Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
Klaus-Rudolf Kladny, Bernhard Schölkopf, Michael Muehlebach
Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling
Sirui Li, Wenbin Ouyang, Yining Ma et al.
When Are Concepts Erased From Diffusion Models?
Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.
Feature Clipping for Uncertainty Calibration
Linwei Tao, Minjing Dong, Chang Xu
EchoONE: Segmenting Multiple Echocardiography Planes in One Model
Jiongtong Hu, Wei Zhuo, Jun Cheng et al.
LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models
Yu Cheng, Fajie Yuan
Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
Gensheng Pei, Tao Chen, Yujia Wang et al.
Poplar: Efficient Scaling of Distributed DNN Training on Heterogeneous GPU Clusters
WenZheng Zhang, Yang Hu, Jing Shi et al.
Skip-Vision: Efficient and Scalable Acceleration of Vision-Language Models via Adaptive Token Skipping
Weili Zeng, Ziyuan Huang, Kaixiang Ji et al.
MOSCATO: Predicting Multiple Object State Change Through Actions
Parnian Zameni, Yuhan Shen, Ehsan Elhamifar
The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition
Shuai Yuan, Xingshuo Han, Hongwei Li et al.
Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding
Dekai Zhu, Yixuan Hu, Youquan Liu et al.
Improved Balanced Classification with Theoretically Grounded Loss Functions
Corinna Cortes, Mehryar Mohri, Yutao Zhong
Difficulty-aware Balancing Margin Loss for Long-tailed Recognition
Minseok Son, Inyong Koo, Jinyoung Park et al.
Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models
Linh Tran, Wei Sun, Stacy Patterson et al.
Uncertainty Quantification with the Empirical Neural Tangent Kernel
Joseph Wilson, Chris van der Heide, Liam Hodgkinson et al.
SEPARATE: A Simple Low-rank Projection for Gradient Compression in Modern Large-scale Model Training Process
Hanzhen Zhao, Xingyu Xie, Cong Fang et al.
Time-o1: Time-Series Forecasting Needs Transformed Label Alignment
Hao Wang, Licheng Pan, Zhichao Chen et al.
Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization
Yeji Song, Jimyeong Kim, Wonhark Park et al.
Interpretable Generative Models through Post-hoc Concept Bottlenecks
Akshay R. Kulkarni, Ge Yan, Chung-En Sun et al.
Optimal Non-Asymptotic Rates of Value Iteration for Average-Reward Markov Decision Processes
Jongmin Lee, Ernest Ryu
Uncertainty and Influence aware Reward Model Refinement for Reinforcement Learning from Human Feedback
Zexu Sun, Yiju Guo, Yankai Lin et al.
Novel View Synthesis with Pixel-Space Diffusion Models
Noam Elata, Bahjat Kawar, Yaron Ostrovsky-Berman et al.
Distilled Prompt Learning for Incomplete Multimodal Survival Prediction
Yingxue Xu, Fengtao ZHOU, Chenyu Zhao et al.
Exact Expressive Power of Transformers with Padding
Will Merrill, Ashish Sabharwal
BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models
Dingqiang Ye, Chao Fan, Zhanbo Huang et al.
Reverse Diffusion Sequential Monte Carlo Samplers
Luhuan Wu, Yi Han, Christian Andersson Naesseth et al.
Progressive Compression with Universally Quantized Diffusion Models
Yibo Yang, Justus Will, Stephan Mandt
Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks
Nikolaos Tsilivis, Gal Vardi, Julia Kempe
Multilevel neural simulation-based inference
Yuga Hikida, Ayush Bharti, Niall Jeffrey et al.
Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning
Qi Wang, Zhipeng Zhang, Baao Xie et al.
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
Chengyu Du, Jinyi Han, Yizhou Ying et al.
Shape it Up! Restoring LLM Safety during Finetuning
ShengYun Peng, Pin-Yu Chen, Jianfeng Chi et al.
FedSPU: Personalized Federated Learning for Resource-Constrained Devices with Stochastic Parameter Update
Ziru Niu, Hai Dong, A. K. Qin
Multimodal Variational Autoencoder: A Barycentric View
Peijie Qiu, Wenhui Zhu, Sayantan Kumar et al.
Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning
Chenglu Sun, Shuo Shen, Wenzhi Tao et al.
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
Jinlai Liu, Jian Han, Bin Yan et al.
CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
Po-han Li, Sandeep Chinchali, ufuk topcu
A Generic Framework for Conformal Fairness
Aditya Vadlamani, Anutam Srinivasan, Pranav Maneriker et al.
Learning Normal Flow Directly From Events
Dehao Yuan, Levi Burner, Jiayi Wu et al.
SwitchLingua: The First Large-Scale Multilingual and Multi-Ethnic Code-Switching Dataset
Peng Xie, Xingyuan Liu, Yequan Bie et al.
A Simple Graph Contrastive Learning Framework for Short Text Classification
Yonghao Liu, Fausto Giunchiglia, Lan Huang et al.
SweetTok: Semantic-Aware Spatial-Temporal Tokenizer for Compact Video Discretization
Zhentao Tan, Ben Xue, Jian Jia et al.
DeblurDiff: Real-World Image Deblurring with Generative Diffusion Models
Lingshun Kong, Jiawei Zhang, Dongqing Zou et al.
Rectifying Magnitude Neglect in Linear Attention
Qihang Fan, Huaibo Huang, Yuang Ai et al.
Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation
Kaining Ying, Henghui Ding, Guangquan Jie et al.
Subgraph Aggregation for Out-of-Distribution Generalization on Graphs
Bowen Liu, Haoyang Li, Shuning Wang et al.
MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks
Nayoung Kim, Seongsu Kim, Minsu Kim et al.
Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions
Xiaoran Jiao, Weian Mao, Wengong Jin et al.