Most Cited 2025 "private estimators" Papers
22,274 papers found • Page 29 of 112
Conference
JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers
Kwon Byung-Ki, Qi Dai, Lee Hyoseok et al.
Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
Xiaoqi Li, Lingyun Xu, Mingxu Zhang et al.
Transformer brain encoders explain human high-level visual responses
Hossein Adeli, Sun Minni, Nikolaus Kriegeskorte
Evaluating Vision-Language Models as Evaluators in Path Planning
Mohamed Aghzal, Xiang Yue, Erion Plaku et al.
GS2E: Gaussian Splatting is an Effective Data Generator for Event Stream Generation
Yuchen Li, Chaoran Feng, Zhenyu Tang et al.
Motion Synthesis with Sparse and Flexible Keyjoint Control
Inwoo Hwang, Jinseok Bae, Donggeun Lim et al.
FlashSloth : Lightning Multimodal Large Language Models via Embedded Visual Compression
Bo Tong, Bokai Lai, Yiyi Zhou et al.
Secret Lies in Color: Enhancing AI-Generated Images Detection with Color Distribution Analysis
Zexi Jia, Chuanwei Huang, Yeshuang Zhu et al.
On the Convergence of Projected Policy Gradient for Any Constant Step Sizes
Jiacai Liu, Wenye Li, Dachao Lin et al.
Fractal Calibration for Long-tailed Object Detection
Konstantinos Alexandridis, Ismail Elezi, Jiankang Deng et al.
Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels
Qiming Xia, Wenkai Lin, Haoen Xiang et al.
U-Know-DiffPAN: An Uncertainty-aware Knowledge Distillation Diffusion Framework with Details Enhancement for PAN-Sharpening
Sungpyo Kim, Jeonghyeok Do, Jaehyup Lee et al.
iSegMan: Interactive Segment-and-Manipulate 3D Gaussians
Yian Zhao, Wanshi Xu, Ruochong Zheng et al.
Scalable Autoregressive Monocular Depth Estimation
Jinhong Wang, Jintai Chen, Jian liu et al.
VGGSounder: Audio-Visual Evaluations for Foundation Models
Daniil Zverev, Thaddäus Wiedemer, Ameya Prabhu et al.
WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression
Yu Mao, Jun Wang, Nan Guan et al.
OuroMamba: A Data-Free Quantization Framework for Vision Mamba
Akshat Ramachandran, Mingyu Lee, Huan Xu et al.
Riemannian Flow Matching for Brain Connectivity Matrices via Pullback Geometry
Antoine Collas, Ce Ju, Nicolas Salvy et al.
Unbiased Video Scene Graph Generation via Visual and Semantic Dual Debiasing
Yanjun Li, Zhaoyang Li, Honghui Chen et al.
ExGra-Med: Extended Context Graph Alignment for Medical Vision-Language Models
Duy M. H. Nguyen, Nghiem Diep, Trung Nguyen et al.
DocThinker: Explainable Multimodal Large Language Models with Rule-based Reinforcement Learning for Document Understanding
Wenwen Yu, Zhibo Yang, Yuliang Liu et al.
Towards Robust Parameter-Efficient Fine-Tuning for Federated Learning
Xiuwen Fang, Mang Ye
Precise Action-to-Video Generation Through Visual Action Prompts
Yuang Wang, Chao Wen, Haoyu Guo et al.
MeshGen: Generating PBR Textured Mesh with Render-Enhanced Auto-Encoder and Generative Data Augmentation
Zilong Chen, Yikai Wang, Wenqiang Sun et al.
Radio Frequency Ray Tracing with Neural Object Representation for Enhanced RF Modeling
Xingyu Chen, Zihao Feng, Kun Qian et al.
On the Loss of Context Awareness in General Instruction Fine-tuning
Yihan Wang, Andrew Bai, Nanyun Peng et al.
Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation
Xiaoying Xing, Avinab Saha, Junfeng He et al.
Estimation and Inference in Distributional Reinforcement Learning
Liangyu Zhang, Yang Peng, Jiadong Liang et al.
Balanced Image Stylization with Style Matching Score
Yuxin Jiang, Liming Jiang, Shuai Yang et al.
Introducing FOReCAst: The Future Outcome Reasoning and Confidence Assessment Benchmark
Zhangdie Yuan, Zifeng Ding, Andreas Vlachos
Reasoning to Attend: Try to Understand How <SEG> Token Works
Rui Qian, Xin Yin, Dejing Dou
VertexRegen: Mesh Generation with Continuous Level of Detail
Xiang Zhang, Yawar Siddiqui, Armen Avetisyan et al.
MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models
Young-Jun Lee, Byung-Kwan Lee, Jianshu Zhang et al.
VoteFlow: Enforcing Local Rigidity in Self-Supervised Scene Flow
Yancong Lin, Shiming Wang, Liangliang Nan et al.
Learning Affine Correspondences by Integrating Geometric Constraints
Pengju Sun, Banglei Guan, Zhenbao Yu et al.
Neural Collapse is Globally Optimal in Deep Regularized ResNets and Transformers
Peter Súkeník, Christoph Lampert, Marco Mondelli
PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning
Song Wang, Xiaolu Liu, Lingdong Kong et al.
Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation
David Heineman, Valentin Hofmann, Ian Magnusson et al.
VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors
Juil Koo, Paul Guerrero, Chun-Hao P. Huang et al.
Free-Form Motion Control: Controlling the 6D Poses of Camera and Objects in Video Generation
Xincheng Shuai, Henghui Ding, Zhenyuan Qin et al.
Towards All-in-One Medical Image Re-Identification
Yuan Tian, Kaiyuan Ji, Rongzhao Zhang et al.
Know What You Don't Know: Uncertainty Calibration of Process Reward Models
Young-Jin Park, Kristjan Greenewald, Kaveh Alimohammadi et al.
dFLMoE: Decentralized Federated Learning via Mixture of Experts for Medical Data Analysis
Luyuan Xie, Tianyu Luan, Wenyuan Cai et al.
ATP: Adaptive Threshold Pruning for Efficient Data Encoding in Quantum Neural Networks
Mohamed Afane, Gabrielle Ebbrecht, Ying Wang et al.
HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver
Cong Wei, Haoxian Tan, Yujie Zhong et al.
HAMSt3R: Human-Aware Multi-view Stereo 3D Reconstruction
Sara Rojas Martinez, Matthieu Armando, Bernard Ghanem et al.
Dynamic Derivation and Elimination: Audio Visual Segmentation with Enhanced Audio Semantics
Chen Liu, Liying Yang, Peike Li et al.
Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation
Qinghe Ma, Jian Zhang, Zekun Li et al.
Precise Event Spotting in Sports Videos: Solving Long-Range Dependency and Class Imbalance
Sanchayan Santra, Vishal Chudasama, Pankaj Wasnik et al.
EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving
Shihan Dou, Ming Zhang, Chenhao Huang et al.
$\texttt{BetaConform}$: Efficient MAP Estimation of LLM Ensemble Judgment Performance with Prior Transfer
Huaizhi Qu, Inyoung Choi, Zhen Tan et al.
Hierarchical Implicit Neural Emulators
Ruoxi Jiang, Xiao Zhang, Karan Jakhar et al.
DualEqui: A Dual-Space Hierarchical Equivariant Network for Large Biomolecules
Junjie Xu, Jiahao Zhang, Mangal Prakash et al.
Learning to Highlight Audio by Watching Movies
Chao Huang, Ruohan Gao, J. M. F. Tsang et al.
SAM4D: Segment Anything in Camera and LiDAR Streams
Jianyun Xu, Song Wang, Ziqian Ni et al.
KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation
Jiajun Shi, Jian Yang, Jiaheng Liu et al.
Flexible MOF Generation with Torsion-Aware Flow Matching
Nayoung Kim, Seongsu Kim, Sungsoo Ahn
SparseRecon: Neural Implicit Surface Reconstruction from Sparse Views with Feature and Depth Consistencies
Liang Han, Xu Zhang, Haichuan Song et al.
Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras
Lingdong Kong, Dongyue Lu, Alan Liang et al.
Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval
Boseung Jeong, Jicheol Park, Sungyeon Kim et al.
TrajAgent: An LLM-Agent Framework for Trajectory Modeling via Large-and-Small Model Collaboration
Yuwei Du, Jie Feng, Jie Zhao et al.
Full-DoF Egomotion Estimation for Event Cameras Using Geometric Solvers
Ji Zhao, Banglei Guan, Zibin Liu et al.
MoEdit: On Learning Quantity Perception for Multi-object Image Editing
Yanfeng Li, Ka-Hou Chan, Yue Sun et al.
Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras
Hoonhee Cho, Jae-Young Kang, Youngho Kim et al.
Universal Scene Graph Generation
Shengqiong Wu, Hao Fei, Tat-seng Chua
SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost
Haiyang Mei, Pengyu Zhang, Mike Zheng Shou
FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design
Asal Mehradfar, Xuzhe Zhao, Yilun Huang et al.
ForestLPR: LiDAR Place Recognition in Forests Attentioning Multiple BEV Density Images
Yanqing Shen, Turcan Tuna, Marco Hutter et al.
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman, Haiwen Feng, Michael J. Black et al.
OptiScene: LLM-driven Indoor Scene Layout Generation via Scaled Human-aligned Data Synthesis and Multi-Stage Preference Optimization
Yixuan Yang, Zhen Luo, Tongsheng Ding et al.
Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
Jitesh Jain, Zhengyuan Yang, Humphrey Shi et al.
SViM3D: Stable Video Material Diffusion for Single Image 3D Generation
Andreas Engelhardt, Mark Boss, Vikram Voleti et al.
DistillDrive: End-to-End Multi-Mode Autonomous Driving Distillation by Isomorphic Hetero-Source Planning Model
Rui Yu, Xianghang Zhang, Runkai Zhao et al.
A Token-level Text Image Foundation Model for Document Understanding
Tongkun Guan, Zining Wang, Pei Fu et al.
GS-ID: Illumination Decomposition on Gaussian Splatting via Adaptive Light Aggregation and Diffusion-Guided Material Priors
Kang DU, Zhihao Liang, Yulin Shen et al.
Thought Communication in Multiagent Collaboration
Yujia Zheng, Zhuokai Zhao, Zijian Li et al.
GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields through Efficient Dense 3D Point Tracking
Weikang Bian, Zhaoyang Huang, Xiaoyu Shi et al.
DEAL: Data-Efficient Adversarial Learning for High-Quality Infrared Imaging
Zhu Liu, Zijun Wang, Jinyuan Liu et al.
Are Images Indistinguishable to Humans Also Indistinguishable to Classifiers?
Zebin You, Xinyu Zhang, Hanzhong Guo et al.
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement
Teng Hu, Zhentao Yu, Zhengguang Zhou et al.
DH-FaceVid-1K: A Large-Scale High-Quality Dataset for Face Video Generation
Donglin Di, He Feng, Wenzhang SUN et al.
Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models
Sangwon Baik, Hyeonwoo Kim, Hanbyul Joo
GuardSplat: Efficient and Robust Watermarking for 3D Gaussian Splatting
Zixuan Chen, Guangcong Wang, Jiahao Zhu et al.
MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network
Jianfei Jiang, Qiankun Liu, Haochen Yu et al.
SMoLoRA: Exploring and Defying Dual Catastrophic Forgetting in Continual Visual Instruction Tuning
Ziqi Wang, Chang Che, Qi Wang et al.
Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework
Laura Kopf, Nils Feldhus, Kirill Bykov et al.
EAMamba: Efficient All-Around Vision State Space Model for Image Restoration
Yu-Cheng Lin, Yu-Syuan Xu, Hao-Wei Chen et al.
7DGS: Unified Spatial-Temporal-Angular Gaussian Splatting
Zhongpai Gao, Benjamin Planche, Meng Zheng et al.
EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Network
Michael Arbel, David Salinas, Frank Hutter
HyperLoRA: Parameter-Efficient Adaptive Generation for Portrait Synthesis
Mengtian Li, Jinshu Chen, Wanquan Feng et al.
SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models
Emil Biju, Shayan Talaei, Zhemin Huang et al.
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Xin Wen, Bingchen Zhao, Yilun Chen et al.
CAVIS: Context-Aware Video Instance Segmentation
Seunghun Lee, Jiwan Seo, Kiljoon Han et al.
Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization
Cong Wang, Zexuan Deng, Zhiwei Jiang et al.
ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
Shaofeng Yin, Ting Lei, Yang Liu
On Inductive Biases That Enable Generalization in Diffusion Transformers
Jie An, De Wang, Pengsheng Guo et al.
StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting
Shakiba Kheradmand, Delio Vicini, George Kopanas et al.
TurboTrain: Towards Efficient and Balanced Multi-Task Learning for Multi-Agent Perception and Prediction
Zewei Zhou, Zhihao Zhao, Tianhui Cai et al.
Rectified Point Flow: Generic Point Cloud Pose Estimation
Tao Sun, Liyuan Zhu, Shengyu Huang et al.
BIGS: Bimanual Category-agnostic Interaction Reconstruction from Monocular Videos via 3D Gaussian Splatting
Jeongwan On, Kyeonghwan Gwak, Gunyoung Kang et al.
SP2T: Sparse Proxy Attention for Dual-stream Point Transformer
Jiaxu Wan, Hong Zhang, Ziqi He et al.
Simplification Is All You Need against Out-of-Distribution Overconfidence
Keke Tang, Chao Hou, Weilong Peng et al.
On Denoising Walking Videos for Gait Recognition
Dongyang Jin, Chao Fan, Jingzhe Ma et al.
Exploring Diffusion Transformer Designs via Grafting
Keshigeyan Chandrasegaran, Michael Poli, Dan Fu et al.
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
Rui Zhao, Weijia Mao, Mike Zheng Shou
GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation
Ziqin Huang, Gu Wang, Chenyangguang Zhang et al.
SVG-IR: Spatially-Varying Gaussian Splatting for Inverse Rendering
Hanxiao Sun, Yupeng Gao, Jin Xie et al.
CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design
Weitao Feng, Hang Zhou, Jing Liao et al.
Robust Multi-View Learning via Representation Fusion of Sample-Level Attention and Alignment of Simulated Perturbation
Jie Xu, Na Zhao, Gang Niu et al.
Learning to Unlearn while Retaining: Combating Gradient Conflicts in Machine Unlearning
Gaurav Patel, Qiang Qiu
The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training
Weize Chen, Jiarui yuan, Jin Tailin et al.
Multi-Label Prototype Visual Spatial Search for Weakly Supervised Semantic Segmentation
Songsong Duan, Xi Yang, Nannan Wang
Flexible Language Modeling in Continuous Space with Transformer-based Autoregressive Flows
Ruixiang Zhang, Shuangfei Zhai, Jiatao Gu et al.
Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks
Daniel Kunin, Giovanni Luca Marchetti, Feng Chen et al.
NeurIPT: Foundation Model for Neural Interfaces
Zitao Fang, Chenxuan Li, Hongting Zhou et al.
Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms
Hiroshi Kera, Nico Pelleriti, Yuki Ishihara et al.
Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations
Chaofan Gan, Yuanpeng Tu, Xi Chen et al.
Learning to Generalize without Bias for Open-Vocabulary Action Recognition
Yating Yu, Congqi Cao, Yifan Zhang et al.
USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting
Kang Chen, Jiyuan Zhang, Zecheng Hao et al.
DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing
Yufei Huang, Bangyan Liao, Yuqi Hu et al.
Enhancing Online Continual Learning with Plug-and-Play State Space Model and Class-Conditional Mixture of Discretization
Sihao Liu, Yibo Yang, Xiaojie Li et al.
Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation
Xiang Li, Zixuan Huang, Anh Thai et al.
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models
Yulei Qin, Gang Li, Zongyi Li et al.
Learning Linear Attention in Polynomial Time
Morris Yau, Ekin Akyürek, Jiayuan Mao et al.
HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation
Hongye Cheng, Tianyu Wang, guangsi shi et al.
Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective
Yang Zhang, Xinran Li, Jianing Ye et al.
Gatekeeper: Improving Model Cascades Through Confidence Tuning
Stephan Rabanser, Nathalie Rauschmayr, Achin Kulshrestha et al.
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
Rushi Qiang, Yuchen Zhuang, Yinghao Li et al.
UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming
Hao Lin, Ke Wu, Jie Li et al.
Anchored Diffusion Language Model
Litu Rout, Constantine Caramanis, Sanjay Shakkottai
Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling
Mónika Farsang, Radu Grosu
INS-MMBench: A Comprehensive Benchmark for Evaluating LVLMs' Performance in Insurance
Chenwei Lin, Hanjia Lyu, Xian Xu et al.
The Devil is in Low-Level Features for Cross-Domain Few-Shot Segmentation
Yuhan Liu, Yixiong Zou, Yuhua Li et al.
Let Humanoids Hike! Integrative Skill Development on Complex Trails
Kwan-Yee Lin, Stella X. Yu
NuiScene: Exploring Efficient Generation of Unbounded Outdoor Scenes
Han-Hung Lee, Qinghong Han, Angel Chang
Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors
Zhengfei Kuang, Tianyuan Zhang, Kai Zhang et al.
QuickSplat: Fast 3D Surface Reconstruction via Learned Gaussian Initialization
Yueh-Cheng Liu, Lukas Höllein, Matthias Nießner et al.
ETA: Efficiency through Thinking Ahead, A Dual Approach to Self-Driving with Large Models
Shadi Hamdan, Chonghao Sima, Zetong Yang et al.
Modeling Microenvironment Trajectories on Spatial Transcriptomics with NicheFlow
Kristiyan Sakalyan, Alessandro Palma, Filippo Guerranti et al.
Enhancing Partially Relevant Video Retrieval with Hyperbolic Learning
Jun Li, Jinpeng Wang, Chaolei Tan et al.
VideoLLaMB: Long Streaming Video Understanding with Recurrent Memory Bridges
Yuxuan Wang, Yiqi Song, Cihang Xie et al.
Better Language Model Inversion by Compactly Representing Next-Token Distributions
Murtaza Nazir, Matthew Finlayson, John Morris et al.
Learning single index models via harmonic decomposition
Nirmit Joshi, Hugo Koubbi, Theodor Misiakiewicz et al.
Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions
Siqi Kou, Qingyuan Tian, Hanwen Xu et al.
Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning
Jiashun Liu, Zihao Wu, Johan Obando Ceron et al.
Vision Function Layer in Multimodal LLMs
Cheng Shi, Yizhou Yu, Sibei Yang
LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery
Jerome Quenum, Wen-Han Hsieh, Tsung-Han (Patrick) Wu et al.
Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation
Yiming Qin, Zhu Xu, Yang Liu
Interpretable Global Minima of Deep ReLU Neural Networks on Sequentially Separable Data
Thomas Chen, Patricia Muñoz Ewald
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
Bryan Sangwoo Kim, Jeongsol Kim, Jong Chul Ye
FreeScene: Mixed Graph Diffusion for 3D Scene Synthesis from Free Prompts
Tongyuan Bai, Wangyuanfan Bai, Dong Chen et al.
Taste More, Taste Better: Diverse Data and Strong Model Boost Semi-Supervised Crowd Counting
Maochen Yang, Zekun Li, Jian Zhang et al.
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo, Yong Guo, Xuehui Yu et al.
Neural Thermodynamics: Entropic Forces in Deep and Universal Representation Learning
Liu Ziyin, Yizhou Xu, Isaac Chuang
Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations
Haitong Liu, Kuofeng Gao, Yang Bai et al.
Deterministic Image-to-Image Translation via Denoising Brownian Bridge Models with Dual Approximators
Bohan Xiao, PEIYONG WANG, Qisheng He et al.
Balanced Direction from Multifarious Choices: Arithmetic Meta-Learning for Domain Generalization
Xiran Wang, Jian Zhang, Lei Qi et al.
Optimized Minimal 3D Gaussian Splatting
Joo Chan Lee, Jong Hwan Ko, Eunbyung Park
SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception
Yaniv Benny, Lior Wolf
Permissioned LLMs: Enforcing Access Control in Large Language Models
Bargav Jayaraman, Virendra Marathe, Hamid Mozaffari et al.
Single Domain Generalization for Few-Shot Counting via Universal Representation Matching
Xianing Chen, Si Huo, Borui Jiang et al.
ZeroSep: Separate Anything in Audio with Zero Training
Chao Huang, Yuesheng Ma, Junxuan Huang et al.
Gene Regulatory Network Inference in the Presence of Selection Bias and Latent Confounders
Gongxu Luo, Haoyue Dai, Longkang Li et al.
End-to-End Implicit Neural Representations for Classification
Alexander Gielisse, Jan van Gemert
PTDiffusion: Free Lunch for Generating Optical Illusion Hidden Pictures with Phase-Transferred Diffusion Model
Xiang Gao, Shuai Yang, Jiaying Liu
System Prompt Optimization with Meta-Learning
Yumin Choi, Jinheon Baek, Sung Ju Hwang
StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold
Zhizhong Li, Sina Sajadmanesh, Jingtao Li et al.
Generative Zoo
Tomasz Niewiadomski, Anastasios Yiannakidis, Hanz Cuevas Velasquez et al.
AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees
Yangning Li, Shaoshen Chen, Yinghui Li et al.
Steering Generative Models with Experimental Data for Protein Fitness Optimization
Jason Yang, Wenda Chu, Daniel Khalil et al.
Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking
Qiangqiang Wu, Yi Yu, Chenqi Kong et al.
Understanding Multi-Task Activities from Single-Task Videos
Yuhan Shen, Ehsan Elhamifar
Brain-Informed Fine-Tuning for Improved Multilingual Understanding in Language Models
Anuja Negi, SUBBAREDDY OOTA, Anwar Nunez-Elizalde et al.
Characterizing the Expressivity of Fixed-Precision Transformer Language Models
Jiaoda Li, Ryan Cotterell
Spectral State Space Model for Rotation-Invariant Visual Representation Learning
Sahar Dastani, Ali Bahri, Moslem Yazdanpanah et al.
LightSwitch: Multi-view Relighting with Material-guided Diffusion
Yehonathan Litman, Fernando De la Torre, Shubham Tulsiani
URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration
Rui Xu, Yuzhen Niu, Yuezhou Li et al.
Mono3DVLT: Monocular-Video-Based 3D Visual Language Tracking
Hongkai Wei, YANG YANG, Shijie Sun et al.
$\texttt{G1}$: Teaching LLMs to Reason on Graphs with Reinforcement Learning
Xiaojun Guo, Ang Li, Yifei Wang et al.
Conditional Panoramic Image Generation via Masked Autoregressive Modeling
Chaoyang Wang, Xiangtai Li, Lu Qi et al.
ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting
Ruijie Zhu, Mulin Yu, Linning Xu et al.
CellVerse: Do Large Language Models Really Understand Cell Biology?
Fan Zhang, Tianyu Liu, Zhihong Zhu et al.
From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning
Eric Zhao, Pranjal Awasthi, Nika Haghtalab
DuoLoRA : Cycle-consistent and Rank-disentangled Content-Style Personalization
Aniket Roy, Shubhankar Borse, Shreya Kadambi et al.
Order-One Rolling Shutter Cameras
Marvin Anas Hahn, Kathlén Kohn, Orlando Marigliano et al.
Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation
Erfan Baghaei Potraghloo, Seyedarmin Azizi, Souvik Kundu et al.
PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution
Shian Du, Menghan Xia, Chang Liu et al.
Watermarking Autoregressive Image Generation
Nikola Jovanović, Ismail Labiad, Tomas Soucek et al.
Synchronized Video-to-Audio Generation via Mel Quantization-Continuum Decomposition
Juncheng Wang, Chao Xu, Cheng Yu et al.
Floxels: Fast Unsupervised Voxel Based Scene Flow Estimation
David T. Hoffmann, Syed Haseeb Raza, Hanqiu Jiang et al.
ResGS: Residual Densification of 3D Gaussian for Efficient Detail Recovery
Yanzhe Lyu, Kai Cheng, Kang Xin et al.
Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation
Mehrdad Noori, David OSOWIECHI, Gustavo Vargas Hakim et al.
Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
Jamie Wynn, Zawar Qureshi, Jakub Powierza et al.
Auto-Regressively Generating Multi-View Consistent Images
JiaKui Hu, Yuxiao Yang, Jialun Liu et al.
EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT
Baoqi Pei, Yifei Huang, Jilan Xu et al.
Dynamic Multimodal Prototype Learning in Vision-Language Models
Xingyu Zhu, Shuo Wang, Beier Zhu et al.
SSHNet: Unsupervised Cross-modal Homography Estimation via Problem Reformulation and Split Optimization
Junchen Yu, Siyuan Cao, Runmin Zhang et al.
PriorMotion: Generative Class-Agnostic Motion Prediction with Raster-Vector Motion Field Priors
Kangan Qian, Jinyu Miao, Xinyu Jiao et al.
EgoM2P: Egocentric Multimodal Multitask Pretraining
Gen Li, Yutong Chen, Yiqian Wu et al.
Bisecle: Binding and Separation in Continual Learning for Video Language Understanding
Yue Tan, Xiaoqian Hu, Hao Xue et al.