Most Cited 2025 "hierarchical network" Papers

22,274 papers found • Page 14 of 112

#2601

Multi-step Visual Reasoning with Visual Tokens Scaling and Verification

Tianyi Bai, Zengjie Hu, Fupeng Sun et al.

NEURIPS 2025 • poster • arXiv:2506.07235
11 citations
#2602

MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios

Jiacheng Ruan, Wenzhen Yuan, Zehao Lin et al.

AAAI 2025 • paper • arXiv:2409.16084
11 citations
#2603

RecFlow: An Industrial Full Flow Recommendation Dataset

Qi Liu, Kai Zheng, Rui Huang et al.

ICLR 2025 • poster • arXiv:2410.20868
11 citations
#2604

Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens

Xixian Yong, Xiao Zhou, Yingying Zhang et al.

NEURIPS 2025 • spotlight • arXiv:2505.18237
11 citations
#2605

The Optimization Landscape of SGD Across the Feature Learning Strength

Alexander Atanasov, Alexandru Meterez, James Simon et al.

ICLR 2025 • poster • arXiv:2410.04642
11 citations
#2606

Latent Thought Models with Variational Bayes Inference-Time Computation

Deqian Kong, Minglu Zhao, Dehong Xu et al.

ICML 2025 • poster • arXiv:2502.01567
11 citations
#2607

Training-free LLM-generated Text Detection by Mining Token Probability Sequences

Yihuai Xu, Yongwei Wang, Yifei Bi et al.

ICLR 2025 • oral • arXiv:2410.06072
11 citations
#2608

UniMLVG: Unified Framework for Multi-view Long Video Generation with Comprehensive Control Capabilities for Autonomous Driving

Rui Chen, Zehuan Wu, Yichen Liu et al.

ICCV 2025 • poster • arXiv:2412.04842
11 citations
#2609

ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Models

Yeji Park, Deokyeong Lee, Junsuk Choe et al.

AAAI 2025 • paper • arXiv:2408.13906
11 citations
#2610

EMOE: Modality-Specific Enhanced Dynamic Emotion Experts

Yiyang Fang, Wenke Huang, Guancheng Wan et al.

CVPR 2025 • poster
11 citations
#2611

Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance

Dimitris Oikonomou, Nicolas Loizou

ICLR 2025 • poster • arXiv:2406.04142
11 citations
#2612

Gazing Into Missteps: Leveraging Eye-Gaze for Unsupervised Mistake Detection in Egocentric Videos of Skilled Human Activities

Michele Mazzamuto, Antonino Furnari, Yoichi Sato et al.

CVPR 2025 • poster • arXiv:2406.08379
11 citations
#2613

Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation

Eliot Xing, Vernon Luk, Jean Oh

ICLR 2025 • poster • arXiv:2412.12089
11 citations
#2614

MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild

Xi Fang, Jiankun Wang, Xiaochen Cai et al.

ICCV 2025 • poster • arXiv:2411.11098
11 citations
#2615

Do-PFN: In-Context Learning for Causal Effect Estimation

Jake Robertson, Arik Reuter, Siyuan Guo et al.

NEURIPS 2025 • spotlight • arXiv:2506.06039
11 citations
#2616

Adapter Merging with Centroid Prototype Mapping for Scalable Class-Incremental Learning

Takuma Fukuda, Hiroshi Kera, Kazuhiko Kawamoto

CVPR 2025 • poster • arXiv:2412.18219
11 citations
#2617

GenPC: Zero-shot Point Cloud Completion via 3D Generative Priors

An Li, Zhe Zhu, Mingqiang Wei

CVPR 2025 • poster • arXiv:2502.19896
11 citations
#2618

Diff-Shadow: Global-guided Diffusion Model for Shadow Removal

Jinting Luo, Ru Li, Chengzhi Jiang et al.

AAAI 2025 • paper • arXiv:2407.16214
11 citations
#2619

Optimized Multi-Token Joint Decoding With Auxiliary Model for LLM Inference

Zongyue Qin, Ziniu Hu, Zifan He et al.

ICLR 2025 • poster • arXiv:2407.09722
11 citations
#2620

NeRFPrior: Learning Neural Radiance Field as a Prior for Indoor Scene Reconstruction

Wenyuan Zhang, Emily Yue-ting Jia, Junsheng Zhou et al.

CVPR 2025 • highlight • arXiv:2503.18361
11 citations
#2621

The Computational Complexity of Circuit Discovery for Inner Interpretability

Federico Adolfi, Martina G. Vilas, Todd Wareham

ICLR 2025 • poster • arXiv:2410.08025
11 citations
#2622

3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding

Tatiana Zemskova, Dmitry Yudin

ICCV 2025 • poster • arXiv:2412.18450
11 citations
#2623

Tri-Ergon: Fine-Grained Video-to-Audio Generation with Multi-Modal Conditions and LUFS Control

Bingliang Li, Fengyu Yang, Yuxin Mao et al.

AAAI 2025 • paper • arXiv:2412.20378
11 citations
#2624

GaussianUDF: Inferring Unsigned Distance Functions through 3D Gaussian Splatting

Shujuan Li, Yu-Shen Liu, Zhizhong Han

CVPR 2025 • highlight • arXiv:2503.19458
11 citations
#2625

EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model

Shengqi Dang, Yi He, Long Ling et al.

ICCV 2025 • poster • arXiv:2501.05710
11 citations
#2626

Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models

Zeman Li, Xinwei Zhang, Peilin Zhong et al.

ICLR 2025 • poster • arXiv:2410.06441
11 citations
#2627

ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models

Junzhe Chen, Tianshu Zhang, Shiyu Huang et al.

CVPR 2025 • poster • arXiv:2411.15268
11 citations
#2628

Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection

Yue Zhou, Xinan He, Kaiqing Lin et al.

NEURIPS 2025 • poster • arXiv:2506.00874
11 citations
#2629

TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation

Gihyun Kwon, Jong Chul Ye

ICLR 2025 • poster • arXiv:2410.05591
11 citations
#2630

When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning

Yang Liu, Qianqian Xu, Peisong Wen et al.

CVPR 2025 • poster • arXiv:2503.15096
11 citations
#2631

Interaction Asymmetry: A General Principle for Learning Composable Abstractions

Jack Brady, Julius von Kügelgen, Sebastien Lachapelle et al.

ICLR 2025 • poster • arXiv:2411.07784
11 citations
#2632

CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance

Yongchao Chen, Yilun Hao, Yueying Liu et al.

ICML 2025 • poster • arXiv:2502.04350
11 citations
#2633

Learning the RoPEs: Better 2D and 3D Position Encodings with STRING

Connor Schenck, Isaac Reid, Mithun Jacob et al.

ICML 2025 • spotlight • arXiv:2502.02562
11 citations
#2634

MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility

Wayne Wu, Honglin He, Jack He et al.

ICLR 2025 • poster • arXiv:2407.08725
11 citations
#2635

TopoNets: High performing vision and language models with brain-like topography

Mayukh Deb, Mainak Deb, Apurva Murty

ICLR 2025 • poster • arXiv:2501.16396
11 citations
#2636

DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation

Hongbin Lin, Zilu Guo, Yifan Zhang et al.

CVPR 2025 • poster • arXiv:2503.11122
11 citations
#2637

EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis

Sheng Miao, Jiaxin Huang, Dongfeng Bai et al.

CVPR 2025 • poster • arXiv:2503.20168
11 citations
#2638

Federated Learning with Sample-level Client Drift Mitigation

Haoran Xu, Jiaze Li, Wanyi Wu et al.

AAAI 2025 • paper • arXiv:2501.11360
11 citations
#2639

XTrack: Multimodal Training Boosts RGB-X Video Object Trackers

Yuedong Tan, Zongwei Wu, Yuqian Fu et al.

ICCV 2025 • poster • arXiv:2405.17773
11 citations
#2640

This Time is Different: An Observability Perspective on Time Series Foundation Models

Ben Cohen, Emaad Khwaja, Youssef Doubli et al.

NEURIPS 2025 • poster • arXiv:2505.14766
11 citations
#2641

BlockDance: Reuse Structurally Similar Spatio-Temporal Features to Accelerate Diffusion Transformers

Hui Zhang, Tingwei Gao, Jie Shao et al.

CVPR 2025 • poster • arXiv:2503.15927
11 citations
#2642

LBM: Latent Bridge Matching for Fast Image-to-Image Translation

Clément Chadebec, Onur Tasar, Sanjeev Sreetharan et al.

ICCV 2025 • highlight • arXiv:2503.07535
11 citations
#2643

NoT: Federated Unlearning via Weight Negation

Yasser Khalil, Leo Maxime Brunswic, Soufiane Lamghari et al.

CVPR 2025 • poster • arXiv:2503.05657
11 citations
#2644

KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning

Wei Sun, Wen Yang, Pu Jian et al.

NEURIPS 2025 • poster • arXiv:2505.16826
11 citations
#2645

DELTA: Pre-Train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment

Haitao Li, Qingyao Ai, Xinyan Han et al.

AAAI 2025 • paper • arXiv:2403.18435
11 citations
#2646

Planning in the Dark: LLM-Symbolic Planning Pipeline Without Experts

Sukai Huang, Nir Lipovetzky, Trevor Cohn

AAAI 2025 • paper • arXiv:2409.15915
11 citations
#2647

RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction

Peng Liu, Dongyang Dai, Zhiyong Wu

ICLR 2025 • poster • arXiv:2403.05010
11 citations
#2648

Horizon-GS: Unified 3D Gaussian Splatting for Large-Scale Aerial-to-Ground Scenes

Lihan Jiang, Kerui Ren, Mulin Yu et al.

CVPR 2025 • poster • arXiv:2412.01745
11 citations
#2649

Locality Alignment Improves Vision-Language Models

Ian Covert, Tony Sun, James Y Zou et al.

ICLR 2025 • poster • arXiv:2410.11087
11 citations
#2650

SLIP: Spoof-Aware One-Class Face Anti-Spoofing with Language Image Pretraining

Pei-Kai Huang, Jun-Xiong Chong, Cheng-Hsuan Chiang et al.

AAAI 2025 • paper • arXiv:2503.19982
11 citations
#2651

Enhancing Trustworthiness of Graph Neural Networks with Rank-Based Conformal Training

Ting Wang, Zhixin Zhou, Rui Luo

AAAI 2025 • paper • arXiv:2501.02767
11 citations
#2652

KnowPO: Knowledge-Aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models

Ruizhe Zhang, Yongxin Xu, Yuzhen Xiao et al.

AAAI 2025 • paper • arXiv:2408.03297
11 citations
#2653

Don’t Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models

Sohyun An, Ruochen Wang, Tianyi Zhou et al.

NEURIPS 2025 • poster
11 citations
#2654

h4rm3l: A Language for Composable Jailbreak Attack Synthesis

Moussa Koulako Bala Doumbouya, Ananjan Nandi, Gabriel Poesia et al.

ICLR 2025 • poster • arXiv:2408.04811
11 citations
#2655

CoRe: Benchmarking LLMs’ Code Reasoning Capabilities through Static Analysis Tasks

Danning Xie, Mingwei Zheng, Xuwei Liu et al.

NEURIPS 2025 • spotlight • arXiv:2507.05269
11 citations
#2656

Long-Sequence Recommendation Models Need Decoupled Embeddings

Ningya Feng, Junwei Pan, Jialong Wu et al.

ICLR 2025 • poster • arXiv:2410.02604
11 citations
#2657

Data Synthesis with Diverse Styles for Face Recognition via 3DMM-Guided Diffusion

Yuxi Mi, Zhizhou Zhong, Yuge Huang et al.

CVPR 2025 • poster • arXiv:2504.00430
11 citations
#2658

FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning

Gaojian Wang, Feng Lin, Tong Wu et al.

CVPR 2025 • poster • arXiv:2412.12032
11 citations
#2659

RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics

Jie Zhang, Cezara Petrui, Kristina Nikolić et al.

NEURIPS 2025 • poster • arXiv:2505.12575
11 citations
#2660

Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors

Fan Nie, Lan Feng, Haotian Ye et al.

COLM 2025 • paper • arXiv:2504.04785
11 citations
#2661

Multi-view Reconstruction via SfM-guided Monocular Depth Estimation

Haoyu Guo, He Zhu, Sida Peng et al.

CVPR 2025 • poster • arXiv:2503.14483
11 citations
#2662

TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning

Ge Li, Dong Tian, Hongyi Zhou et al.

ICLR 2025 • oral • arXiv:2410.09536
11 citations
#2663

FAMNet: Frequency-aware Matching Network for Cross-domain Few-shot Medical Image Segmentation

Yuntian Bo, Yazhou Zhu, Lunbo Li et al.

AAAI 2025 • paper • arXiv:2412.09319
11 citations
#2664

LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized Knowledge in Multimodal Large Language Models

Jian Liang, Wenke Huang, Guancheng Wan et al.

CVPR 2025 • poster • arXiv:2503.16843
11 citations
#2665

Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models in Multi-turn Interactions

Hao Yang, Lizhen Qu, Ehsan Shareghi et al.

COLM 2025 • paper
11 citations
#2666

Neither Valid nor Reliable? Investigating the Use of LLMs as Judges

Khaoula Chehbouni, Mohammed Haddou, Jackie CK Cheung et al.

NEURIPS 2025 • poster • arXiv:2508.18076
11 citations
#2667

Adaptive Prompting for Continual Relation Extraction: A Within-Task Variance Perspective

Minh Le, Tien Ngoc Luu, An Nguyen The et al.

AAAI 2025 • paper • arXiv:2412.08285
11 citations
#2668

What's the Move? Hybrid Imitation Learning via Salient Points

Priya Sundaresan, Hengyuan Hu, Quan Vuong et al.

ICLR 2025 • poster • arXiv:2412.05426
11 citations
#2669

FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks

Luca Della Libera, Francesco Paissan, Cem Subakan et al.

NEURIPS 2025 • poster • arXiv:2502.04465
11 citations
#2670

Sports-Traj: A Unified Trajectory Generation Model for Multi-Agent Movement in Sports

Yi Xu, Yun Fu

ICLR 2025 • oral • arXiv:2405.17680
11 citations
#2671

VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step

Hanyang Wang, Fangfu Liu, Jiawei Chi et al.

CVPR 2025 • highlight • arXiv:2504.01956
11 citations
#2672

Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation

Yongkang Li, Tianheng Cheng, Bin Feng et al.

CVPR 2025 • poster • arXiv:2412.04533
11 citations
#2673

Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding

Xianqiang Gao, Pingrui Zhang, Delin Qu et al.

AAAI 2025 • paper • arXiv:2408.13024
11 citations
#2674

BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations

Weixi Feng, Chao Liu, Sifei Liu et al.

CVPR 2025 • poster • arXiv:2501.07647
11 citations
#2675

Lightweight Neural App Control

Filippos Christianos, Georgios Papoudakis, Thomas Coste et al.

ICLR 2025 • poster • arXiv:2410.17883
11 citations
#2676

Zero-Shot Low-Light Image Enhancement via Latent Diffusion Models

Yan Huang, Xiaoshan Liao, Jinxiu Liang et al.

AAAI 2025 • paper
11 citations
#2677

VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning

Qingtao Liu, Yu Cui, Zhengnan Sun et al.

ICLR 2025 • poster
11 citations
#2678

From Words to Worth: Newborn Article Impact Prediction with LLM

Penghai Zhao, Qinghua Xing, Kairan Dou et al.

AAAI 2025 • paper • arXiv:2408.03934
11 citations
#2679

AI-Researcher: Autonomous Scientific Innovation

Jiabin Tang, Lianghao Xia, Zhonghang Li et al.

NEURIPS 2025 • spotlight • arXiv:2505.18705
11 citations
#2680

Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models

Pit Neitemeier, Björn Deiseroth, Constantin Eichenberg et al.

ICLR 2025 • poster • arXiv:2501.10322
11 citations
#2681

Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding

Yiming Zhang, Zhuokai Zhao, Zhaorun Chen et al.

ICCV 2025 • poster • arXiv:2411.14401
11 citations
#2682

Revisiting Tampered Scene Text Detection in the Era of Generative AI

Chenfan Qu, Yiwu Zhong, Fengjun Guo et al.

AAAI 2025 • paper • arXiv:2407.21422
11 citations
#2683

Lifting Motion to the 3D World via 2D Diffusion

Jiaman Li, Karen Liu, Jiajun Wu

CVPR 2025 • highlight • arXiv:2411.18808
11 citations
#2684

Understanding Emotional Body Expressions via Large Language Models

Haifeng Lu, Jiuyi Chen, Feng Liang et al.

AAAI 2025 • paper • arXiv:2412.12581
11 citations
#2685

SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters

Jianping Jiang, Weiye Xiao, Zhengyu Lin et al.

CVPR 2025 • poster • arXiv:2412.00174
11 citations
#2686

LiDAR-RT: Gaussian-based Ray Tracing for Dynamic LiDAR Re-simulation

Chenxu Zhou, Lvchang Fu, Sida Peng et al.

CVPR 2025 • poster • arXiv:2412.15199
11 citations
#2687

DifIISR: A Diffusion Model with Gradient Guidance for Infrared Image Super-Resolution

Xingyuan Li, Zirui Wang, Yang Zou et al.

CVPR 2025 • poster • arXiv:2503.01187
11 citations
#2688

RaCFormer: Towards High-Quality 3D Object Detection via Query-based Radar-Camera Fusion

Xiaomeng Chu, Jiajun Deng, Guoliang You et al.

CVPR 2025 • poster • arXiv:2412.12725
11 citations
#2689

Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs

Yingji Zhong, Zhihao Li, Dave Zhenyu Chen et al.

CVPR 2025 • highlight • arXiv:2503.05082
11 citations
#2690

IgGM: A Generative Model for Functional Antibody and Nanobody Design

Rubo Wang, Fandi Wu, Xingyu Gao et al.

ICLR 2025 • poster
11 citations
#2691

Toward Real-world BEV Perception: Depth Uncertainty Estimation via Gaussian Splatting

Shu-Wei Lu, Yi-Hsuan Tsai, Yi-Ting Chen

CVPR 2025 • poster • arXiv:2504.01957
11 citations
#2692

Deep Kernel Relative Test for Machine-generated Text Detection

Yiliao Song, Zhenqiao Yuan, Shuhai Zhang et al.

ICLR 2025 • poster
11 citations
#2693

DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response

Junjue Wang, Weihao Xuan, Heli Qi et al.

NEURIPS 2025 • oral • arXiv:2505.21089
11 citations
#2694

Audio-Visual Instance Segmentation

Ruohao Guo, Xianghua Ying, Yaru Chen et al.

CVPR 2025 • poster • arXiv:2310.18709
11 citations
#2695

Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models

Can Demircan, Tankred Saanum, Akshay Jagadish et al.

ICLR 2025 • oral • arXiv:2410.01280
11 citations
#2696

GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution

Fengxiang Wang, Mingshuo Chen, Yueying Li et al.

NEURIPS 2025 • spotlight • arXiv:2505.21375
11 citations
#2697

Integrated Augmented and Virtual Reality Technologies for Realistic Fire Drill Training

Hosan Kang, Jinseong Yang, Beom-Seok Ko et al.

ISMAR 2025 • paper
11 citations
#2698

MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition

Philippe Pasquier, Jeff Ens, Nathan Fradet et al.

AAAI 2025 • paper • arXiv:2501.17011
11 citations
#2699

FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video

Yue Gao, Hong-Xing Yu, Bo Zhu et al.

CVPR 2025 • poster • arXiv:2503.04720
11 citations
#2700

Transformer-Squared: Self-adaptive LLMs

Qi Sun, Edoardo Cetin, Yujin Tang

ICLR 2025 • poster • arXiv:2501.06252
11 citations
#2701

Skill Expansion and Composition in Parameter Space

Tenglong Liu, Jianxiong Li, Yinan Zheng et al.

ICLR 2025 • poster • arXiv:2502.05932
11 citations
#2702

Selective Visual Prompting in Vision Mamba

Yifeng Yao, Zichen Liu, Zhenyu Cui et al.

AAAI 2025 • paper • arXiv:2412.08947
11 citations
#2703

Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning

Baoqi Pei, Yifei Huang, Jilan Xu et al.

ICLR 2025 • poster • arXiv:2503.00986
11 citations
#2704

Epistemic Alignment: A Mediating Framework for User-LLM Knowledge Delivery

Nicholas Clark, Hua Shen, Bill Howe et al.

COLM 2025 • paper • arXiv:2504.01205
11 citations
#2705

BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation

Haotian Peng, Jiawei Liu, Jinsong Du et al.

AAAI 2025 • paper • arXiv:2408.11281
11 citations
#2706

SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

Muzhi Zhu, Yuzhuo Tian, Hao Chen et al.

CVPR 2025 • poster • arXiv:2503.08625
11 citations
#2707

nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark

Yanfeng Zhou, Lingrui Li, Le Lu et al.

CVPR 2025 • poster
11 citations
#2708

LaGeM: A Large Geometry Model for 3D Representation Learning and Diffusion

Biao Zhang, Peter Wonka

ICLR 2025 • poster • arXiv:2410.01295
11 citations
#2709

OmniStyle: Filtering High Quality Style Transfer Data at Scale

Ye Wang, Ruiqi Liu, Jiang Lin et al.

CVPR 2025 • poster • arXiv:2505.14028
11 citations
#2710

Towards Optimal Multi-draft Speculative Decoding

Zhengmian Hu, Tong Zheng, Vignesh Viswanathan et al.

ICLR 2025 • poster • arXiv:2502.18779
11 citations
#2711

Latent Chain-of-Thought for Visual Reasoning

Guohao Sun, Hang Hua, Jian Wang et al.

NEURIPS 2025 • poster • arXiv:2510.23925
11 citations
#2712

Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits

Zihan Zhang, Xiangyang Ji, Yuan Zhou

ICLR 2025 • poster • arXiv:2110.08057
11 citations
#2713

Geometry of Lightning Self-Attention: Identifiability and Dimension

Nathan Henry, Giovanni Luca Marchetti, Kathlén Kohn

ICLR 2025 • poster • arXiv:2408.17221
11 citations
#2714

Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning

Jiange Yang, Haoyi Zhu, Yating Wang et al.

CVPR 2025 • poster • arXiv:2411.14519
11 citations
#2715

On Linear Representations and Pretraining Data Frequency in Language Models

Jack Merullo, Noah Smith, Sarah Wiegreffe et al.

ICLR 2025 • poster • arXiv:2504.12459
11 citations
#2716

Detecting Backdoor Attacks in Federated Learning via Direction Alignment Inspection

Jiahao Xu, Zikai Zhang, Rui Hu

CVPR 2025 • highlight • arXiv:2503.07978
11 citations
#2717

OmniCount: Multi-label Object Counting with Semantic-Geometric Priors

Anindya Mondal, Sauradip Nag, Xiatian Zhu et al.

AAAI 2025 • paper • arXiv:2403.05435
11 citations
#2718

MPTSNet: Integrating Multiscale Periodic Local Patterns and Global Dependencies for Multivariate Time Series Classification

Yang Mu, Muhammad Shahzad, Xiao Xiang Zhu

AAAI 2025 • paper • arXiv:2503.05582
11 citations
#2719

NetMoE: Accelerating MoE Training through Dynamic Sample Placement

Xinyi Liu, Yujie Wang, Fangcheng Fu et al.

ICLR 2025 • poster
11 citations
#2720

Conformalized Interval Arithmetic with Symmetric Calibration

Rui Luo, Zhixin Zhou

AAAI 2025 • paper • arXiv:2408.10939
11 citations
#2721

Adversarial Machine Unlearning

Zonglin Di, Sixie Yu, Yevgeniy Vorobeychik et al.

ICLR 2025 • poster • arXiv:2406.07687
11 citations
#2722

Conformal Thresholded Intervals for Efficient Regression

Rui Luo, Zhixin Zhou

AAAI 2025 • paper • arXiv:2407.14495
11 citations
#2723

Identifying and Mitigating Position Bias of Multi-image Vision-Language Models

Xinyu Tian, Shu Zou, Zhaoyuan Yang et al.

CVPR 2025 • poster • arXiv:2503.13792
11 citations
#2724

Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad?

Antonia Wüst, Tim Woydt, Lukas Helff et al.

ICML 2025 • poster • arXiv:2410.19546
11 citations
#2725

ViSpeak: Visual Instruction Feedback in Streaming Videos

Shenghao Fu, Qize Yang, Yuan-Ming Li et al.

ICCV 2025 • poster • arXiv:2503.12769
11 citations
#2726

GaussianSpa: An “Optimizing-Sparsifying” Simplification Framework for Compact and High-Quality 3D Gaussian Splatting

Yangming Zhang, Wenqi Jia, Wei Niu et al.

CVPR 2025 • poster • arXiv:2411.06019
11 citations
#2727

Rectified Diffusion Guidance for Conditional Generation

Mengfei Xia, Nan Xue, Yujun Shen et al.

CVPR 2025 • poster • arXiv:2410.18737
11 citations
#2728

Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages

Zui Chen, Tianqiao Liu, Tongqing et al.

ICLR 2025 • poster • arXiv:2501.14002
11 citations
#2729

From Commands to Prompts: LLM-based Semantic File System for AIOS

Zeru Shi, Kai Mei, Mingyu Jin et al.

ICLR 2025 • poster • arXiv:2410.11843
11 citations
#2730

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Tianyu Fu, Yi Ge, Yichen You et al.

NEURIPS 2025 • poster • arXiv:2505.21600
11 citations
#2731

RelitLRM: Generative Relightable Radiance for Large Reconstruction Models

Tianyuan Zhang, Zhengfei Kuang, Haian Jin et al.

ICLR 2025 • poster • arXiv:2410.06231
11 citations
#2732

LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation models

Ziqi Lu, Heng Yang, Danfei Xu et al.

ICLR 2025 • poster • arXiv:2412.07746
11 citations
#2733

Envisioning Class Entity Reasoning by Large Language Models for Few-shot Learning

Mushui Liu, Fangtai Wu, Bozheng Li et al.

AAAI 2025 • paper • arXiv:2408.12469
11 citations
#2734

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

Fating Hong, Zunnan Xu, Zixiang Zhou et al.

ICCV 2025 • poster • arXiv:2504.02542
11 citations
#2735

LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content

Nimrod Shabtay, Felipe Maia Polo, Sivan Doveh et al.

ICLR 2025 • poster • arXiv:2410.10783
11 citations
#2736

DG-Mamba: Robust and Efficient Dynamic Graph Structure Learning with Selective State Space Models

Haonan Yuan, Qingyun Sun, Zhaonan Wang et al.

AAAI 2025 • paper • arXiv:2412.08160
11 citations
#2737

Proxy Denoising for Source-Free Domain Adaptation

Song Tang, Wenxin Su, Yan Gan et al.

ICLR 2025 • poster • arXiv:2406.01658
11 citations
#2738

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper

Xinyue Zhu, Binghao Huang, Yunzhu Li

NEURIPS 2025 • poster • arXiv:2507.15062
11 citations
#2739

Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts

Guorui Zheng, Xidong Wang, Juhao Liang et al.

ICLR 2025 • poster • arXiv:2410.10626
11 citations
#2740

Learning Few-Step Diffusion Models by Trajectory Distribution Matching

Yihong Luo, Tianyang Hu, Jiacheng Sun et al.

ICCV 2025 • poster • arXiv:2503.06674
11 citations
#2741

SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding

Yangliu Hu, Zikai Song, Na Feng et al.

CVPR 2025 • poster • arXiv:2504.07745
11 citations
#2742

Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models

Jinho Jeong, Sangmin Han, Jinwoo Kim et al.

CVPR 2025 • poster • arXiv:2503.18446
11 citations
#2743

Efficient Rectification of Neuro-Symbolic Reasoning Inconsistencies by Abductive Reflection

Wen-Chao Hu, Wang-Zhou Dai, Yuan Jiang et al.

AAAI 2025 • paper • arXiv:2412.08457
11 citations
#2744

Bridging the Gap for Test-Time Multimodal Sentiment Analysis

Zirun Guo, Tao Jin, Wenlong Xu et al.

AAAI 2025 • paper • arXiv:2412.07121
11 citations
#2745

Preference Optimization on Pareto Sets: On a Theory of Multi-Objective Optimization

Abhishek Roy, Geelon So, Yian Ma

NEURIPS 2025 • poster
11 citations
#2746

Reangle-A-Video: 4D Video Generation as Video-to-Video Translation

Hyeonho Jeong, Suhyeon Lee, Jong Ye

ICCV 2025 • poster • arXiv:2503.09151
11 citations
#2747

SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning

Zhewei Dai, Shilei Zeng, Haotian Liu et al.

ICCV 2025 • poster • arXiv:2410.14987
11 citations
#2748

Synthetic Video Enhances Physical Fidelity in Video Synthesis

Qi Zhao, Xingyu Ni, Ziyu Wang et al.

ICCV 2025 • poster • arXiv:2503.20822
11 citations
#2749

Battling the Non-stationarity in Time Series Forecasting via Test-time Adaptation

HyunGi Kim, Siwon Kim, Jisoo Mok et al.

AAAI 2025 • paper • arXiv:2501.04970
11 citations
#2750

RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints

Yiran Qin, Li Kang, Xiufeng Song et al.

ICCV 2025 • poster • arXiv:2503.16408
11 citations
#2751

HashAttention: Semantic Sparsity for Faster Inference

Aditya Desai, Shuo Yang, Alejandro Cuadron et al.

ICML 2025 • poster • arXiv:2412.14468
11 citations
#2752

VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers

Juncan Deng, Shuaiting Li, Zeyu Wang et al.

AAAI 2025 • paper • arXiv:2408.17131
11 citations
#2753

Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety

Zihan Guan, Mengxuan Hu, Ronghang Zhu et al.

ICML 2025 • spotlight • arXiv:2505.06843
11 citations
#2754

Knowledge Localization: Mission Not Accomplished? Enter Query Localization!

Yuheng Chen, Pengfei Cao, Yubo Chen et al.

ICLR 2025 • poster • arXiv:2405.14117
11 citations
#2755

Liger: Linearizing Large Language Models to Gated Recurrent Structures

Disen Lan, Weigao Sun, Jiaxi Hu et al.

ICML 2025 • poster • arXiv:2503.01496
11 citations
#2756

DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models

Ziyi Wu, Anil Kag, Ivan Skorokhodov et al.

NEURIPS 2025 • oral • arXiv:2506.03517
11 citations
#2757

ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding

Zhenxing Zhang, Yaxiong Wang, Lechao Cheng et al.

CVPR 2025 • poster • arXiv:2412.12718
11 citations
#2758

4DGC: Rate-Aware 4D Gaussian Compression for Efficient Streamable Free-Viewpoint Video

Qiang Hu, Zihan Zheng, Houqiang Zhong et al.

CVPR 2025 • poster • arXiv:2503.18421
11 citations
#2759

FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers

Renshan Zhang, Rui Shao, Gongwei Chen et al.

ICCV 2025 • poster • arXiv:2501.16297
11 citations
#2760

Jailbreaking as a Reward Misspecification Problem

Zhihui Xie, Jiahui Gao, Lei Li et al.

ICLR 2025 • poster • arXiv:2406.14393
11 citations
#2761

ProSec: Fortifying Code LLMs with Proactive Security Alignment

Xiangzhe Xu, Zian Su, Jinyao Guo et al.

ICML 2025 • poster • arXiv:2411.12882
11 citations
#2762

Towards Trustworthy Knowledge Graph Reasoning: An Uncertainty Aware Perspective

Bo Ni, Yu Wang, Lu Cheng et al.

AAAI 2025 • paper • arXiv:2410.08985
11 citations
#2763

RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving

Zhijian Huang, Chengjian Feng, Baihui Xiao et al.

ICCV 2025 • poster • arXiv:2412.07689
11 citations
#2764

ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting

Shaofei Cai, Zihao Wang, Kewei Lian et al.

CVPR 2025 • poster • arXiv:2410.17856
11 citations
#2765

The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?

Zhenheng Tang, Xiang Liu, Qian Wang et al.

ICLR 2025 • poster • arXiv:2502.17535
11 citations
#2766

Ultra-Sparse Memory Network

Zihao Huang, Qiyang Min, Hongzhi Huang et al.

ICLR 2025 • poster • arXiv:2411.12364
11 citations
#2767

Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models

Zheng Hu, Zhe Li, Ziyun Jiao et al.

AAAI 2025 • paper • arXiv:2412.13544
11 citations
#2768

Hidden in the Noise: Two-Stage Robust Watermarking for Images

Kasra Arabi, Benjamin Feuer, R. Teal Witter et al.

ICLR 2025 • poster • arXiv:2412.04653
11 citations
#2769

HEIE: MLLM-Based Hierarchical Explainable AIGC Image Implausibility Evaluator

Fan Yang, Ru Zhen, Jianing Wang et al.

CVPR 2025 • poster • arXiv:2411.17261
11 citations
#2770

Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs

Haowen Pan, Xiaozhi Wang, Yixin Cao et al.

ICLR 2025 • poster • arXiv:2503.01090
11 citations
#2771

Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting

Nan Wang, Lixing Xiao, Yuantao Chen et al.

NEURIPS 2025 • poster • arXiv:2506.05280
11 citations
#2772

Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding

Yue Fan, Xiaojian Ma, Rongpeng Su et al.

ICCV 2025 • highlight • arXiv:2501.00358
11 citations
#2773

STD-PLM: Understanding Both Spatial and Temporal Properties of Spatial-Temporal Data with PLM

Yiheng Huang, Xiaowei Mao, Shengnan Guo et al.

AAAI 2025 • paper • arXiv:2407.09096
11 citations
#2774

Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models

Guobin Shen, Dongcheng Zhao, Yiting Dong et al.

ICLR 2025 • poster • arXiv:2410.02298
11 citations
#2775

TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting

Jianchuan Chen, Jingchuan Hu, Gaige Wang et al.

CVPR 2025 • highlight • arXiv:2503.17032
11 citations
#2776

From Words to Structured Visuals: A Benchmark and Framework for Text-to-Diagram Generation and Editing

Jingxuan Wei, Cheng Tan, Qi Chen et al.

CVPR 2025 • highlight • arXiv:2411.11916
11 citations
#2777

Node Identifiers: Compact, Discrete Representations for Efficient Graph Learning

Yuankai Luo, Hongkang Li, Qijiong Liu et al.

ICLR 2025 • poster • arXiv:2405.16435
11 citations
#2778

Differentiable Optimization of Similarity Scores Between Models and Brains

Nathan Cloos, Moufan Li, Markus Siegel et al.

ICLR 2025 • poster • arXiv:2407.07059
11 citations
#2779

Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition

Bozheng Li, Mushui Liu, Gaoang Wang et al.

AAAI 2025 • paper • arXiv:2408.12475
11 citations
#2780

ExpertAF: Expert Actionable Feedback from Video

Kumar Ashutosh, Tushar Nagarajan, Georgios Pavlakos et al.

CVPR 2025 • poster • arXiv:2408.00672
11 citations
#2781

VladVA: Discriminative Fine-tuning of LVLMs

Yassine Ouali, Adrian Bulat, Alexandros Xenos et al.

CVPR 2025 • poster • arXiv:2412.04378
11 citations
#2782

Manifold Learning by Mixture Models of VAEs for Inverse Problems

Giovanni S. Alberti, Johannes Hertrich, Matteo Santacesaria et al.

ICLR 2025 • poster • arXiv:2303.15244
11 citations
#2783

Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts

Yun Wang, Longguang Wang, Chenghao Zhang et al.

ICCV 2025 • highlight • arXiv:2507.04631
11 citations
#2784

LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws

Prasanna Mayilvahanan, Thaddäus Wiedemer, Sayak Mallick et al.

ICML 2025 • poster • arXiv:2502.12120
11 citations
#2785

Flash-VStream: Efficient Real-Time Understanding for Long Video Streams

Haoji Zhang, Yiqin Wang, Yansong Tang et al.

ICCV 2025 • poster • arXiv:2506.23825
11 citations
#2786

Multi-domain Distribution Learning for De Novo Drug Design

Arne Schneuing, Ilia Igashov, Adrian Dobbelstein et al.

ICLR 2025 • poster • arXiv:2508.17815
11 citations
#2787

TopoCellGen: Generating Histopathology Cell Topology with a Diffusion Model

Meilong Xu, Saumya Gupta, Xiaoling Hu et al.

CVPR 2025 • poster • arXiv:2412.06011
11 citations
#2788

Residual Stream Analysis with Multi-Layer SAEs

Tim Lawson, Lucy Farnik, Conor Houghton et al.

ICLR 2025 • poster • arXiv:2409.04185
11 citations
#2789

SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation

Duc-Hai Pham, Tung Do, Phong Nguyen et al.

CVPR 2025 • poster • arXiv:2411.18229
11 citations
#2790

Visual-Instructed Degradation Diffusion for All-in-One Image Restoration

Haina Qin, Wenyang Luo, Zewen Chen et al.

CVPR 2025 • poster • arXiv:2506.16960
11 citations
#2791

Does Your Vision-Language Model Get Lost in the Long Video Sampling Dilemma?

Tianyuan Qu, Longxiang Tang, Bohao Peng et al.

ICCV 2025 • poster • arXiv:2503.12496
11 citations
#2792

Revisiting In-context Learning Inference Circuit in Large Language Models

Hakaze Cho, Mariko Kato, Yoshihiro Sakai et al.

ICLR 2025 • poster • arXiv:2410.04468
11 citations
#2793

MoEE: Mixture of Emotion Experts for Audio-Driven Portrait Animation

Huaize Liu, WenZhang Sun, Donglin Di et al.

CVPR 2025 • poster • arXiv:2501.01808
11 citations
#2794

Few for Many: Tchebycheff Set Scalarization for Many-Objective Optimization

Xi Lin, Yilu Liu, Xiaoyuan Zhang et al.

ICLR 2025 • poster • arXiv:2405.19650
11 citations
#2795

SpatialCLIP: Learning 3D-aware Image Representations from Spatially Discriminative Language

Zehan Wang, Sashuai Zhou, Shaoxuan He et al.

CVPR 2025 • poster
11 citations
#2796

Deep Learning Alternatives Of The Kolmogorov Superposition Theorem

Leonardo Ferreira Guilhoto, Paris Perdikaris

ICLR 2025 • poster • arXiv:2410.01990
11 citations
#2797

Semantic and Sequential Alignment for Referring Video Object Segmentation

Feiyu Pan, Hao Fang, Fangkai Li et al.

CVPR 2025 • poster
11 citations
#2798

Rethinking Invariance in In-context Learning

Lizhe Fang, Yifei Wang, Khashayar Gatmiry et al.

ICLR 2025 • poster • arXiv:2505.04994
11 citations
#2799

Context Steering: Controllable Personalization at Inference Time

Zhiyang He, Sashrika Pandey, Mariah Schrum et al.

ICLR 2025 • poster • arXiv:2405.01768
11 citations
#2800

CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization

Feize Wu, Yun Pang, Junyi Zhang et al.

AAAI 2025 • paper • arXiv:2408.15914
11 citations