Most Cited 2025 "geneval benchmark" Papers

22,274 papers found • Page 13 of 112

#2401

From Commands to Prompts: LLM-based Semantic File System for AIOS

Zeru Shi, Kai Mei, Mingyu Jin et al.

ICLR 2025posterarXiv:2410.11843
11
citations
#2402

Adapter Merging with Centroid Prototype Mapping for Scalable Class-Incremental Learning

Takuma Fukuda, Hiroshi Kera, Kazuhiko Kawamoto

CVPR 2025posterarXiv:2412.18219
11
citations
#2403

What's the Move? Hybrid Imitation Learning via Salient Points

Priya Sundaresan, Hengyuan Hu, Quan Vuong et al.

ICLR 2025posterarXiv:2412.05426
11
citations
#2404

Instant Adversarial Purification with Adversarial Consistency Distillation

Chun Tong Lei, Hon Ming Yam, Zhongliang Guo et al.

CVPR 2025posterarXiv:2408.17064
11
citations
#2405

Understanding Virtual Nodes: Oversquashing and Node Heterogeneity

Joshua Southern, Francesco Di Giovanni, Michael Bronstein et al.

ICLR 2025posterarXiv:2405.13526
11
citations
#2406

ExpertAF: Expert Actionable Feedback from Video

Kumar Ashutosh, Tushar Nagarajan, Georgios Pavlakos et al.

CVPR 2025posterarXiv:2408.00672
11
citations
#2407

DifIISR: A Diffusion Model with Gradient Guidance for Infrared Image Super-Resolution

Xingyuan Li, Zirui Wang, Yang Zou et al.

CVPR 2025posterarXiv:2503.01187
11
citations
#2408

RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction

Peng Liu, Dongyang Dai, Zhiyong Wu

ICLR 2025posterarXiv:2403.05010
11
citations
#2409

Advancing Spiking Neural Networks Towards Multiscale Spatiotemporal Interaction Learning

Yimeng Shan, Malu Zhang, Rui-jie Zhu et al.

AAAI 2025paperarXiv:2405.13672
11
citations
#2410

Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety

Zihan Guan, Mengxuan Hu, Ronghang Zhu et al.

ICML 2025spotlightarXiv:2505.06843
11
citations
#2411

From Words to Worth: Newborn Article Impact Prediction with LLM

Penghai Zhao, Qinghua Xing, Kairan Dou et al.

AAAI 2025paperarXiv:2408.03934
11
citations
#2412

Hyperbolic Fine-Tuning for Large Language Models

Menglin Yang, Ram Samarth B B, Aosong Feng et al.

NEURIPS 2025spotlightarXiv:2410.04010
11
citations
#2413

Understanding Emotional Body Expressions via Large Language Models

Haifeng Lu, Jiuyi Chen, Feng Liang et al.

AAAI 2025paperarXiv:2412.12581
11
citations
#2414

PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation

Chen Wang, Chuhao Chen, Yiming Huang et al.

NEURIPS 2025oralarXiv:2509.20358
11
citations
#2415

MoEE: Mixture of Emotion Experts for Audio-Driven Portrait Animation

Huaize Liu, WenZhang Sun, Donglin Di et al.

CVPR 2025posterarXiv:2501.01808
11
citations
#2416

Liger: Linearizing Large Language Models to Gated Recurrent Structures

Disen Lan, Weigao Sun, Jiaxi Hu et al.

ICML 2025posterarXiv:2503.01496
11
citations
#2417

Preference Optimization on Pareto Sets: On a Theory of Multi-Objective Optimization

Abhishek Roy, Geelon So, Yian Ma

NEURIPS 2025poster
11
citations
#2418

Integrated Augmented and Virtual Reality Technologies for Realistic Fire Drill Training

Hosan Kang, Jinseong Yang, Beom-Seok Ko et al.

ISMAR 2025paper
11
citations
#2419

SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

Muzhi Zhu, Yuzhuo Tian, Hao Chen et al.

CVPR 2025posterarXiv:2503.08625
11
citations
#2420

LaGeM: A Large Geometry Model for 3D Representation Learning and Diffusion

Biao Zhang, Peter Wonka

ICLR 2025posterarXiv:2410.01295
11
citations
#2421

Hidden in the Noise: Two-Stage Robust Watermarking for Images

Kasra Arabi, Benjamin Feuer, R. Teal Witter et al.

ICLR 2025posterarXiv:2412.04653
11
citations
#2422

Adversarial Machine Unlearning

Zonglin Di, Sixie Yu, Yevgeniy Vorobeychik et al.

ICLR 2025posterarXiv:2406.07687
11
citations
#2423

4DGC: Rate-Aware 4D Gaussian Compression for Efficient Streamable Free-Viewpoint Video

Qiang Hu, Zihan Zheng, Houqiang Zhong et al.

CVPR 2025posterarXiv:2503.18421
11
citations
#2424

Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad?

Antonia Wüst, Tim Woydt, Lukas Helff et al.

ICML 2025posterarXiv:2410.19546
11
citations
#2425

Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting

Nan Wang, Lixing Xiao, Yuantao Chen et al.

NEURIPS 2025posterarXiv:2506.05280
11
citations
#2426

nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark

Yanfeng Zhou, Lingrui Li, Le Lu et al.

CVPR 2025poster
11
citations
#2427

Ultra-Sparse Memory Network

Zihao Huang, Qiyang Min, Hongzhi Huang et al.

ICLR 2025posterarXiv:2411.12364
11
citations
#2428

Semantic and Sequential Alignment for Referring Video Object Segmentation

Feiyu Pan, Hao Fang, Fangkai Li et al.

CVPR 2025poster
11
citations
#2429

LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws

Prasanna Mayilvahanan, Thaddäus Wiedemer, Sayak Mallick et al.

ICML 2025posterarXiv:2502.12120
11
citations
#2430

HashAttention: Semantic Sparsity for Faster Inference

Aditya Desai, Shuo Yang, Alejandro Cuadron et al.

ICML 2025posterarXiv:2412.14468
11
citations
#2431

SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding

Yangliu Hu, Zikai Song, Na Feng et al.

CVPR 2025posterarXiv:2504.07745
11
citations
#2432

Rethinking Invariance in In-context Learning

Lizhe Fang, Yifei Wang, Khashayar Gatmiry et al.

ICLR 2025posterarXiv:2505.04994
11
citations
#2433

DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models

Ziyi Wu, Anil Kag, Ivan Skorokhodov et al.

NEURIPS 2025oralarXiv:2506.03517
11
citations
#2434

SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks

Hwiwon Lee, Ziqi Zhang, Hanxiao Lu et al.

NEURIPS 2025posterarXiv:2506.11791
11
citations
#2435

SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP

Yusuke Hirota, Min-Hung Chen, Chien-Yi Wang et al.

ICLR 2025posterarXiv:2408.10202
11
citations
#2436

3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding

Tatiana Zemskova, Dmitry Yudin

ICCV 2025posterarXiv:2412.18450
11
citations
#2437

RecFlow: An Industrial Full Flow Recommendation Dataset

Qi Liu, Kai Zheng, Rui Huang et al.

ICLR 2025posterarXiv:2410.20868
11
citations
#2438

HEIE: MLLM-Based Hierarchical Explainable AIGC Image Implausibility Evaluator

Fan Yang, Ru Zhen, Jianing Wang et al.

CVPR 2025posterarXiv:2411.17261
11
citations
#2439

Consistent and Controllable Image Animation with Motion Diffusion Models

Xin Ma, Yaohui Wang, Gengyun Jia et al.

CVPR 2025posterarXiv:2407.15642
11
citations
#2440

TopoCellGen: Generating Histopathology Cell Topology with a Diffusion Model

Meilong Xu, Saumya Gupta, Xiaoling Hu et al.

CVPR 2025posterarXiv:2412.06011
11
citations
#2441

Interaction Asymmetry: A General Principle for Learning Composable Abstractions

Jack Brady, Julius von Kügelgen, Sebastien Lachapelle et al.

ICLR 2025posterarXiv:2411.07784
11
citations
#2442

MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility

Wayne Wu, Honglin He, Jack He et al.

ICLR 2025posterarXiv:2407.08725
11
citations
#2443

On Conformal Isometry of Grid Cells: Learning Distance-Preserving Position Embedding

Dehong Xu, Ruiqi Gao, Wenhao Zhang et al.

ICLR 2025posterarXiv:2405.16865
11
citations
#2444

Multi-step Visual Reasoning with Visual Tokens Scaling and Verification

Tianyi Bai, Zengjie Hu, Fupeng Sun et al.

NEURIPS 2025posterarXiv:2506.07235
11
citations
#2445

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Minki Kang, Jongwon Jeong, Seanie Lee et al.

NEURIPS 2025spotlightarXiv:2505.17612
11
citations
#2446

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper

Xinyue Zhu, Binghao Huang, Yunzhu Li

NEURIPS 2025posterarXiv:2507.15062
11
citations
#2447

The Computational Complexity of Circuit Discovery for Inner Interpretability

Federico Adolfi, Martina G. Vilas, Todd Wareham

ICLR 2025posterarXiv:2410.08025
11
citations
#2448

Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection

Yue Zhou, Xinan He, Kaiqing Lin et al.

NEURIPS 2025posterarXiv:2506.00874
11
citations
#2449

RaCFormer: Towards High-Quality 3D Object Detection via Query-based Radar-Camera Fusion

Xiaomeng Chu, Jiajun Deng, Guoliang You et al.

CVPR 2025posterarXiv:2412.12725
11
citations
#2450

BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization

Xueyang Zhou, Guiyao Tie, Guowen Zhang et al.

NEURIPS 2025posterarXiv:2505.16640
11
citations
#2451

Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning

Chaofan Lin, Jiaming Tang, Shuo Yang et al.

NEURIPS 2025spotlightarXiv:2502.02770
11
citations
#2452

HiP-AD: Hierarchical and Multi-Granularity Planning with Deformable Attention for Autonomous Driving in a Single Decoder

Yingqi Tang, Zhuoran Xu, Zhaotie Meng et al.

ICCV 2025posterarXiv:2503.08612
11
citations
#2453

Latent Thought Models with Variational Bayes Inference-Time Computation

Deqian Kong, Minglu Zhao, Dehong Xu et al.

ICML 2025posterarXiv:2502.01567
11
citations
#2454

CoRe: Benchmarking LLMs’ Code Reasoning Capabilities through Static Analysis Tasks

Danning Xie, Mingwei Zheng, Xuwei Liu et al.

NEURIPS 2025spotlightarXiv:2507.05269
11
citations
#2455

EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis

Sheng Miao, Jiaxin Huang, Dongfeng Bai et al.

CVPR 2025posterarXiv:2503.20168
11
citations
#2456

LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

Enis Simsar, Thomas Hofmann, Federico Tombari et al.

CVPR 2025posterarXiv:2412.09622
11
citations
#2457

RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints

Yiran Qin, Li Kang, Xiufeng Song et al.

ICCV 2025posterarXiv:2503.16408
11
citations
#2458

Synthetic Video Enhances Physical Fidelity in Video Synthesis

Qi Zhao, Xingyu Ni, Ziyu Wang et al.

ICCV 2025posterarXiv:2503.20822
11
citations
#2459

NoT: Federated Unlearning via Weight Negation

Yasser Khalil, Leo Maxime Brunswic, Soufiane Lamghari et al.

CVPR 2025posterarXiv:2503.05657
11
citations
#2460

Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs

Haowen Pan, Xiaozhi Wang, Yixin Cao et al.

ICLR 2025posterarXiv:2503.01090
11
citations
#2461

Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)

Liwei Jiang, Yuanjun Chai, Margaret Li et al.

NEURIPS 2025oralarXiv:2510.22954
11
citations
#2462

Learning the RoPEs: Better 2D and 3D Position Encodings with STRING

Connor Schenck, Isaac Reid, Mithun Jacob et al.

ICML 2025spotlightarXiv:2502.02562
11
citations
#2463

Large language models can learn and generalize steganographic chain-of-thought under process supervision

ROBERT MC CARTHY, Joey SKAF, Luis Ibanez-Lissen et al.

NEURIPS 2025posterarXiv:2506.01926
11
citations
#2464

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Alexander Nikulin, Ilya Zisman, Alexey Zemtsov et al.

ICLR 2025posterarXiv:2406.08973
11
citations
#2465

Federated Learning with Sample-level Client Drift Mitigation

Haoran Xu, Jiaze Li, Wanyi Wu et al.

AAAI 2025paperarXiv:2501.11360
11
citations
#2466

Enhancing Trustworthiness of Graph Neural Networks with Rank-Based Conformal Training

Ting Wang, Zhixin Zhou, Rui Luo

AAAI 2025paperarXiv:2501.02767
11
citations
#2467

KnowPO: Knowledge-Aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models

Ruizhe Zhang, Yongxin Xu, Yuzhen Xiao et al.

AAAI 2025paperarXiv:2408.03297
11
citations
#2468

Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models

Zeman Li, Xinwei Zhang, Peilin Zhong et al.

ICLR 2025posterarXiv:2410.06441
11
citations
#2469

Multi-view Reconstruction via SfM-guided Monocular Depth Estimation

Haoyu Guo, He Zhu, Sida Peng et al.

CVPR 2025posterarXiv:2503.14483
11
citations
#2470

Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts

Guorui Zheng, Xidong Wang, Juhao Liang et al.

ICLR 2025posterarXiv:2410.10626
11
citations
#2471

Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding

Yue Fan, Xiaojian Ma, Rongpeng Su et al.

ICCV 2025highlightarXiv:2501.00358
11
citations
#2472

FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video

Yue Gao, Hong-Xing Yu, Bo Zhu et al.

CVPR 2025posterarXiv:2503.04720
11
citations
#2473

MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios

Jiacheng Ruan, Wenzhen Yuan, Zehao Lin et al.

AAAI 2025paperarXiv:2409.16084
11
citations
#2474

Rectified Diffusion Guidance for Conditional Generation

Mengfei Xia, Nan Xue, Yujun Shen et al.

CVPR 2025posterarXiv:2410.18737
11
citations
#2475

DELTA: Pre-Train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment

Haitao Li, Qingyao Ai, Xinyan Han et al.

AAAI 2025paperarXiv:2403.18435
11
citations
#2476

Effective Training Data Synthesis for Improving MLLM Chart Understanding

Yuwei Yang, Zeyu Zhang, Yunzhong Hou et al.

ICCV 2025posterarXiv:2508.06492
11
citations
#2477

Planning in the Dark: LLM-Symbolic Planning Pipeline Without Experts

Sukai Huang, Nir Lipovetzky, Trevor Cohn

AAAI 2025paperarXiv:2409.15915
11
citations
#2478

SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning

Zhewei Dai, Shilei Zeng, Haotian Liu et al.

ICCV 2025posterarXiv:2410.14987
11
citations
#2479

Proxy Denoising for Source-Free Domain Adaptation

Song Tang, Wenxin Su, Yan Gan et al.

ICLR 2025posterarXiv:2406.01658
11
citations
#2480

Detecting Backdoor Attacks in Federated Learning via Direction Alignment Inspection

Jiahao Xu, Zikai Zhang, Rui Hu

CVPR 2025highlightarXiv:2503.07978
11
citations
#2481

BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations

Weixi Feng, Chao Liu, Sifei Liu et al.

CVPR 2025posterarXiv:2501.07647
11
citations
#2482

Node Identifiers: Compact, Discrete Representations for Efficient Graph Learning

Yuankai Luo, Hongkang Li, Qijiong Liu et al.

ICLR 2025posterarXiv:2405.16435
11
citations
#2483

FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers

Renshan Zhang, Rui Shao, Gongwei Chen et al.

ICCV 2025posterarXiv:2501.16297
11
citations
#2484

VladVA: Discriminative Fine-tuning of LVLMs

Yassine Ouali, Adrian Bulat, ALEXANDROS XENOS et al.

CVPR 2025posterarXiv:2412.04378
11
citations
#2485

Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts

Yun Wang, Longguang Wang, Chenghao Zhang et al.

ICCV 2025highlightarXiv:2507.04631
11
citations
#2486

Residual Stream Analysis with Multi-Layer SAEs

Tim Lawson, Lucy Farnik, Conor Houghton et al.

ICLR 2025posterarXiv:2409.04185
11
citations
#2487

Does Your Vision-Language Model Get Lost in the Long Video Sampling Dilemma?

Tianyuan Qu, Longxiang Tang, Bohao PENG et al.

ICCV 2025posterarXiv:2503.12496
11
citations
#2488

Toward Real-world BEV Perception: Depth Uncertainty Estimation via Gaussian Splatting

Shu-Wei Lu, Yi-Hsuan Tsai, Yi-Ting Chen

CVPR 2025posterarXiv:2504.01957
11
citations
#2489

TopoNets: High performing vision and language models with brain-like topography

Mayukh Deb, Mainak Deb, Apurva Murty

ICLR 2025posterarXiv:2501.16396
11
citations
#2490

KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning

Wei Sun, Wen Yang, Pu Jian et al.

NEURIPS 2025posterarXiv:2505.16826
11
citations
#2491

UniMLVG: Unified Framework for Multi-view Long Video Generation with Comprehensive Control Capabilities for Autonomous Driving

Rui Chen, Zehuan Wu, Yichen Liu et al.

ICCV 2025posterarXiv:2412.04842
11
citations
#2492

Adaptive Prompting for Continual Relation Extraction: A Within-Task Variance Perspective

Minh Le, Tien Ngoc Luu, An Nguyen The et al.

AAAI 2025paperarXiv:2412.08285
11
citations
#2493

LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized Knowledge in Multimodal Large Language Models

Jian Liang, Wenke Huang, Guancheng Wan et al.

CVPR 2025posterarXiv:2503.16843
11
citations
#2494

SLIP: Spoof-Aware One-Class Face Anti-Spoofing with Language Image Pretraining

Pei-Kai Huang, Jun-Xiong Chong, Cheng-Hsuan Chiang et al.

AAAI 2025paperarXiv:2503.19982
11
citations
#2495

Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages

Zui Chen, Tianqiao Liu, Tongqing et al.

ICLR 2025posterarXiv:2501.14002
11
citations
#2496

LBM: Latent Bridge Matching for Fast Image-to-Image Translation

Clément Chadebec, Onur Tasar, Sanjeev Sreetharan et al.

ICCV 2025highlightarXiv:2503.07535
11
citations
#2497

LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation models

Ziqi Lu, Heng Yang, Danfei Xu et al.

ICLR 2025posterarXiv:2412.07746
11
citations
#2498

SpiritSight Agent: Advanced GUI Agent with One Look

Zhiyuan Huang, Ziming Cheng, Junting Pan et al.

CVPR 2025posterarXiv:2503.03196
11
citations
#2499

Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models

Guobin Shen, Dongcheng Zhao, Yiting Dong et al.

ICLR 2025posterarXiv:2410.02298
11
citations
#2500

One-for-More: Continual Diffusion Model for Anomaly Detection

Xiaofan Li, Xin Tan, Zhuo Chen et al.

CVPR 2025posterarXiv:2502.19848
11
citations
#2501

VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning

Qingtao Liu, Yu Cui, Zhengnan Sun et al.

ICLR 2025poster
11
citations
#2502

MPTSNet: Integrating Multiscale Periodic Local Patterns and Global Dependencies for Multivariate Time Series Classification

Yang Mu, Muhammad Shahzad, Xiao Xiang Zhu

AAAI 2025paperarXiv:2503.05582
11
citations
#2503

This Time is Different: An Observability Perspective on Time Series Foundation Models

Ben Cohen, Emaad Khwaja, Youssef Doubli et al.

NEURIPS 2025posterarXiv:2505.14766
11
citations
#2504

MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition

Philippe Pasquier, Jeff Ens, Nathan Fradet et al.

AAAI 2025paperarXiv:2501.17011
11
citations
#2505

GaussianSpa: An “Optimizing-Sparsifying” Simplification Framework for Compact and High-Quality 3D Gaussian Splatting

Yangming Zhang, Wenqi Jia, Wei Niu et al.

CVPR 2025posterarXiv:2411.06019
11
citations
#2506

NetMoE: Accelerating MoE Training through Dynamic Sample Placement

Xinyi Liu, Yujie Wang, Fangcheng Fu et al.

ICLR 2025poster
11
citations
#2507

OmniCount: Multi-label Object Counting with Semantic-Geometric Priors

Anindya Mondal, Sauradip Nag, Xiatian Zhu et al.

AAAI 2025paperarXiv:2403.05435
11
citations
#2508

DropoutGS: Dropping Out Gaussians for Better Sparse-view Rendering

Yexing Xu, Longguang Wang, Minglin Chen et al.

CVPR 2025posterarXiv:2504.09491
11
citations
#2509

BlockDance: Reuse Structurally Similar Spatio-Temporal Features to Accelerate Diffusion Transformers

Hui Zhang, Tingwei Gao, Jie Shao et al.

CVPR 2025posterarXiv:2503.15927
11
citations
#2510

DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response

Junjue Wang, Weihao Xuan, Heli Qi et al.

NEURIPS 2025oralarXiv:2505.21089
11
citations
#2511

Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition

Bozheng Li, Mushui Liu, Gaoang Wang et al.

AAAI 2025paperarXiv:2408.12475
11
citations
#2512

Efficient Rectification of Neuro-Symbolic Reasoning Inconsistencies by Abductive Reflection

Wen-Chao Hu, Wang-Zhou Dai, Yuan Jiang et al.

AAAI 2025paperarXiv:2412.08457
11
citations
#2513

On Linear Representations and Pretraining Data Frequency in Language Models

Jack Merullo, Noah Smith, Sarah Wiegreffe et al.

ICLR 2025posterarXiv:2504.12459
11
citations
#2514

Bridging the Gap for Test-Time Multimodal Sentiment Analysis

Zirun Guo, Tao Jin, Wenlong Xu et al.

AAAI 2025paperarXiv:2412.07121
11
citations
#2515

Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling

Yitian Chen, Jingfan Xia, Siyu Shao et al.

NEURIPS 2025posterarXiv:2505.11792
11
citations
#2516

Envisioning Class Entity Reasoning by Large Language Models for Few-shot Learning

Mushui Liu, Fangtai Wu, Bozheng Li et al.

AAAI 2025paperarXiv:2408.12469
11
citations
#2517

Data Synthesis with Diverse Styles for Face Recognition via 3DMM-Guided Diffusion

Yuxi Mi, Zhizhou Zhong, Yuge Huang et al.

CVPR 2025posterarXiv:2504.00430
11
citations
#2518

Causal Composition Diffusion Model for Closed-loop Traffic Generation

Haohong Lin, Xin Huang, Tung Phan-Minh et al.

CVPR 2025posterarXiv:2412.17920
11
citations
#2519

Battling the Non-stationarity in Time Series Forecasting via Test-time Adaptation

HyunGi Kim, Siwon Kim, Jisoo Mok et al.

AAAI 2025paperarXiv:2501.04970
11
citations
#2520

VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers

Juncan Deng, Shuaiting Li, Zeyu Wang et al.

AAAI 2025paperarXiv:2408.17131
11
citations
#2521

Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance

Dimitris Oikonomou, Nicolas Loizou

ICLR 2025posterarXiv:2406.04142
11
citations
#2522

GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution

Fengxiang Wang, Mingshuo Chen, Yueying Li et al.

NEURIPS 2025spotlightarXiv:2505.21375
11
citations
#2523

VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step

Hanyang Wang, Fangfu Liu, Jiawei Chi et al.

CVPR 2025highlightarXiv:2504.01956
11
citations
#2524

Conformal Thresholded Intervals for Efficient Regression

Rui Luo, Zhixin Zhou

AAAI 2025paperarXiv:2407.14495
11
citations
#2525

DG-Mamba: Robust and Efficient Dynamic Graph Structure Learning with Selective State Space Models

Haonan Yuan, Qingyun Sun, Zhaonan Wang et al.

AAAI 2025paperarXiv:2412.08160
11
citations
#2526

Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models

Jinho Jeong, Sangmin Han, Jinwoo Kim et al.

CVPR 2025posterarXiv:2503.18446
11
citations
#2527

Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach

Yuchen Liang, Peizhong Ju, Yingbin Liang et al.

ICLR 2025posterarXiv:2402.13901
11
citations
#2528

TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation

Gihyun Kwon, Jong Chul YE

ICLR 2025posterarXiv:2410.05591
11
citations
#2529

Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs

Yingji Zhong, Zhihao Li, Dave Zhenyu Chen et al.

CVPR 2025highlightarXiv:2503.05082
11
citations
#2530

STD-PLM: Understanding Both Spatial and Temporal Properties of Spatial-Temporal Data with PLM

Yiheng Huang, Xiaowei Mao, Shengnan Guo et al.

AAAI 2025paperarXiv:2407.09096
11
citations
#2531

FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning

Gaojian Wang, Feng Lin, Tong Wu et al.

CVPR 2025posterarXiv:2412.12032
11
citations
#2532

Audio-Visual Instance Segmentation

Ruohao Guo, Xianghua Ying, Yaru Chen et al.

CVPR 2025posterarXiv:2310.18709
11
citations
#2533

Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaptation

Anqi Li, Feng Li, Yuxi Liu et al.

ICLR 2025posterarXiv:2406.00758
11
citations
#2534

Transformer-Squared: Self-adaptive LLMs

Qi Sun, Edoardo Cetin, Yujin Tang

ICLR 2025posterarXiv:2501.06252
11
citations
#2535

Lifting Motion to the 3D World via 2D Diffusion

Jiaman Li, Karen Liu, Jiajun Wu

CVPR 2025highlightarXiv:2411.18808
11
citations
#2536

OmniStyle: Filtering High Quality Style Transfer Data at Scale

Ye Wang, Ruiqi Liu, Jiang Lin et al.

CVPR 2025posterarXiv:2505.14028
11
citations
#2537

Context Steering: Controllable Personalization at Inference Time

Zhiyang He, Sashrika Pandey, Mariah Schrum et al.

ICLR 2025posterarXiv:2405.01768
11
citations
#2538

Differentiable Optimization of Similarity Scores Between Models and Brains

Nathan Cloos, Moufan Li, Markus Siegel et al.

ICLR 2025posterarXiv:2407.07059
11
citations
#2539

CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance

Yongchao Chen, Yilun Hao, Yueying Liu et al.

ICML 2025posterarXiv:2502.04350
11
citations
#2540

Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models

Zheng Hu, Zhe Li, Ziyun Jiao et al.

AAAI 2025paperarXiv:2412.13544
11
citations
#2541

Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning

Jiange Yang, Haoyi Zhu, Yating Wang et al.

CVPR 2025posterarXiv:2411.14519
11
citations
#2542

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

Mohan Xu, Kai Li, Guo Chen et al.

ICLR 2025oralarXiv:2410.01469
11
citations
#2543

RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving

Zhijian Huang, Chengjian Feng, Baihui Xiao et al.

ICCV 2025posterarXiv:2412.07689
11
citations
#2544

WildSAT: Learning Satellite Image Representations from Wildlife Observations

Rangel Daroya, Elijah Cole, Oisin Mac Aodha et al.

ICCV 2025posterarXiv:2412.14428
10
citations
#2545

SuperDec: 3D Scene Decomposition with Superquadrics Primitives

Elisabetta Fedele, Boyang Sun, Francis Engelmann et al.

ICCV 2025posterarXiv:2504.00992
10
citations
#2546

General Scene Adaptation for Vision-and-Language Navigation

Haodong Hong, Yanyuan Qiao, Sen Wang et al.

ICLR 2025posterarXiv:2501.17403
10
citations
#2547

CADDreamer: CAD Object Generation from Single-view Images

Yuan Li, Cheng Lin, Yuan Liu et al.

CVPR 2025highlightarXiv:2502.20732
10
citations
#2548

Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation

Ling-An Zeng, Guohong Huang, Gaojie Wu et al.

AAAI 2025paperarXiv:2412.11193
10
citations
#2549

Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds

Zhiyong Wang, Dongruo Zhou, John C.S. Lui et al.

ICLR 2025posterarXiv:2408.08994
10
citations
#2550

Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data

Seiji Maekawa, Hayate Iso, Nikita Bhutani

ICLR 2025posterarXiv:2410.11996
10
citations
#2551

Flexible Frame Selection for Efficient Video Reasoning

Shyamal Buch, Arsha Nagrani, Anurag Arnab et al.

CVPR 2025poster
10
citations
#2552

DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding

Wenhui Liao, Jiapeng Wang, Hongliang Li et al.

CVPR 2025posterarXiv:2408.15045
10
citations
#2553

FedMIA: An Effective Membership Inference Attack Exploiting "All for One" Principle in Federated Learning

Gongxi Zhu, Donghao Li, Hanlin Gu et al.

CVPR 2025poster
10
citations
#2554

Sports-Traj: A Unified Trajectory Generation Model for Multi-Agent Movement in Sports

Yi Xu, Yun Fu

ICLR 2025oralarXiv:2405.17680
10
citations
#2555

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Tsung-Han (Patrick) Wu, Heekyung Lee, Jiaxin Ge et al.

NEURIPS 2025posterarXiv:2504.13169
10
citations
#2556

Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model

Keda TAO, Jinjin Gu, Yulun Zhang et al.

ICLR 2025posterarXiv:2410.04161
10
citations
#2557

Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy

Yangsibo Huang, Daogao Liu, Lynn Chua et al.

ICLR 2025posterarXiv:2410.09591
10
citations
#2558

Backdoor Attacks Against No-Reference Image Quality Assessment Models via a Scalable Trigger

Yi Yu, Song Xia, Xun Lin et al.

AAAI 2025paperarXiv:2412.07277
10
citations
#2559

Training-free LLM-generated Text Detection by Mining Token Probability Sequences

Yihuai Xu, Yongwei Wang, YIFEI BI et al.

ICLR 2025oralarXiv:2410.06072
10
citations
#2560

Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise

Brayan Monroy, Jorge Bacca, Julián Tachella

CVPR 2025posterarXiv:2412.04648
10
citations
#2561

SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes

Tony Alex, Sara Atito, Armin Mustafa et al.

ICLR 2025posterarXiv:2506.12222
10
citations
#2562

Reconstructing People, Places, and Cameras

Lea Müller, Hongsuk Choi, Anthony Zhang et al.

CVPR 2025highlightarXiv:2412.17806
10
citations
#2563

Taylor Series-Inspired Local Structure Fitting Network for Few-shot Point Cloud Semantic Segmentation

Changshuo Wang, Shuting He, Xiang Fang et al.

AAAI 2025paperarXiv:2504.02454
10
citations
#2564

Efficient Attention-Sharing Information Distillation Transformer for Lightweight Single Image Super-Resolution

Karam Park, Jae Woong Soh, Nam Ik Cho

AAAI 2025paperarXiv:2501.15774
10
citations
#2565

Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach

Haotian Ju, Hongyang Zhang, Dongyue Li

ICLR 2025posterarXiv:2306.08553
10
citations
#2566

InsightEdit: Towards Better Instruction Following for Image Editing

Yingjing Xu, Jie Kong, Jiazhi Wang et al.

CVPR 2025posterarXiv:2411.17323
10
citations
#2567

ReCap: Better Gaussian Relighting with Cross-Environment Captures

Jingzhi Li, Zongwei Wu, Eduard Zamfir et al.

CVPR 2025posterarXiv:2412.07534
10
citations
#2568

On the Crucial Role of Initialization for Matrix Factorization

Bingcong Li, Liang Zhang, Aryan Mokhtari et al.

ICLR 2025posterarXiv:2410.18965
10
citations
#2569

Gazing Into Missteps: Leveraging Eye-Gaze for Unsupervised Mistake Detection in Egocentric Videos of Skilled Human Activities

Michele Mazzamuto, Antonino Furnari, Yoichi Sato et al.

CVPR 2025posterarXiv:2406.08379
10
citations
#2570

Enhancing Multilingual LLM Pretraining with Model-Based Data Selection

Bettina Messmer, Vinko Sabolčec, Martin Jaggi

NEURIPS 2025posterarXiv:2502.10361
10
citations
#2571

CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models

Hao He, Ceyuan Yang, Shanchuan Lin et al.

ICCV 2025posterarXiv:2503.10592
10
citations
#2572

Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search

Shuyu Yang, Yaxiong Wang, Li Zhu et al.

ICCV 2025highlightarXiv:2411.17776
10
citations
#2573

EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model

Shengqi Dang, Yi He, Long Ling et al.

ICCV 2025posterarXiv:2501.05710
10
citations
#2574

ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding

Zhenxing Zhang, Yaxiong Wang, Lechao Cheng et al.

CVPR 2025posterarXiv:2412.12718
10
citations
#2575

MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-text Decoding

Weikang Qiu, Zheng Huang, Haoyu Hu et al.

ICML 2025posterarXiv:2502.15786
10
citations
#2576

Fast training and sampling of Restricted Boltzmann Machines

Nicolas BEREUX, Aurélien Decelle, Cyril Furtlehner et al.

ICLR 2025posterarXiv:2405.15376
10
citations
#2577

Open-World Reinforcement Learning over Long Short-Term Imagination

Jiajian Li, Qi Wang, Yunbo Wang et al.

ICLR 2025posterarXiv:2410.03618
10
citations
#2578

Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model

Zhaochong An, Guolei Sun, Yun Liu et al.

CVPR 2025posterarXiv:2503.16282
10
citations
#2579

Attention layers provably solve single-location regression

Pierre Marion, Raphaël Berthier, Gérard Biau et al.

ICLR 2025posterarXiv:2410.01537
10
citations
#2580

Nautilus: Locality-aware Autoencoder for Scalable Mesh Generation

Yuxuan Wang, Xuanyu Yi, Haohan Weng et al.

ICCV 2025posterarXiv:2501.14317
10
citations
#2581

Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling

Hanyang Kong, Xingyi Yang, Xinchao Wang

AAAI 2025paperarXiv:2502.20378
10
citations
#2582

More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness

Aaron J. Li, Satyapriya Krishna, Hima Lakkaraju

ICLR 2025posterarXiv:2404.18870
10
citations
#2583

Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment

Johannes Schusterbauer, Ming Gui, Frank Fundel et al.

CVPR 2025posterarXiv:2506.02221
10
citations
#2584

Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems

Jie Chen

ICLR 2025posterarXiv:2406.00809
10
citations
#2585

From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit

Valérie Costa, Thomas Fel, Ekdeep S Lubana et al.

NEURIPS 2025posterarXiv:2506.03093
10
citations
#2586

MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild

Xi Fang, Jiankun Wang, Xiaochen Cai et al.

ICCV 2025posterarXiv:2411.11098
10
citations
#2587

Lightweight Neural App Control

Filippos Christianos, Georgios Papoudakis, Thomas Coste et al.

ICLR 2025posterarXiv:2410.17883
10
citations
#2588

The 3D-PC: a benchmark for visual perspective taking in humans and machines

Drew Linsley, Peisen Zhou, Alekh Ashok et al.

ICLR 2025posterarXiv:2406.04138
10
citations
#2589

MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL

Claas Voelcker, Marcel Hussing, ERIC EATON et al.

ICLR 2025posterarXiv:2410.08896
10
citations
#2590

OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

Yiren Song, Cheng Liu, Mike Zheng Shou

NEURIPS 2025posterarXiv:2505.18445
10
citations
#2591

AlphaPO: Reward Shape Matters for LLM Alignment

Aman Gupta, Shao Tang, Qingquan Song et al.

ICML 2025posterarXiv:2501.03884
10
citations
#2592

AdaRankGrad: Adaptive Gradient Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning

Yehonathan Refael, Jonathan Svirsky, Boris Shustin et al.

ICLR 2025posterarXiv:2410.17881
10
citations
#2593

Constrain Alignment with Sparse Autoencoders

Qingyu Yin, Chak Tou Leong, Hongbo Zhang et al.

ICML 2025posterarXiv:2411.07618
10
citations
#2594

STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding

Zichen Liu, Kunlun Xu, Bing Su et al.

CVPR 2025posterarXiv:2503.15973
10
citations
#2595

Atlas Gaussians Diffusion for 3D Generation

Haitao Yang, Yuan Dong, Hanwen Jiang et al.

ICLR 2025posterarXiv:2408.13055
10
citations
#2596

A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks

Thomas Schmied, Thomas Adler, Vihang Patil et al.

ICML 2025posterarXiv:2410.22391
10
citations
#2597

A Recipe for Generating 3D Worlds from a Single Image

Katja Schwarz, Denis Rozumny, Samuel Rota Bulò et al.

ICCV 2025posterarXiv:2503.16611
10
citations
#2598

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Yikang Zhou, Tao Zhang, Shilin Xu et al.

ICCV 2025posterarXiv:2501.04670
10
citations
#2599

VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation

Wenhao Wang, Yi Yang

NEURIPS 2025posterarXiv:2503.01739
10
citations
#2600

Consistency Checks for Language Model Forecasters

Daniel Paleka, Abhimanyu Pallavi Sudhir, Alejandro Alvarez et al.

ICLR 2025posterarXiv:2412.18544
10
citations