Most Cited 2025 "color correction" Papers

22,274 papers found • Page 18 of 112

#3401

IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model

Yatai Ji, Shilong Zhang, Jie Wu et al.

ICLR 2025posterarXiv:2407.07577
8
citations
#3402

CPathAgent: An Agent-based Foundation Model for Interpretable High-Resolution Pathology Image Analysis Mimicking Pathologists' Diagnostic Logic

YUXUAN SUN, Yixuan Si, Chenglu Zhu et al.

NEURIPS 2025posterarXiv:2505.20510
8
citations
#3403

Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization

Daniel Palenicek, Florian Vogt, Joe Watson et al.

NEURIPS 2025posterarXiv:2502.07523
8
citations
#3404

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Hyungjoo Chae, Seonghwan Kim, Junhee Cho et al.

NEURIPS 2025spotlightarXiv:2505.15277
8
citations
#3405

Causally Motivated Sycophancy Mitigation for Large Language Models

Haoxi Li, Xueyang Tang, Jie ZHANG et al.

ICLR 2025poster
8
citations
#3406

T2ICount: Enhancing Cross-modal Understanding for Zero-Shot Counting

Yifei Qian, Zhongliang Guo, Bowen Deng et al.

CVPR 2025highlightarXiv:2502.20625
8
citations
#3407

Focusing on Tracks for Online Multi-Object Tracking

Kyujin Shim, Kangwook Ko, YuJin Yang et al.

CVPR 2025poster
8
citations
#3408

Synthetic Prior for Few-Shot Drivable Head Avatar Inversion

Wojciech Zielonka, Stephan J. Garbin, Alexandros Lattas et al.

CVPR 2025posterarXiv:2501.06903
8
citations
#3409

DiffFNO: Diffusion Fourier Neural Operator

Xiaoyi Liu, Hao Tang

CVPR 2025posterarXiv:2411.09911
8
citations
#3410

Hypergraph Vision Transformers: Images are More than Nodes, More than Edges

Joshua Fixelle

CVPR 2025posterarXiv:2504.08710
8
citations
#3411

An Item Is Worth a Prompt: Versatile Image Editing with Disentangled Control

Aosong Feng, Weikang Qiu, Jinbin Bai et al.

AAAI 2025paperarXiv:2403.04880
8
citations
#3412

CAPrompt: Cyclic Prompt Aggregation for Pre-Trained Model Based Class Incremental Learning

Qiwei Li, Jiahuan Zhou

AAAI 2025paperarXiv:2412.08929
8
citations
#3413

OpenViewer: Openness-Aware Multi-View Learning

Shide Du, Zihan Fang, Yanchao Tan et al.

AAAI 2025paperarXiv:2412.12596
8
citations
#3414

How new data permeates LLM knowledge and how to dilute it

Chen Sun, Renat Aksitov, Andrey Zhmoginov et al.

ICLR 2025posterarXiv:2504.09522
8
citations
#3415

Unsupervised Audio-Visual Segmentation with Modality Alignment

Swapnil Bhosale, Haosen Yang, Diptesh Kanojia et al.

AAAI 2025paperarXiv:2403.14203
8
citations
#3416

Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens

Ting-Ji Huang, Jia-Qi Yang, Chunxu Shen et al.

ICML 2025posterarXiv:2406.08477
8
citations
#3417

Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning

Chongyi Zheng, Jens Tuyls, Joanne Peng et al.

ICLR 2025posterarXiv:2412.08021
8
citations
#3418

Noise Calibration and Spatial-Frequency Interactive Network for STEM Image Enhancement

Hesong Li, Ziqi Wu, Ruiwen Shao et al.

CVPR 2025posterarXiv:2504.02555
8
citations
#3419

DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability

Xirui Hu, Jiahao Wang, Hao chen et al.

ICCV 2025posterarXiv:2503.06505
8
citations
#3420

A General Framework for Producing Interpretable Semantic Text Embeddings

Yiqun Sun, Qiang Huang, Yixuan Tang et al.

ICLR 2025posterarXiv:2410.03435
8
citations
#3421

DAViD: Modeling Dynamic Affordance of 3D Objects Using Pre-trained Video Diffusion Models

Hyeonwoo Kim, Sangwon Baik, Hanbyul Joo

ICCV 2025posterarXiv:2501.08333
8
citations
#3422

Online Guidance Graph Optimization for Lifelong Multi-Agent Path Finding

Hongzhi Zang, Yulun Zhang, He Jiang et al.

AAAI 2025paperarXiv:2411.16506
8
citations
#3423

Conformal Prediction Sets Can Cause Disparate Impact

Jesse Cresswell, Bhargava Kumar, Yi Sui et al.

ICLR 2025posterarXiv:2410.01888
8
citations
#3424

LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently

Yuanhe Zhang, Fanghui Liu, Yudong Chen

ICML 2025oralarXiv:2502.01235
8
citations
#3425

Solving Inverse Problems with FLAIR

Julius Erbach, Dominik Narnhofer, Andreas Dombos et al.

NEURIPS 2025posterarXiv:2506.02680
8
citations
#3426

Unveiling Differences in Generative Models: A Scalable Differential Clustering Approach

Jingwei Zhang, Mohammad Jalali, Cheuk Ting Li et al.

CVPR 2025highlightarXiv:2405.02700
8
citations
#3427

AutoAdvExBench: Benchmarking Autonomous Exploitation of Adversarial Example Defenses

Nicholas Carlini, Edoardo Debenedetti, Javier Rando et al.

ICML 2025oralarXiv:2503.01811
8
citations
#3428

Self-Discriminative Modeling for Anomalous Graph Detection

Jinyu Cai, Yunhe Zhang, Jicong Fan

ICML 2025posterarXiv:2310.06261
8
citations
#3429

Data Unlearning in Diffusion Models

Silas Alberti, Kenan Hasanaliyev, Manav Shah et al.

ICLR 2025posterarXiv:2503.01034
8
citations
#3430

Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning

Gangwei Jiang, caigao jiang, Zhaoyi Li et al.

ICLR 2025posterarXiv:2502.11019
8
citations
#3431

Gaussian Mixture Flow Matching Models

Hansheng Chen, Kai Zhang, Hao Tan et al.

ICML 2025posterarXiv:2504.05304
8
citations
#3432

PRE-Mamba: A 4D State Space Model for Ultra-High-Frequent Event Camera Deraining

Ciyu Ruan, Ruishan Guo, Zihang GONG et al.

ICCV 2025posterarXiv:2505.05307
8
citations
#3433

Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning

Yinglun Xu, Qi Zeng, Gagandeep Singh

ICLR 2025posterarXiv:2205.14842
8
citations
#3434

Federated Domain Generalization with Data-free On-server Matching Gradient

Binh Nguyen, Minh-Duong Nguyen, Jinsun Park et al.

ICLR 2025posterarXiv:2501.14653
8
citations
#3435

Direct Alignment with Heterogeneous Preferences

Ali Shirali, Arash Nasr-Esfahany, Abdullah Alomar et al.

NEURIPS 2025posterarXiv:2502.16320
8
citations
#3436

(Almost Full) EFX for Three (and More) Types of Agents

Pratik Ghosal, Vishwa Prakash HV, Prajakta Nimbhorkar et al.

AAAI 2025paperarXiv:2301.10632
8
citations
#3437

Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo

Shengyu Feng, Xiang Kong, shuang ma et al.

ICLR 2025posterarXiv:2410.01920
8
citations
#3438

Face-Human-Bench: A Comprehensive Benchmark of Face and Human Understanding for Multi-modal Assistants

Lixiong Qin, Shilong Ou, Miaoxuan Zhang et al.

NEURIPS 2025posterarXiv:2501.01243
8
citations
#3439

UniPCGC: Towards Practical Point Cloud Geometry Compression via an Efficient Unified Approach

Kangli Wang, Wei Gao

AAAI 2025paperarXiv:2503.18541
8
citations
#3440

Free Hunch: Denoiser Covariance Estimation for Diffusion Models Without Extra Costs

Severi Rissanen, Markus Heinonen, Arno Solin

ICLR 2025posterarXiv:2410.11149
8
citations
#3441

Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation

Ziyan Wang, Yingpeng Du, Zhu Sun et al.

AAAI 2025paperarXiv:2403.16427
8
citations
#3442

A Closer Look at Multimodal Representation Collapse

Abhra Chaudhuri, Anjan Dutta, Tu Bui et al.

ICML 2025spotlightarXiv:2505.22483
8
citations
#3443

Incomplete Multi-view Deep Clustering with Data Imputation and Alignment

Jiyuan Liu, Xinwang Liu, Xinhang Wan et al.

NEURIPS 2025poster
8
citations
#3444

Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice

Jian-Qiao Zhu, Haijiang Yan, Thomas L. Griffiths

ICLR 2025oralarXiv:2405.19313
8
citations
#3445

LLMs Encode Harmfulness and Refusal Separately

Jiachen Zhao, Jing Huang, Zhengxuan Wu et al.

NEURIPS 2025posterarXiv:2507.11878
8
citations
#3446

PanTS: The Pancreatic Tumor Segmentation Dataset

Wenxuan Li, Xinze Zhou, Qi Chen et al.

NEURIPS 2025posterarXiv:2507.01291
8
citations
#3447

Diffusion-based Synthetic Data Generation for Visible-Infrared Person Re-Identification

Wenbo Dai, Lijing Lu, Zhihang Li

AAAI 2025paperarXiv:2503.12472
8
citations
#3448

Ada-K Routing: Boosting the Efficiency of MoE-based LLMs

Zijia Zhao, Longteng Guo, Jie Cheng et al.

ICLR 2025posterarXiv:2410.10456
8
citations
#3449

Compositional simulation-based inference for time series

Manuel Gloeckler, Shoji Toyota, Kenji Fukumizu et al.

ICLR 2025posterarXiv:2411.02728
8
citations
#3450

MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance

Zhixuan Chen, Xing Hu, Dawei Yang et al.

ICML 2025posterarXiv:2505.03804
8
citations
#3451

Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting

Anand Bhattad, Konpat Preechakul, Alexei Efros

NEURIPS 2025posterarXiv:2503.21770
8
citations
#3452

COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation

Xueqing Deng, Linjie Yang, Qihang Yu et al.

NEURIPS 2025posterarXiv:2502.02589
8
citations
#3453

FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding

Chongjun Tu, Lin Zhang, pengtao chen et al.

NEURIPS 2025oralarXiv:2503.14935
8
citations
#3454

Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments

Marharyta Domnich, Julius Välja, Rasmus Moorits Veski et al.

AAAI 2025paperarXiv:2410.21131
8
citations
#3455

RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training

Raktim Gautam Goswami, Prashanth Krishnamurthy, Yann LeCun et al.

CVPR 2025highlightarXiv:2411.17662
8
citations
#3456

MLZero: A Multi-Agent System for End-to-end Machine Learning Automation

Haoyang Fang, Boran Han, Nick Erickson et al.

NEURIPS 2025posterarXiv:2505.13941
8
citations
#3457

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

Mengru Wang, Xingyu Chen, Yue Wang et al.

NEURIPS 2025posterarXiv:2505.14681
8
citations
#3458

A transfer learning framework for weak to strong generalization

Seamus Somerstep, Felipe Maia Polo, Moulinath Banerjee et al.

ICLR 2025poster
8
citations
#3459

Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries

Chris Kolb, Tobias Weber, Bernd Bischl et al.

ICLR 2025posterarXiv:2502.02496
8
citations
#3460

Learning normalized image densities via dual score matching

Florentin Guth, Zahra Kadkhodaie, Eero Simoncelli

NEURIPS 2025posterarXiv:2506.05310
8
citations
#3461

Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function

Maria-Florina Balcan, Anh Nguyen, Dravyansh Sharma

NEURIPS 2025posterarXiv:2501.13734
8
citations
#3462

De-mark: Watermark Removal in Large Language Models

Ruibo Chen, Yihan Wu, Junfeng Guo et al.

ICML 2025posterarXiv:2410.13808
8
citations
#3463

Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers

Zeyuan Allen-Zhu

NEURIPS 2025posterarXiv:2512.17351
8
citations
#3464

VIoTGPT: Learning to Schedule Vision Tools Towards Intelligent Video Internet of Things

Yaoyao Zhong, Mengshi Qi, Rui Wang et al.

AAAI 2025paper
8
citations
#3465

Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model

Yingying Fan, Quanwei Yang, Kaisiyuan Wang et al.

CVPR 2025posterarXiv:2503.16942
8
citations
#3466

STAR: Synthesis of Tailored Architectures

Armin Thomas, Rom Parnichkun, Alexander Amini et al.

ICLR 2025posterarXiv:2411.17800
8
citations
#3467

Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation

François Rozet, Ruben Ohana, Michael McCabe et al.

NEURIPS 2025posterarXiv:2507.02608
8
citations
#3468

Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion

Alan Amin, Nate Gruver, Andrew Wilson

NEURIPS 2025posterarXiv:2506.08316
8
citations
#3469

Adaptive Learn-then-Test: Statistically Valid and Efficient Hyperparameter Selection

Matteo Zecchin, Sangwoo Park, Osvaldo Simeone

ICML 2025spotlightarXiv:2409.15844
8
citations
#3470

Sensor-Invariant Tactile Representation

Harsh Gupta, Yuchen Mo, Shengmiao Jin et al.

ICLR 2025posterarXiv:2502.19638
8
citations
#3471

DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry

Jing Li, Yihang Fu, Falai Chen

CVPR 2025posterarXiv:2503.13110
8
citations
#3472

INST-IT: Boosting Instance Understanding via Explicit Visual Prompt Instruction Tuning

Wujian Peng, Lingchen Meng, Yitong Chen et al.

NEURIPS 2025oralarXiv:2412.03565
8
citations
#3473

Fair Submodular Cover

Wenjing Chen, Shuo Xing, Samson Zhou et al.

ICLR 2025posterarXiv:2407.04804
8
citations
#3474

SIGMAN: Scaling 3D Human Gaussian Generation with Millions of Assets

Yuhang Yang, Fengqi Liu, Yixing Lu et al.

ICCV 2025poster
8
citations
#3475

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.

CVPR 2025poster
8
citations
#3476

DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery

Jiadong Tang, Yu Gao, Dianyi Yang et al.

CVPR 2025highlightarXiv:2503.16964
8
citations
#3477

Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences

Alan Amin, Nate Gruver, Yilun Kuang et al.

ICLR 2025posterarXiv:2412.07763
8
citations
#3478

Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model

Longrong Yang, Dong Shen, Chaoxiang Cai et al.

ICLR 2025posterarXiv:2406.19905
8
citations
#3479

ROD-MLLM: Towards More Reliable Object Detection in Multimodal Large Language Models

Heng Yin, Yuqiang Ren, Ke Yan et al.

CVPR 2025poster
8
citations
#3480

SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios

Kai Li, Wendi Sang, Chang Zeng et al.

ICLR 2025posterarXiv:2410.01481
8
citations
#3481

CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation

Jie Liu, Pan Zhou, Yingjun Du et al.

ICLR 2025posterarXiv:2411.04679
8
citations
#3482

BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning

Han Zhong, Yutong Yin, Shenao Zhang et al.

ICML 2025posterarXiv:2501.18858
8
citations
#3483

X-Fusion: Introducing New Modality to Frozen Large Language Models

Sicheng Mo, Thao Nguyen, Xun Huang et al.

ICCV 2025posterarXiv:2504.20996
8
citations
#3484

MUNBa: Machine Unlearning via Nash Bargaining

Jing Wu, Mehrtash Harandi

ICCV 2025posterarXiv:2411.15537
8
citations
#3485

Generating Physically Stable and Buildable Brick Structures from Text

Ava Pun, Kangle Deng, Ruixuan Liu et al.

ICCV 2025posterarXiv:2505.05469
8
citations
#3486

Motion-adaptive Transformer for Event-based Image Deblurring

Senyan Xu, Zhijing Sun, Mingchen Zhong et al.

AAAI 2025paper
8
citations
#3487

TG-LLaVA: Text Guided LLaVA via Learnable Latent Embeddings

Dawei Yan, Pengcheng Li, Yang Li et al.

AAAI 2025paperarXiv:2409.09564
8
citations
#3488

From Experts to a Generalist: Toward General Whole-Body Control for Humanoid Robots

Yuxuan Wang, Ming Yang, Gang Ding et al.

NEURIPS 2025oralarXiv:2506.12779
8
citations
#3489

VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

Li Kang, Xiufeng Song, Heng Zhou et al.

NEURIPS 2025posterarXiv:2506.09049
8
citations
#3490

HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts

Neil He, Rishabh Anand, Hiren Madhu et al.

NEURIPS 2025posterarXiv:2505.24722
8
citations
#3491

The Effectiveness of Curvature-Based Rewiring and the Role of Hyperparameters in GNNs Revisited

Floriano Tori, Vincent Holst, Vincent Ginis

ICLR 2025posterarXiv:2407.09381
8
citations
#3492

Injecting Universal Jailbreak Backdoors into LLMs in Minutes

Zhuowei Chen, qiannan zhang, Shichao Pei

ICLR 2025posterarXiv:2502.10438
8
citations
#3493

PICO: Reconstructing 3D People In Contact with Objects

Alpár Cseke, Shashank Tripathi, Sai Kumar Dwivedi et al.

CVPR 2025posterarXiv:2504.17695
8
citations
#3494

VideoOrion: Tokenizing Object Dynamics in Videos

Yicheng Feng, Yijiang Li, Wanpeng Zhang et al.

ICCV 2025posterarXiv:2411.16156
8
citations
#3495

Tartan IMU: A Light Foundation Model for Inertial Positioning in Robotics

Shibo Zhao, Sifan Zhou, Raphael Blanchard et al.

CVPR 2025poster
8
citations
#3496

From One to More: Contextual Part Latents for 3D Generation

Shaocong Dong, Lihe Ding, Xiao Chen et al.

ICCV 2025posterarXiv:2507.08772
8
citations
#3497

How do Transformers Learn Implicit Reasoning?

Jiaran Ye, Zijun Yao, Zhidian Huang et al.

NEURIPS 2025oralarXiv:2505.23653
8
citations
#3498

Expected Sliced Transport Plans

Xinran Liu, Rocio Diaz Martin, Yikun Bai et al.

ICLR 2025posterarXiv:2410.12176
8
citations
#3499

Geometry-aware RL for Manipulation of Varying Shapes and Deformable Objects

Tai Hoang, Huy Le, Philipp Becker et al.

ICLR 2025posterarXiv:2502.07005
8
citations
#3500

Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments

Yun Qu, Cheems Wang, Yixiu Mao et al.

ICML 2025posterarXiv:2504.19139
8
citations
#3501

Finer-CAM: Spotting the Difference Reveals Finer Details for Visual Explanation

Ziheng Zhang, Jianyang Gu, Arpita Chowdhury et al.

CVPR 2025posterarXiv:2501.11309
8
citations
#3502

Dehaze-RetinexGAN: Real-World Image Dehazing via Retinex-based Generative Adversarial Network

Xinran Wang, Guang Yang, Tian Ye et al.

AAAI 2025paper
8
citations
#3503

Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking

Paria Rashidinejad, Yuandong Tian

ICLR 2025posterarXiv:2412.09544
8
citations
#3504

DELIFT: Data Efficient Language model Instruction Fine-Tuning

Ishika Agarwal, Krishnateja Killamsetty, Lucian Popa et al.

ICLR 2025posterarXiv:2411.04425
8
citations
#3505

Trace3D: Consistent Segmentation Lifting via Gaussian Instance Tracing

Hongyu Shen, Junfeng Ni, Weishuo Li et al.

ICCV 2025posterarXiv:2508.03227
8
citations
#3506

VISION-XL: High Definition Video Inverse Problem Solver using Latent Image Diffusion Models

Taesung Kwon, Jong Ye

ICCV 2025posterarXiv:2412.00156
8
citations
#3507

From Specificity to Generality: Revisiting Generalizable Artifacts in Detecting Face Deepfakes

Long Ma, Zhiyuan Yan, Jin Xu et al.

NEURIPS 2025posterarXiv:2504.04827
8
citations
#3508

NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping

Tianyi Wang, Shuaicheng Niu, Harry Cheng et al.

ICCV 2025posterarXiv:2503.18678
8
citations
#3509

MammAlps: A Multi-view Video Behavior Monitoring Dataset of Wild Mammals in the Swiss Alps

Valentin Gabeff, Haozhe Qi, Brendan Flaherty et al.

CVPR 2025highlightarXiv:2503.18223
8
citations
#3510

Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection

Chenxu Wang, Chunyan Xu, Xiang Li et al.

AAAI 2025paperarXiv:2407.05909
8
citations
#3511

RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images

Benzhi Wang, Jingkai Zhou, Jingqi Bai et al.

AAAI 2025paperarXiv:2409.03644
8
citations
#3512

LIFe-GoM: Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh

Jing Wen, Alex Schwing, Shenlong Wang

ICLR 2025posterarXiv:2502.09617
8
citations
#3513

Secant Line Search for Frank-Wolfe Algorithms

Deborah Hendrych, Sebastian Pokutta, Mathieu Besançon et al.

ICML 2025posterarXiv:2501.18775
8
citations
#3514

Combining Cost Constrained Runtime Monitors for AI Safety

Tim Hua, James Baskerville, Henri Lemoine et al.

NEURIPS 2025posterarXiv:2507.15886
8
citations
#3515

GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training

Tong Wei, Yijun Yang, Junliang Xing et al.

ICCV 2025posterarXiv:2503.08525
8
citations
#3516

Effective and Efficient Masked Image Generation Models

Zebin You, Jingyang Ou, Xiaolu Zhang et al.

ICML 2025posterarXiv:2503.07197
8
citations
#3517

DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data

Ruiqi Wu, Xinjie wang, Liu.Liu et al.

NEURIPS 2025posterarXiv:2505.20460
8
citations
#3518

Modality-Specialized Synergizers for Interleaved Vision-Language Generalists

Zhiyang Xu, Minqian Liu, Ying Shen et al.

ICLR 2025posterarXiv:2407.03604
8
citations
#3519

Fast and Slow Streams for Online Time Series Forecasting Without Information Leakage

Ying-yee Ava Lau, Zhiwen Shao, Dit-Yan Yeung

ICLR 2025oral
8
citations
#3520

Learning Chaos In A Linear Way

Xiaoyuan Cheng, Yi He, Yiming Yang et al.

ICLR 2025posterarXiv:2503.14702
8
citations
#3521

Do Visual Imaginations Improve Vision-and-Language Navigation Agents?

Akhil Perincherry, Jacob Krantz, Stefan Lee

CVPR 2025posterarXiv:2503.16394
8
citations
#3522

PhysSplat: Efficient Physics Simulation for 3D Scenes via MLLM-Guided Gaussian Splatting

Haoyu Zhao, Hao Wang, Xingyue Zhao et al.

ICCV 2025poster
8
citations
#3523

Data Taggants: Dataset Ownership Verification Via Harmless Targeted Data Poisoning

Wassim Bouaziz, Nicolas Usunier, El-Mahdi El-Mhamdi

ICLR 2025posterarXiv:2410.09101
8
citations
#3524

ZIM: Zero-Shot Image Matting for Anything

Beomyoung Kim, Chanyong Shin, Joonhyun Jeong et al.

ICCV 2025highlightarXiv:2411.00626
8
citations
#3525

Temporal Heterogeneous Graph Generation with Privacy, Utility, and Efficiency

Xinyu He, Dongqi Fu, Hanghang Tong et al.

ICLR 2025oral
8
citations
#3526

Autonomous LLM-Enhanced Adversarial Attack for Text-to-Motion

Honglei Miao, Fan Ma, Ruijie Quan et al.

AAAI 2025paperarXiv:2408.00352
8
citations
#3527

Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration

Qinglin Zhu, Runcong Zhao, Hanqi Yan et al.

ICML 2025spotlightarXiv:2505.24688
8
citations
#3528

Gradient descent with generalized Newton’s method

Zhiqi Bu, Shiyun Xu

ICLR 2025posterarXiv:2407.02772
8
citations
#3529

Bundle Neural Network for message diffusion on graphs

Jacob Bamberger, Federico Barbero, Xiaowen Dong et al.

ICLR 2025posterarXiv:2405.15540
8
citations
#3530

Differentially Private Steering for Large Language Model Alignment

Anmol Goel, Yaxi Hu, Iryna Gurevych et al.

ICLR 2025posterarXiv:2501.18532
8
citations
#3531

AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer

Jin Lyu, Tianyi Zhu, Yi Gu et al.

CVPR 2025posterarXiv:2412.00837
8
citations
#3532

DataMan: Data Manager for Pre-training Large Language Models

Ru Peng, Kexin Yang, Yawen Zeng et al.

ICLR 2025posterarXiv:2502.19363
8
citations
#3533

Can Textual Gradient Work in Federated Learning?

Minghui Chen, Ruinan Jin, Wenlong Deng et al.

ICLR 2025posterarXiv:2502.19980
8
citations
#3534

DiC: Rethinking Conv3x3 Designs in Diffusion Models

Yuchuan Tian, Jing Han, Chengcheng Wang et al.

CVPR 2025posterarXiv:2501.00603
8
citations
#3535

CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation

Matan Rusanovsky, Or Hirschorn, Shai Avidan

ICLR 2025posterarXiv:2406.00384
8
citations
#3536

HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation

Tengfei Liu, Jiapu Wang, Yongli Hu et al.

AAAI 2025paperarXiv:2412.11070
8
citations
#3537

ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention

Bencheng Liao, Xinggang Wang, Lianghui Zhu et al.

AAAI 2025paperarXiv:2405.18425
8
citations
#3538

LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS

Wanhua Li, Yujie Zhao, Minghan Qin et al.

NEURIPS 2025posterarXiv:2507.07136
8
citations
#3539

SfM-Free 3D Gaussian Splatting via Hierarchical Training

Bo Ji, Angela Yao

CVPR 2025posterarXiv:2412.01553
8
citations
#3540

Adversarial Generative Flow Network for Solving Vehicle Routing Problems

Ni Zhang, Jingfeng Yang, Zhiguang Cao et al.

ICLR 2025posterarXiv:2503.01931
8
citations
#3541

Incomplete Modality Disentangled Representation for Ophthalmic Disease Grading and Diagnosis

Chengzhi Liu, Zile Huang, Zhe Chen et al.

AAAI 2025paperarXiv:2502.11724
8
citations
#3542

GaussMark: A Practical Approach for Structural Watermarking of Language Models

Adam Block, Alexander Rakhlin, Ayush Sekhari

ICML 2025posterarXiv:2501.13941
8
citations
#3543

Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models

Reza Shirkavand, Peiran Yu, Shangqian Gao et al.

CVPR 2025posterarXiv:2412.15341
8
citations
#3544

DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding

Henry Zheng, Hao Shi, Qihang Peng et al.

ICLR 2025posterarXiv:2505.04965
8
citations
#3545

SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent

Yandan Yang, Baoxiong Jia, Shujie Zhang et al.

NEURIPS 2025posterarXiv:2509.20414
8
citations
#3546

ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL

Yang Qin, Chao Chen, Zhihang Fu et al.

ICLR 2025posterarXiv:2412.10138
8
citations
#3547

Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection

Fanhu Zeng, Zhen Cheng, Fei Zhu et al.

ICLR 2025posterarXiv:2409.04796
8
citations
#3548

Micro-macro Wavelet-based Gaussian Splatting for 3D Reconstruction from Unconstrained Images

Yihui Li, Chengxin Lv, Hongyu Yang et al.

AAAI 2025paperarXiv:2501.14231
8
citations
#3549

Instruction-based Image Manipulation by Watching How Things Move

Mingdeng Cao, Xuaner Zhang, Yinqiang Zheng et al.

CVPR 2025highlightarXiv:2412.12087
8
citations
#3550

MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities

Kunxi Li, Tianyu Zhan, Kairui Fu et al.

AAAI 2025paperarXiv:2404.13322
8
citations
#3551

DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution

Zhengxue Wang, Zhiqiang Yan, Jinshan Pan et al.

CVPR 2025posterarXiv:2410.11666
8
citations
#3552

Efficient stagewise pretraining via progressive subnetworks

Abhishek Panigrahi, Nikunj Saunshi, Kaifeng Lyu et al.

ICLR 2025posterarXiv:2402.05913
8
citations
#3553

Feature Denoising Diffusion Model for Blind Image Quality Assessment

Xudong Li, Yan Zhang, Yunhang Shen et al.

AAAI 2025paperarXiv:2401.11949
8
citations
#3554

ForestFormer3D: A Unified Framework for End-to-End Segmentation of Forest LiDAR 3D Point Clouds

Binbin Xiang, Maciej Wielgosz, Stefano Puliti et al.

ICCV 2025posterarXiv:2506.16991
8
citations
#3555

VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning

Ji Soo Lee, Jongha Kim, Jeehye Na et al.

AAAI 2025paperarXiv:2501.06761
8
citations
#3556

vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation

Bastian Wittmann, Yannick Wattenberg, Tamaz Amiranashvili et al.

CVPR 2025posterarXiv:2411.17386
8
citations
#3557

Rashomon Sets for Prototypical-Part Networks: Editing Interpretable Models in Real-Time

Jon Donnelly, Zhicheng Guo, Alina Jade Barnett et al.

CVPR 2025posterarXiv:2503.01087
8
citations
#3558

Beyond Sequence: Impact of Geometric Context for RNA Property Prediction

Junjie Xu, Artem Moskalev, Tommaso Mansi et al.

ICLR 2025posterarXiv:2410.11933
8
citations
#3559

Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects

Weimin Qiu, Jieke Wang, Meng Tang

CVPR 2025posterarXiv:2411.18936
8
citations
#3560

AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

Run He, Kai Tong, Di Fang et al.

CVPR 2025posterarXiv:2405.16240
8
citations
#3561

Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift

Seongho Son, William Bankes, Sayak Ray Chowdhury et al.

ICML 2025oralarXiv:2407.18676
8
citations
#3562

EdgeTAM: On-Device Track Anything Model

Chong Zhou, Chenchen Zhu, Yunyang Xiong et al.

CVPR 2025posterarXiv:2501.07256
8
citations
#3563

Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law

Frederik Kunstner, Francis Bach

NEURIPS 2025posterarXiv:2505.19227
8
citations
#3564

PhysX-3D: Physical-Grounded 3D Asset Generation

Ziang Cao, Zhaoxi Chen, Liang Pan et al.

NEURIPS 2025spotlightarXiv:2507.12465
8
citations
#3565

LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation

Can Jin, Ying Li, Mingyu Zhao et al.

ICLR 2025posterarXiv:2502.00896
8
citations
#3566

Near, far: Patch-ordering enhances vision foundation models' scene understanding

Valentinos Pariza, Mohammadreza Salehi, Gertjan J Burghouts et al.

ICLR 2025posterarXiv:2408.11054
8
citations
#3567

Decoding Game: On Minimax Optimality of Heuristic Text Generation Strategies

Sijin Chen, Omar Hagrass, Jason Klusowski

ICLR 2025posterarXiv:2410.03968
8
citations
#3568

Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling

Michal Balcerak, Tamaz Amiranashvili, Antonio Terpin et al.

NEURIPS 2025posterarXiv:2504.10612
8
citations
#3569

DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback

Zaid Khan, Elias Stengel-Eskin, Jaemin Cho et al.

ICLR 2025posterarXiv:2410.06215
8
citations
#3570

On the Expressiveness of Rational ReLU Neural Networks With Bounded Depth

Gennadiy Averkov, Christopher Hojny, Maximilian Merkert

ICLR 2025posterarXiv:2502.06283
8
citations
#3571

UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

Zixuan Chen, Yujin Wang, Xin Cai et al.

CVPR 2025highlightarXiv:2501.11515
8
citations
#3572

CONTRA: Conformal Prediction Region via Normalizing Flow Transformation

Zhenhan FANG, Aixin Tan, Jian Huang

ICLR 2025poster
8
citations
#3573

StateSpaceDiffuser: Bringing Long Context to Diffusion World Models

Nedko Savov, Naser Kazemi, Deheng Zhang et al.

NEURIPS 2025oralarXiv:2505.22246
8
citations
#3574

Information-Driven Design of Imaging Systems

Henry Pinkard, Leyla Kabuli, Eric Markley et al.

NEURIPS 2025posterarXiv:2405.20559
8
citations
#3575

Chain-of-region: Visual Language Models Need Details for Diagram Analysis

Xue Li, Yiyou Sun, Wei Cheng et al.

ICLR 2025poster
8
citations
#3576

RAD: Region-Aware Diffusion Models for Image Inpainting

Sora Kim, Sungho Suh, Minsik Lee

CVPR 2025posterarXiv:2412.09191
8
citations
#3577

As large as it gets – Studying Infinitely Large Convolutions via Neural Implicit Frequency Filters

Margret Keuper, Julia Grabinski, Janis Keuper

ICLR 2025poster
8
citations
#3578

Show and Segment: Universal Medical Image Segmentation via In-Context Learning

Yunhe Gao, Di Liu, Zhuowei Li et al.

CVPR 2025posterarXiv:2503.19359
8
citations
#3579

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Zichen Wen, Shaobo Wang, Yufa Zhou et al.

NEURIPS 2025posterarXiv:2510.00515
8
citations
#3580

GeoSplatting: Towards Geometry Guided Gaussian Splatting for Physically-based Inverse Rendering

Kai Ye, Chong Gao, Guanbin Li et al.

ICCV 2025posterarXiv:2410.24204
8
citations
#3581

MANTRA: The Manifold Triangulations Assemblage

Rubén Ballester, Ernst Roell, Daniel Bin Schmid et al.

ICLR 2025posterarXiv:2410.02392
8
citations
#3582

BANet: Bilateral Aggregation Network for Mobile Stereo Matching

Gangwei Xu, Jiaxin Liu, Xianqi Wang et al.

ICCV 2025posterarXiv:2503.03259
8
citations
#3583

Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models

Ángela López-Cardona, Carlos Segura, Alexandros Karatzoglou et al.

ICLR 2025posterarXiv:2410.01532
8
citations
#3584

DynamicVL: Benchmarking Multimodal Large Language Models for Dynamic City Understanding

Weihao Xuan, Junjue Wang, Heli Qi et al.

NEURIPS 2025oralarXiv:2505.21076
8
citations
#3585

V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer

Hangzhou He, Lei Zhu, Xinliang Zhang et al.

AAAI 2025paperarXiv:2501.04975
8
citations
#3586

Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition

Chengxiang Huang, Yake Wei, Zequn Yang et al.

CVPR 2025posterarXiv:2503.18595
8
citations
#3587

Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning

Joey Hong, Anca Dragan, Sergey Levine

ICLR 2025posterarXiv:2411.05193
8
citations
#3588

Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation

Akshay Krishnan, Xinchen Yan, Vincent Casser et al.

ICCV 2025posterarXiv:2501.13087
8
citations
#3589

Quantum-PEFT: Ultra parameter-efficient fine-tuning

Toshiaki Koike-Akino, Francesco Tonin, Yongtao Wu et al.

ICLR 2025posterarXiv:2503.05431
8
citations
#3590

Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views

Jiang Wu, Rui Li, Yu Zhu et al.

CVPR 2025posterarXiv:2504.20378
8
citations
#3591

SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning

Jiaqi Huang, Zunnan Xu, Jun Zhou et al.

NEURIPS 2025posterarXiv:2505.22596
8
citations
#3592

TopoDiffusionNet: A Topology-aware Diffusion Model

Saumya Gupta, Dimitris Samaras, Chao Chen

ICLR 2025posterarXiv:2410.16646
8
citations
#3593

SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback

Jingsheng Gao, Linxu Li, Ke Ji et al.

ICLR 2025posterarXiv:2410.18141
8
citations
#3594

Joint Out-of-Distribution Filtering and Data Discovery Active Learning

Sebastian Schmidt, Leonard Schenk, Leo Schwinn et al.

CVPR 2025posterarXiv:2503.02491
8
citations
#3595

Generative Zero-Shot Composed Image Retrieval

Lan Wang, Wei Ao, Vishnu Naresh Boddeti et al.

CVPR 2025poster
8
citations
#3596

Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks

Lukas Braun, Erin Grant, Andrew Saxe

ICML 2025spotlight
8
citations
#3597

Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface

Wenyue Hua, Mengting Wan, JAGANNATH VADREVU et al.

ICLR 2025posterarXiv:2410.00079
8
citations
#3598

Multi-Granular Multimodal Clue Fusion for Meme Understanding

Li Zheng, Hao Fei, Ting Dai et al.

AAAI 2025paperarXiv:2503.12560
8
citations
#3599

Scaling Embedding Layers in Language Models

Da Yu, Edith Cohen, Badih Ghazi et al.

NEURIPS 2025posterarXiv:2502.01637
8
citations
#3600

Whole-Body Conditioned Egocentric Video Prediction

Yutong Bai, Danny Tran, Amir Bar et al.

NEURIPS 2025posterarXiv:2506.21552
8
citations