Most Cited 2025 "latent dimension alignment" Papers

22,274 papers found • Page 23 of 112

Filters:Most Cited 2025 latent dimension alignment Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

#4401

FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations

Hmrishav Bandyopadhyay, Yi-Zhe Song

CVPR 2025arXiv:2411.10818

citations

#4402

RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings

Aayush Dhakal, Srikumar Sastry, Subash Khanal et al.

CVPR 2025arXiv:2502.19781

citations

#4403

DiffRetouch: Using Diffusion to Retouch on the Shoulder of Experts

Zheng-Peng Duan, Jiawei Zhang, Zheng Lin et al.

AAAI 2025paperarXiv:2407.03757

citations

#4404

Auto-Regressive Diffusion for Generating 3D Human-Object Interactions

Zichen Geng, Zeeshan Hayder, Wei Liu et al.

AAAI 2025paperarXiv:2503.16801

citations

#4405

Revisiting a Design Choice in Gradient Temporal Difference Learning

Xiaochi Qian, Shangtong Zhang

ICLR 2025oralarXiv:2308.01170

citations

#4406

Autocorrelation Matters: Understanding the Role of Initialization Schemes for State Space Models

Fusheng Liu, Qianxiao Li

ICLR 2025oralarXiv:2411.19455

citations

#4407

Augmented Deep Contexts for Spatially Embedded Video Coding

Yifan Bian, Chuanbo Tang, Li Li et al.

CVPR 2025highlightarXiv:2505.05309

citations

#4408

QT-DoG: Quantization-Aware Training for Domain Generalization

Saqib Javed, Hieu Le, Mathieu Salzmann

ICML 2025arXiv:2410.06020

citations

#4409

PEACE: Empowering Geologic Map Holistic Understanding with MLLMs

Yangyu Huang, Tianyi Gao, Haoran Xu et al.

CVPR 2025arXiv:2501.06184

citations

#4410

Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process

Jing Yang

ICML 2025arXiv:2502.00874

citations

#4411

Unisolver: PDE-Conditional Transformers Towards Universal Neural PDE Solvers

Hang Zhou, Yuezhou Ma, Haixu Wu et al.

ICML 2025arXiv:2405.17527

citations

#4412

Toward Efficient Kernel-Based Solvers for Nonlinear PDEs

Zhitong Xu, Da Long, Yiming Xu et al.

ICML 2025arXiv:2410.11165

citations

#4413

H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving

Siran Chen, Yuxiao Luo, Yue Ma et al.

AAAI 2025paperarXiv:2501.04302

citations

#4414

Large Language Models Think Too Fast To Explore Effectively

Lan Pan, Hanbo Xie, Robert Wilson

NEURIPS 2025arXiv:2501.18009

citations

#4415

Stable-SCore: A Stable Registration-based Framework for 3D Shape Correspondence

Haolin Liu, Xiaohang Zhan, Zizheng Yan et al.

CVPR 2025arXiv:2503.21766

citations

#4416

MindSimulator: Exploring Brain Concept Localization via Synthetic fMRI

Qi Zhang, Qi Zhang, Zixuan Gong et al.

ICLR 2025arXiv:2503.02351

citations

#4417

Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets

Benjamin Dupuis, Paul Viallard, George Deligiannidis et al.

NEURIPS 2025arXiv:2404.17442

citations

#4418

Understanding and Mitigating Memorization in Diffusion Models for Tabular Data

Zhengyu Fang, Zhimeng Jiang, Huiyuan Chen et al.

ICML 2025arXiv:2412.11044

citations

#4419

RealEdit: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations

Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim et al.

CVPR 2025arXiv:2502.03629

citations

#4420

The emergence of sparse attention: impact of data distribution and benefits of repetition

Nicolas Zucchet, Francesco D'Angelo, Andrew Lampinen et al.

NEURIPS 2025oralarXiv:2505.17863

citations

#4421

Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection

Herun Wan, Jiaying Wu, Minnan Luo et al.

NEURIPS 2025arXiv:2506.02350

citations

#4422

Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks?

Amirhesam Abedsoltan, Huaqing Zhang, Kaiyue Wen et al.

ICML 2025arXiv:2502.08991

citations

#4423

FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models

Jintao Tong, Wenwei Jin, Pengda Qin et al.

NEURIPS 2025arXiv:2505.19536

citations

#4424

DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction

Ben Kaye, Tomas Jakab, Shangzhe Wu et al.

CVPR 2025highlightarXiv:2412.04464

citations

#4425

Token Perturbation Guidance for Diffusion Models

Javad Rajabi, Soroush Mehraban, Seyedmorteza Sadat et al.

NEURIPS 2025arXiv:2506.10036

citations

#4426

WaterDiffusion: Learning a Prior-involved Unrolling Diffusion for Joint Underwater Saliency Detection and Visual Restoration

Laibin Chang, Yunke Wang, Longxiang Deng et al.

AAAI 2025paper

citations

#4427

Generative Sparse-View Gaussian Splatting

Hanyang Kong, Xingyi Yang, Xinchao Wang

CVPR 2025

citations

#4428

ProtoArgNet: Interpretable Image Classification with Super-Prototypes and Argumentation

Hamed Ayoobi, Nico Potyka, Francesca Toni

AAAI 2025paperarXiv:2311.15438

citations

#4429

PROXSPARSE: REGULARIZED LEARNING OF SEMI-STRUCTURED SPARSITY MASKS FOR PRETRAINED LLMS

Hongyi Liu, Rajarshi Saha, Zhen Jia et al.

ICML 2025arXiv:2502.00258

citations

#4430

Exploit Your Latents: Coarse-Grained Protein Backmapping with Latent Diffusion Models

Rongchao Zhang, Yu Huang, Yiwei Lou et al.

AAAI 2025paper

citations

#4431

StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer

ruojun xu, Weijie Xi, Xiaodi Wang et al.

CVPR 2025highlightarXiv:2501.11319

citations

#4432

Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA

Shuangyi Chen, Yuanxin Guo, Yue Ju et al.

NEURIPS 2025arXiv:2502.01755

citations

#4433

4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians

Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison

CVPR 2025arXiv:2505.22859

citations

#4434

DEALing with Image Reconstruction: Deep Attentive Least Squares

Mehrsa Pourya, Erich Kobler, Michael Unser et al.

ICML 2025arXiv:2502.04079

citations

#4435

Manipulating Feature Visualizations with Gradient Slingshots

Dilyara Bareeva, Marina Höhne, Alexander Warnecke et al.

NEURIPS 2025arXiv:2401.06122

citations

#4436

TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting

Liangbin Xie, Daniil Pakhomov, Zhonghao Wang et al.

CVPR 2025arXiv:2504.00996

citations

#4437

IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning

Quan Zhang, Yuxin Qi, Xi Tang et al.

ICLR 2025arXiv:2502.02454

citations

#4438

Flowing Datasets with Wasserstein over Wasserstein Gradient Flows

Clément Bonet, Christophe Vauthier, Anna Korba

ICML 2025oralarXiv:2506.07534

citations

#4439

Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization

Mingzhe Du, Anh Tuan Luu, Yue Liu et al.

NEURIPS 2025arXiv:2505.23387

citations

#4440

PhysAug: A Physical-guided and Frequency-based Data Augmentation for Single-Domain Generalized Object Detection

Xiaoran Xu, Jiangang Yang, Wenhui Shi et al.

AAAI 2025paperarXiv:2412.11807

citations

#4441

GNNs Getting ComFy: Community and Feature Similarity Guided Rewiring

Celia Rubio-Madrigal, Adarsh Jamadandi, Rebekka Burkholz

ICLR 2025arXiv:2502.04891

citations

#4442

Cached Multi-Lora Composition for Multi-Concept Image Generation

Xiandong Zou, Mingzhu Shen, Christos-Savvas Bouganis et al.

ICLR 2025arXiv:2502.04923

citations

#4443

MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights

Jingjing Hu, Dan Guo, Zhan Si et al.

AAAI 2025paperarXiv:2412.16483

citations

#4444

Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations

Pengcheng Jiang, Cao Xiao, Tianfan Fu et al.

AAAI 2025paperarXiv:2306.01631

citations

#4445

StableCodec: Taming One-Step Diffusion for Extreme Image Compression

Tianyu Zhang, Xin Luo, Li Li et al.

ICCV 2025arXiv:2506.21977

citations

#4446

BrainOOD: Out-of-distribution Generalizable Brain Network Analysis

Jiaxing Xu, Yongqiang Chen, Xia Dong et al.

ICLR 2025arXiv:2502.01688

citations

#4447

Hypergraph Attacks via Injecting Homogeneous Nodes into Elite Hyperedges

Meixia He, Peican Zhu, Keke Tang et al.

AAAI 2025paperarXiv:2412.18365

citations

#4448

Are Expressive Models Truly Necessary for Offline RL?

Guan Wang, Haoyi Niu, Jianxiong Li et al.

AAAI 2025paperarXiv:2412.11253

citations

#4449

Multi-modal Knowledge Distillation-based Human Trajectory Forecasting

Jaewoo Jeong, Seohee Lee, Daehee Park et al.

CVPR 2025arXiv:2503.22201

citations

#4450

CSformer: Combining Channel Independence and Mixing for Robust Multivariate Time Series Forecasting

Haoxin Wang, Yipeng Mo, Kunlan Xiang et al.

AAAI 2025paperarXiv:2312.06220

citations

#4451

FluxSpace: Disentangled Semantic Editing in Rectified Flow Models

Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag

CVPR 2025

citations

#4452

DualCP: Rehearsal-Free Domain-Incremental Learning via Dual-Level Concept Prototype

Qiang Wang, Yuhang He, Songlin Dong et al.

AAAI 2025paperarXiv:2503.18042

citations

#4453

End-to-end Learning of Gaussian Mixture Priors for Diffusion Sampler

Denis Blessing, Xiaogang Jia, Gerhard Neumann

ICLR 2025arXiv:2503.00524

citations

#4454

SDGOCC: Semantic and Depth-Guided Bird's-Eye View Transformation for 3D Multimodal Occupancy Prediction

ZaiPeng Duan, Xuzhong Hu, Pei An et al.

CVPR 2025arXiv:2507.17083

citations

#4455

Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities

Liuyi Wang, Xinyuan Xia, Hui Zhao et al.

ICCV 2025arXiv:2507.13019

citations

#4456

Locally Convex Global Loss Network for Decision-Focused Learning

Haeun Jeon, Hyunglip Bae, Minsu Park et al.

AAAI 2025paperarXiv:2403.01875

citations

#4457

Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation

Liliang Ren, Congcong Chen, Haoran Xu et al.

NEURIPS 2025arXiv:2507.06607

citations

#4458

Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning

Runchuan Zhu, Zhipeng Ma, Jiang Wu et al.

AAAI 2025paperarXiv:2410.06913

citations

#4459

Expressivity of Neural Networks with Random Weights and Learned Biases

Ezekiel Williams, Alexandre Payeur, Avery Ryoo et al.

ICLR 2025arXiv:2407.00957

citations

#4460

Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory

Aymane El Firdoussi, Mohamed El Amine Seddik, Soufiane Hayou et al.

ICLR 2025arXiv:2410.08942

citations

#4461

AdaFisher: Adaptive Second Order Optimization via Fisher Information

Damien GOMES, Yanlei Zhang, Eugene Belilovsky et al.

ICLR 2025arXiv:2405.16397

citations

#4462

T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data

Hugo Thimonier, José Lucas De Melo Costa, Fabrice Popineau et al.

ICLR 2025arXiv:2410.05016

citations

#4463

COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection

Jinqi Xiao, Shen Sang, Tiancheng Zhi et al.

CVPR 2025arXiv:2412.00071

citations

#4464

LVFace: Progressive Cluster Optimization for Large Vision Models in Face Recognition

Jinghan You, Shanglin Li, Yuanrui Sun et al.

ICCV 2025highlightarXiv:2501.13420

citations

#4465

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations

Junli Liu, Qizhi Chen, Zhigang Wang et al.

ICCV 2025arXiv:2504.07836

citations

#4466

Feature Responsiveness Scores: Model-Agnostic Explanations for Recourse

Seung Hyun Cheon, Anneke Wernerfelt, Sorelle Friedler et al.

ICLR 2025arXiv:2410.22598

citations

#4467

6D Object Pose Tracking in Internet Videos for Robotic Manipulation

Georgy Ponimatkin, Martin Cífka, Tomas Soucek et al.

ICLR 2025oralarXiv:2503.10307

citations

#4468

CODA: Repurposing Continuous VAEs for Discrete Tokenization

Zeyu Liu, Zanlin Ni, Yeguo Hua et al.

ICCV 2025arXiv:2503.17760

citations

#4469

Aligning Language Models Using Follow-up Likelihood as Reward Signal

Chen Zhang, Dading Chong, Feng Jiang et al.

AAAI 2025paperarXiv:2409.13948

citations

#4470

Runtime Analysis for Multi-Objective Evolutionary Algorithms in Unbounded Integer Spaces

Benjamin Doerr, Martin S. Krejca, Günter Rudolph

AAAI 2025paperarXiv:2412.11684

citations

#4471

R-LiViT: A LiDAR-Visual-Thermal Dataset Enabling Vulnerable Road User Focused Roadside Perception

Jonas Mirlach, Lei Wan, Andreas Wiedholz et al.

ICCV 2025arXiv:2503.17122

citations

#4472

ReDit: Reward Dithering for Improved LLM Policy Optimization

Chenxing Wei, Jiarui Yu, Ying He et al.

NEURIPS 2025arXiv:2506.18631

citations

#4473

Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning

Hung Le, Dung Nguyen, Kien Do et al.

ICLR 2025arXiv:2410.10132

citations

#4474

DeRIS: Decoupling Perception and Cognition for Enhanced Referring Image Segmentation through Loopback Synergy

Ming Dai, Wenxuan Cheng, Jiang-Jiang Liu et al.

ICCV 2025arXiv:2507.01738

citations

#4475

CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image

Wonseok Roh, Hwanhee Jung, JongWook Kim et al.

ICCV 2025arXiv:2412.12906

citations

#4476

ZeroStereo: Zero-shot Stereo Matching from Single Images

Xianqi Wang, Hao Yang, Gangwei Xu et al.

ICCV 2025arXiv:2501.08654

citations

#4477

Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models

Jun Zhang, Jue Wang, Huan Li et al.

ICLR 2025arXiv:2502.13533

citations

#4478

Cross-modal Ship Re-Identification via Optical and SAR Imagery: A Novel Dataset and Method

Han Wang, Shengyang Li, Jian Yang et al.

ICCV 2025arXiv:2506.22027

citations

#4479

How Do I Do That? Synthesizing 3D Hand Motion and Contacts for Everyday Interactions

Aditya Prakash, Benjamin E Lundell, Dmitry Andreychuk et al.

CVPR 2025highlightarXiv:2504.12284

citations

#4480

GroupMamba: Efficient Group-Based Visual State Space Model

Abdelrahman Shaker, Syed Talal Wasim, Salman Khan et al.

CVPR 2025arXiv:2407.13772

citations

#4481

CLIPDrag: Combining Text-based and Drag-based Instructions for Image Editing

Ziqi Jiang, Zhen Wang, Long Chen

ICLR 2025arXiv:2410.03097

citations

#4482

UV-Attack: Physical-World Adversarial Attacks on Person Detection via Dynamic-NeRF-based UV Mapping

Yanjie Li, Kaisheng Liang, Bin Xiao

ICLR 2025arXiv:2501.05783

citations

#4483

RadarSplat: Radar Gaussian Splatting for High-Fidelity Data Synthesis and 3D Reconstruction of Autonomous Driving Scenes

Pou-Chun Kung, Skanda Harisha, Ram Vasudevan et al.

ICCV 2025arXiv:2506.01379

citations

#4484

Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment

ying ba, Tianyu Zhang, Yalong Bai et al.

ICCV 2025arXiv:2507.19002

citations

#4485

3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation

Gyeongrok Oh, Sung June Kim, Heeju Ko et al.

CVPR 2025arXiv:2503.15185

citations

#4486

Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning

Yu Zhang, Jialei Zhou, Xinchen Li et al.

NEURIPS 2025arXiv:2505.19261

citations

#4487

Unleashing the Potential of Vision-Language Pre-Training for 3D Zero-Shot Lesion Segmentation via Mask-Attribute Alignment

Yankai Jiang, Wenhui Lei, Xiaofan Zhang et al.

ICLR 2025arXiv:2410.15744

citations

#4488

Learning with Calibration: Exploring Test-Time Computing of Spatio-Temporal Forecasting

Wei Chen, Yuxuan Liang

NEURIPS 2025oralarXiv:2506.00635

citations

#4489

SUMI-IFL: An Information-Theoretic Framework for Image Forgery Localization with Sufficiency and Minimality Constraints

Ziqi Sheng, Wei Lu, Xiangyang Luo et al.

AAAI 2025paperarXiv:2412.09981

citations

#4490

MobileIE: An Extremely Lightweight and Effective ConvNet for Real-Time Image Enhancement on Mobile Devices

HAILONG YAN, Ao Li, Xiangtao Zhang et al.

ICCV 2025arXiv:2507.01838

citations

#4491

From Easy to Hard: Progressive Active Learning Framework for Infrared Small Target Detection with Single Point Supervision

Chuang Yu, Jinmiao Zhao, Yunpeng Liu et al.

ICCV 2025arXiv:2412.11154

citations

#4492

SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images

Yu Sheng, Jiajun Deng, Xinran Zhang et al.

ICCV 2025arXiv:2505.23044

citations

#4493

TurboReg: TurboClique for Robust and Efficient Point Cloud Registration

Shaocheng Yan, Pengcheng Shi, Zhenjun Zhao et al.

ICCV 2025arXiv:2507.01439

citations

#4494

Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning

Yonghao Liu, Mengyu Li, Wei Pang et al.

AAAI 2025paperarXiv:2501.09214

citations

#4495

DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection

Jaewoo Song, Daemin Park, Kanghyun Baek et al.

CVPR 2025highlightarXiv:2503.13985

citations

#4496

``Principal Components" Enable A New Language of Images

Xin Wen, Bingchen Zhao, Ismail Elezi et al.

ICCV 2025

citations

#4497

A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation

Andrew Z Wang, Songwei Ge, Tero Karras et al.

CVPR 2025arXiv:2506.08210

citations

#4498

Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling

Yuejiang Liu, Jubayer Hamid, Annie Xie et al.

ICLR 2025oralarXiv:2408.17355

citations

#4499

SVIP: Semantically Contextualized Visual Patches for Zero-Shot Learning

Zhi Chen, Zecheng Zhao, Jingcai Guo et al.

ICCV 2025arXiv:2503.10252

citations

#4500

Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval

Guangyuan Ma, Yongliang Ma, Xing Wu et al.

AAAI 2025paperarXiv:2408.10613

citations

#4501

Robust Machine Unlearning for Quantized Neural Networks via Adaptive Gradient Reweighting with Similar Labels

Yujia Tong, Yuze Wang, Jingling Yuan et al.

ICCV 2025arXiv:2503.13917

citations

#4502

BF-STVSR: B-Splines and Fourier---Best Friends for High Fidelity Spatial-Temporal Video Super-Resolution

Eunjin Kim, HYEONJIN KIM, Kyong Hwan Jin et al.

CVPR 2025arXiv:2501.11043

citations

#4503

ReAL-AD: Towards Human-Like Reasoning in End-to-End Autonomous Driving

Yuhang Lu, Jiadong Tu, Yuexin Ma et al.

ICCV 2025arXiv:2507.12499

citations

#4504

Guiding Human-Object Interactions with Rich Geometry and Relations

Mengqing Xue, Yifei Liu, Ling Guo et al.

CVPR 2025arXiv:2503.20172

citations

#4505

Textured 3D Regenerative Morphing with 3D Diffusion Prior

Songlin Yang, Yushi LAN, Honghua Chen et al.

ICCV 2025arXiv:2502.14316

citations

#4506

Snakes and Ladders: Two Steps Up for VideoMamba

Hui Lu, Albert Ali Salah, Ronald Poppe

ICCV 2025arXiv:2406.19006

citations

#4507

1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Yuheng Yuan, Qiuhong Shen, Xingyi Yang et al.

NEURIPS 2025oralarXiv:2503.16422

citations

#4508

Dual-Process Image Generation

Grace Luo, Jonathan Granskog, Aleksander Holynski et al.

ICCV 2025arXiv:2506.01955

citations

#4509

IDInit: A Universal and Stable Initialization Method for Neural Network Training

Yu Pan, Chaozheng Wang, Zekai Wu et al.

ICLR 2025arXiv:2503.04626

citations

#4510

Test3R: Learning to Reconstruct 3D at Test Time

Yuheng Yuan, Qiuhong Shen, Shizun Wang et al.

NEURIPS 2025arXiv:2506.13750

citations

#4511

Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning

Xiaolei Wang, Xinyu Tang, Junyi Li et al.

ICLR 2025arXiv:2406.14022

citations

#4512

Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction

Yunheng Li, Yuxuan Li, Quan-Sheng Zeng et al.

ICCV 2025arXiv:2412.06244

citations

#4513

Zero-resource Hallucination Detection for Text Generation via Graph-based Contextual Knowledge Triples Modeling

Xinyue Fang, Zhen Huang, Zhiliang Tian et al.

AAAI 2025paperarXiv:2409.11283

citations

#4514

Emulating Self-attention with Convolution for Efficient Image Super-Resolution

Dongheon Lee, Seokju Yun, Youngmin Ro

ICCV 2025highlightarXiv:2503.06671

citations

#4515

CAMEx: Curvature-aware Merging of Experts

Dung Viet Nguyen, Minh Nguyen, Luc Nguyen et al.

ICLR 2025arXiv:2502.18821

citations

#4516

REDUCIO! Generating 1K Video within 16 Seconds using Extremely Compressed Motion Latents

Rui Tian, Qi Dai, Jianmin Bao et al.

ICCV 2025arXiv:2411.13552

citations

#4517

Advantage Alignment Algorithms

Juan Duque, Milad Aghajohari, Timotheus Cooijmans et al.

ICLR 2025arXiv:2406.14662

citations

#4518

Tight Clusters Make Specialized Experts

Stefan Nielsen, Rachel Teo, Laziz Abdullaev et al.

ICLR 2025arXiv:2502.15315

citations

#4519

CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching

Leying Zhang, Yao Qian, Xiaofei Wang et al.

NEURIPS 2025arXiv:2506.00885

citations

#4520

Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models

Jianqun Zhou, Yuanlei Zheng, Wei Chen et al.

ICLR 2025arXiv:2410.23841

citations

#4521

Open-Vocabulary Octree-Graph for 3D Scene Understanding

Zhigang Wang, Yifei Su, Chenhui Li et al.

ICCV 2025arXiv:2411.16253

citations

#4522

SimXRD-4M: Big Simulated X-ray Diffraction Data and Crystal Symmetry Classification Benchmark

Bin Cao, Yang Liu, Zinan Zheng et al.

ICLR 2025

citations

#4523

CompCap: Improving Multimodal Large Language Models with Composite Captions

Xiaohui Chen, Satya Narayan Shukla, Mahmoud Azab et al.

ICCV 2025arXiv:2412.05243

citations

#4524

Bringing RNNs Back to Efficient Open-Ended Video Understanding

Weili Xu, Enxin Song, Wenhao Chai et al.

ICCV 2025arXiv:2507.02591

citations

#4525

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

Jingli Lin, Chenming Zhu, Runsen Xu et al.

NEURIPS 2025oralarXiv:2507.07984

citations

#4526

Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection

Kedi Chen, Qin Chen, Jie Zhou et al.

AAAI 2025paperarXiv:2501.02020

citations

#4527

AlphaPre: Amplitude-Phase Disentanglement Model for Precipitation Nowcasting

Kenghong Lin, Baoquan Zhang, Demin Yu et al.

CVPR 2025

citations

#4528

Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation

Chaoyang Wang, Ashkan Mirzaei, Vidit Goel et al.

NEURIPS 2025oralarXiv:2506.18839

citations

#4529

PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model

Mingju Gao, Yike Pan, Huan-ang Gao et al.

CVPR 2025arXiv:2503.19913

citations

#4530

AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs

Sanjoy Chowdhury, Hanan Gani, Nishit Anand et al.

ICCV 2025arXiv:2503.23219

citations

#4531

Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection

Ruiyang Zhang, Hu Zhang, Zhedong Zheng

ICCV 2025arXiv:2408.00619

citations

#4532

Bridging Training and Execution via Dynamic Directed Graph-Based Communication in Cooperative Multi-Agent Systems

Zhuohui Zhang, Bin He, Bin Cheng et al.

AAAI 2025paperarXiv:2408.07397

citations

#4533

Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution

Siwei Tu, Ben Fei, Weidong Yang et al.

CVPR 2025highlightarXiv:2502.07814

citations

#4534

Federated Learning with Domain Shift Eraser

Zheng Wang, Zihui Wang, Zheng Wang et al.

CVPR 2025arXiv:2503.13063

citations

#4535

Probabilistic Stability Guarantees for Feature Attributions

Helen Jin, Anton Xue, Weiqiu You et al.

NEURIPS 2025arXiv:2504.13787

citations

#4536

REVECA: Adaptive Planning and Trajectory-Based Validation in Cooperative Language Agents Using Information Relevance and Relative Proximity

SeungWon Seo, SeongRae Noh, Junhyeok Lee et al.

AAAI 2025paperarXiv:2405.16751

citations

#4537

Stealthy Shield Defense: A Conditional Mutual Information-Based Approach against Black-Box Model Inversion Attacks

Tianqu Zhuang, Hongyao Yu, Yixiang Qiu et al.

ICLR 2025

citations

#4538

MIEB: Massive Image Embedding Benchmark

Chenghao Xiao, Isaac Chung, Imene Kerboua et al.

ICCV 2025arXiv:2504.10471

citations

#4539

BadRobot: Jailbreaking Embodied LLM Agents in the Physical World

Hangtao Zhang, Chenyu Zhu, Xianlong Wang et al.

ICLR 2025

citations

#4540

Multi-Agent Motion Planning for Differential Drive Robots Through Stationary State Search

Jingtian Yan, Jiaoyang Li

AAAI 2025paperarXiv:2412.13359

citations

#4541

D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation

Weinan Jia, Mengqi Huang, Nan Chen et al.

CVPR 2025

citations

#4542

On the Expressive Power of Sparse Geometric MPNNs

Yonatan Sverdlov, Nadav Dym

ICLR 2025arXiv:2407.02025

citations

#4543

Student-Informed Teacher Training

Nico Messikommer, Jiaxu Xing, Elie Aljalbout et al.

ICLR 2025arXiv:2412.09149

citations

#4544

Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions

He Zhu, Quyu Kong, Kechun Xu et al.

CVPR 2025arXiv:2504.04744

citations

#4545

Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference

Matt Riemer, Gopeshh Raaj Subbaraj, Glen Berseth et al.

ICLR 2025arXiv:2412.14355

citations

#4546

Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment

Haoyuan Wu, Haisheng Zheng, Yuan Pu et al.

ICLR 2025arXiv:2502.12732

citations

#4547

Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification

Hsun-Yu Kuo, Yin-Hsiang Liao, Yu-Chieh Chao et al.

ICLR 2025arXiv:2410.21526

citations

#4548

Event-based Tiny Object Detection: A Benchmark Dataset and Baselines

Nuo Chen, Chao Xiao, Yimian Dai et al.

ICCV 2025arXiv:2506.23575

citations

#4549

GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting

Xiaobao Wei, Peng Chen, Guangyu Li et al.

ICCV 2025highlightarXiv:2411.12981

citations

#4550

Zeroth-Order Fine-Tuning of LLMs in Random Subspaces

Ziming Yu, Pan Zhou, Sike Wang et al.

ICCV 2025arXiv:2410.08989

citations

#4551

Understanding the Limits of Deep Tabular Methods with Temporal Shift

Haorun Cai, Han-Jia Ye

ICML 2025oralarXiv:2502.20260

citations

#4552

LICORICE: Label-Efficient Concept-Based Interpretable Reinforcement Learning

Zhuorui Ye, Stephanie Milani, Geoff Gordon et al.

ICLR 2025arXiv:2407.15786

citations

#4553

Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws

Gerard Ben Arous, Murat Erdogdu, Nuri Mert Vural et al.

NEURIPS 2025arXiv:2508.03688

citations

#4554

MultiverSeg: Scalable Interactive Segmentation of Biomedical Imaging Datasets with In-Context Guidance

Hallee Wong, Jose Javier Gonzalez Ortiz, John Guttag et al.

ICCV 2025arXiv:2412.15058

citations

#4555

L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models

Xiaohao Liu, Xiaobo Xia, Weixiang Zhao et al.

NEURIPS 2025arXiv:2505.17505

citations

#4556

Learning from Neighbors: Category Extrapolation for Long-Tail Learning

Shizhen Zhao, Xin Wen, Jiahui Liu et al.

CVPR 2025arXiv:2410.15980

citations

#4557

On Generalization Across Environments In Multi-Objective Reinforcement Learning

Jayden Teoh, Pradeep Varakantham, Peter Vamplew

ICLR 2025arXiv:2503.00799

citations

#4558

Feature Clipping for Uncertainty Calibration

Linwei Tao, Minjing Dong, Chang Xu

AAAI 2025paperarXiv:2410.19796

citations

#4559

Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis

Hanbin Ko, Chang Min Park

CVPR 2025arXiv:2505.22079

citations

#4560

nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning

Tianqi Luo, Chuhan Huang, Leixian Shen et al.

NEURIPS 2025arXiv:2503.12880

citations

#4561

Audio-Visual Semantic Graph Network for Audio-Visual Event Localization

Liang Liu, Shuaiyong Li, Yongqiang Zhu

CVPR 2025

citations

#4562

E(3)-equivariant models cannot learn chirality: Field-based molecular generation

Alexandru Dumitrescu, Dani Korpela, Markus Heinonen et al.

ICLR 2025arXiv:2402.15864

citations

#4563

Many-Objective Multi-Solution Transport

Ziyue Li, Tian Li, Virginia Smith et al.

ICLR 2025arXiv:2403.04099

citations

#4564

Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine

Zhaohu Xing, Lihao Liu, Yijun Yang et al.

CVPR 2025

citations

#4565

BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments

Xinghao Wang, Pengyu Wang, Bo Wang et al.

ICLR 2025arXiv:2410.23918

citations

#4566

AniMo: Species-Aware Model for Text-Driven Animal Motion Generation

Xuan Wang, Kai Ruan, Xing Zhang et al.

CVPR 2025

citations

#4567

Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning

Chenglu Sun, Shuo Shen, Wenzhi Tao et al.

AAAI 2025paperarXiv:2501.01085

citations

#4568

Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation

Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu et al.

ICLR 2025arXiv:2407.03856

citations

#4569

SimVS: Simulating World Inconsistencies for Robust View Synthesis

Alex Trevithick, Roni Paiss, Philipp Henzler et al.

CVPR 2025arXiv:2412.07696

citations

#4570

Distilling Parallel Gradients for Fast ODE Solvers of Diffusion Models

Beier Zhu, Ruoyu Wang, Tong Zhao et al.

ICCV 2025arXiv:2507.14797

citations

#4571

Efficient Quadratic Corrections for Frank-Wolfe Algorithms

Jannis Halbey, Seta Rakotomandimby, Mathieu Besançon et al.

NEURIPS 2025arXiv:2506.02635

citations

#4572

From Probability to Counterfactuals: the Increasing Complexity of Satisfiability in Pearl's Causal Hierarchy

Julian Dörfler, Benito van der Zander, Markus Bläser et al.

ICLR 2025arXiv:2405.07373

citations

#4573

ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code

Tianyu Hua, Harper Hua, Violet Xiang et al.

NEURIPS 2025spotlightarXiv:2506.02314

citations

#4574

Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models

Haolang Lu, Yilian Liu, Jingxin Xu et al.

NEURIPS 2025arXiv:2505.13143

citations

#4575

DLF: Extreme Image Compression with Dual-generative Latent Fusion

Naifu Xue, Zhaoyang Jia, Jiahao Li et al.

ICCV 2025highlightarXiv:2503.01428

citations

#4576

Difficulty-aware Balancing Margin Loss for Long-tailed Recognition

Minseok Son, Inyong Koo, Jinyoung Park et al.

AAAI 2025paperarXiv:2412.15477

citations

#4577

Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting

Wei Lin, Chenyang ZHAO, Antoni B. Chan

CVPR 2025highlightarXiv:2505.21943

citations

#4578

A Simple Approach to Unifying Diffusion-based Conditional Generation

Xirui Li, Charles Herrmann, Kelvin Chan et al.

ICLR 2025arXiv:2410.11439

citations

#4579

Multimodal Variational Autoencoder: A Barycentric View

Peijie Qiu, Wenhui Zhu, Sayantan Kumar et al.

AAAI 2025paperarXiv:2412.20487

citations

#4580

Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling

Sirui Li, Wenbin Ouyang, Yining Ma et al.

ICLR 2025arXiv:2502.15791

citations

#4581

Rethinking Fair Representation Learning for Performance-Sensitive Tasks

Charles Jones, Fabio De Sousa Ribeiro, Mélanie Roschewitz et al.

ICLR 2025arXiv:2410.04120

citations

#4582

Learning to Communicate Through Implicit Communication Channels

Han Wang, Binbin Chen, zhang et al.

ICLR 2025arXiv:2411.01553

citations

#4583

Dense2MoE: Restructuring Diffusion Transformer to MoE for Efficient Text-to-Image Generation

Youwei Zheng, Yuxi Ren, Xin Xia et al.

ICCV 2025arXiv:2510.09094

citations

#4584

Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization

Yeji Song, Jimyeong Kim, Wonhark Park et al.

AAAI 2025paperarXiv:2403.14155

citations

#4585

Specifying What You Know or Not for Multi-Label Class-Incremental Learning

Aoting Zhang, Dongbao Yang, Chang Liu et al.

AAAI 2025paperarXiv:2503.17017

citations

#4586

Accelerating Training with Neuron Interaction and Nowcasting Networks

Boris Knyazev, Abhinav Moudgil, Guillaume Lajoie et al.

ICLR 2025arXiv:2409.04434

citations

#4587

Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation

Yujie Zhang, Bingyang Cui, Qi Yang et al.

ICCV 2025arXiv:2412.11170

citations

#4588

Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning

Xingjian Ran, Yixuan Li, Linning Xu et al.

NEURIPS 2025arXiv:2506.05341

citations

#4589

FedSPU: Personalized Federated Learning for Resource-Constrained Devices with Stochastic Parameter Update

Ziru Niu, Hai Dong, A. K. Qin

AAAI 2025paperarXiv:2403.11464

citations

#4590

Distillation Robustifies Unlearning

Bruce W, Lee, Addie Foote, Alex Infanger et al.

NEURIPS 2025spotlightarXiv:2506.06278

citations

#4591

FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model

Yukang Cao, Chenyang Si, Jinghao Wang et al.

ICCV 2025arXiv:2507.01953

citations

#4592

Contrastive Test-Time Composition of Multiple LoRA Models for Image Generation

Tuna Meral, Enis Simsar, Federico Tombari et al.

ICCV 2025highlightarXiv:2403.19776

citations

#4593

SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation

Pengfei Chen, Lingxi Xie, xinyue huo et al.

ICLR 2025arXiv:2407.16682

citations

#4594

GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation

Ruihai Wu, Ziyu Zhu, Yuran Wang et al.

CVPR 2025arXiv:2503.09243

citations

#4595

Scaling Physical Reasoning with the PHYSICS Dataset

Shenghe Zheng, Qianjia Cheng, Junchi Yao et al.

NEURIPS 2025arXiv:2506.00022

citations

#4596

Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers

Divyansh Srivastava, Xiang Zhang, He Wen et al.

ICCV 2025arXiv:2505.04718

citations

#4597

Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions

Ting-Hsuan Liao, Yi Zhou, Yu Shen et al.

CVPR 2025arXiv:2504.03639

citations

#4598

What Matters in Data for DPO?

Yu Pan, Zhongze Cai, Huaiyang Zhong et al.

NEURIPS 2025arXiv:2508.18312

citations

#4599

We Should Chart an Atlas of All the World's Models

Eliahu Horwitz, Nitzan Kurer, Jonathan Kahana et al.

NEURIPS 2025arXiv:2503.10633

citations

#4600

Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression

Hsiang-Wei Huang, Fu-Chen Chen, Wenhao Chai et al.

CVPR 2025

citations

← Previous

1...21 22 23 24 25...112