Most Cited 2025 "temporal frame prediction" Papers

22,274 papers found • Page 15 of 112

#2801

DreamDistribution: Learning Prompt Distribution for Diverse In-distribution Generation

Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu et al.

ICLR 2025 • poster • arXiv:2312.14216

#2802

Realistic Evaluation of Deep Partial-Label Learning Algorithms

Wei Wang, Dong-Dong Wu, Jindong Wang et al.

ICLR 2025 • poster • arXiv:2502.10184

#2803

AlphaZero Neural Scaling and Zipf's Law: a Tale of Board Games and Power Laws

Oren Neumann, Claudius Gros

NEURIPS 2025 • spotlight • arXiv:2412.11979

#2804

SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation

Leigang Qu, Haochuan Li, Wenjie Wang et al.

CVPR 2025 • poster • arXiv:2412.05818

#2805

X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing

Xinyan Chen, Jianfei Yang

ICLR 2025 • poster • arXiv:2410.10167

#2806

Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings

Hossein Mirzaei Sadeghlou, Mackenzie Mathis

ICLR 2025 • poster

#2807

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent

Bingrui Li, Wei Huang, Andi Han et al.

ICLR 2025 • poster • arXiv:2410.04870

#2808

Provable Convergence and Limitations of Geometric Tempering for Langevin Dynamics

Omar Chehab, Anna Korba, Austin Stromme et al.

ICLR 2025 • poster • arXiv:2410.09697

#2809

EventGPT: Event Stream Understanding with Multimodal Large Language Models

Shaoyu Liu, Jianing Li, Guanghui Zhao et al.

CVPR 2025 • poster • arXiv:2412.00832

#2810

SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement

Yuqi Lin, Hengjia Li, Wenqi Shao et al.

ICLR 2025 • poster • arXiv:2502.06756

#2811

(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning

Margaret Li, Sneha Kudugunta, Luke Zettlemoyer

ICLR 2025 • poster

#2812

PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution

Zhu Li Bo, Jianze Li, Haotong Qin et al.

CVPR 2025 • poster • arXiv:2411.17106

#2813

Edge Prompt Tuning for Graph Neural Networks

Xingbo Fu, Yinhan He, Jundong Li

ICLR 2025 • poster • arXiv:2503.00750

#2814

SMARTIES: Spectrum-Aware Multi-Sensor Auto-Encoder for Remote Sensing Images

Gencer Sumbul, Chang Xu, Emanuele Dalsasso et al.

ICCV 2025 • poster • arXiv:2506.19585

#2815

SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters

Jianping Jiang, Weiye Xiao, Zhengyu Lin et al.

CVPR 2025 • poster • arXiv:2412.00174

#2816

GLASS: Guided Latent Slot Diffusion for Object-Centric Learning

Krishnakant Singh, Simone Schaub-Meyer, Stefan Roth

CVPR 2025 • poster • arXiv:2407.17929

#2817

On Reasoning Strength Planning in Large Reasoning Models

Leheng Sheng, An Zhang, Zijian Wu et al.

NEURIPS 2025 • poster • arXiv:2506.08390

#2818

Fast Summation of Radial Kernels via QMC Slicing

Johannes Hertrich, Tim Jahn, Michael Quellmalz

ICLR 2025 • poster • arXiv:2410.01316

#2819

Distilling Monocular Foundation Model for Fine-grained Depth Completion

Yingping Liang, Yutao Hu, Wenqi Shao et al.

CVPR 2025 • poster • arXiv:2503.16970

#2820

MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining

Yunze Liu, Li Yi

CVPR 2025 • poster • arXiv:2410.00871

#2821

Monet: Mixture of Monosemantic Experts for Transformers

Jungwoo Park, Young Jin Ahn, Kee-Eung Kim et al.

ICLR 2025 • poster • arXiv:2412.04139

#2822

I Can Hear You: Selective Robust Training for Deepfake Audio Detection

Zirui Zhang, Wei Hao, Aroon Sankoh et al.

ICLR 2025 • poster • arXiv:2411.00121

#2823

CoA: Towards Real Image Dehazing via Compression-and-Adaptation

Long Ma, Yuxin Feng, Yan Zhang et al.

CVPR 2025 • poster • arXiv:2504.05590

#2824

Random-Set Neural Networks

Shireen Kudukkil Manchingal, Muhammad Mubashar, Kaizheng Wang et al.

ICLR 2025 • poster • arXiv:2307.05772

#2825

Visual-Instructed Degradation Diffusion for All-in-One Image Restoration

Haina Qin, Wenyang Luo, Zewen Chen et al.

CVPR 2025 • poster • arXiv:2506.16960

#2826

Accurate and Regret-Aware Numerical Problem Solver for Tabular Question Answering

Yuxiang Wang, Jianzhong Qi, Junhao Gan

AAAI 2025 • paper • arXiv:2410.12846

#2827

OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain

Wenzhen Yue, Yong Liu, Hao Wang et al.

NEURIPS 2025 • oral • arXiv:2505.08550

#2828

Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model

Benlin Liu, Yuhao Dong, Yiqin Wang et al.

CVPR 2025 • poster • arXiv:2408.00754

#2829

Do Large Language Models Truly Understand Geometric Structures?

Xiaofeng Wang, Yiming Wang, Wenhong Zhu et al.

ICLR 2025 • poster • arXiv:2501.13773

#2830

An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models

Wentao Qu, Jing Wang, Yongshun Gong et al.

CVPR 2025 • poster • arXiv:2411.16308

#2831

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel

Zun Wang, Jialu Li, Yicong Hong et al.

ICLR 2025 • poster • arXiv:2412.08467

#2832

DreamText: High Fidelity Scene Text Synthesis

Yibin Wang, Weizhong Zhang, Honghui Xu et al.

CVPR 2025 • poster • arXiv:2405.14701

#2833

Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration

Kang Liao, Zongsheng Yue, Zhouxia Wang et al.

ICLR 2025 • poster • arXiv:2406.18516

#2834

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Miran Heo, Min-Hung Chen, De-An Huang et al.

CVPR 2025 • poster • arXiv:2501.08326

#2835

Rethinking Query-based Transformer for Continual Image Segmentation

Yuchen Zhu, Cheng Shi, Dingyou Wang et al.

CVPR 2025 • poster • arXiv:2507.07831

#2836

GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement

Peiye Zhuang, Songfang Han, Chaoyang Wang et al.

ICLR 2025 • poster • arXiv:2406.05649

#2837

Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints

Mihaela Stoian, Eleonora Giunchiglia

ICLR 2025 • poster • arXiv:2502.18237

#2838

BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems

Andy Zhang, Joey Ji, Celeste Menders et al.

NEURIPS 2025 • poster • arXiv:2505.15216

#2839

Fourier Sliced-Wasserstein Embedding for Multisets and Measures

Tal Amir, Nadav Dym

ICLR 2025 • poster • arXiv:2405.16519

#2840

Radiology Report Generation via Multi-objective Preference Optimization

Ting Xiao, Lei Shi, Peng Liu et al.

AAAI 2025 • paper • arXiv:2412.08901

#2841

MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem

Fan LIU, Zherui Yang, Cancheng Liu et al.

NEURIPS 2025 • poster • arXiv:2505.14148

#2842

Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

Sachin Goyal, Christina Baek, Zico Kolter et al.

ICLR 2025 • poster

#2843

Shortcuts and Identifiability in Concept-based Models from a Neuro-Symbolic Lens

Samuele Bortolotti, Emanuele Marconato, Paolo Morettin et al.

NEURIPS 2025 • poster • arXiv:2502.11245

#2844

Attention as a Hypernetwork

Simon Schug, Seijin Kobayashi, Yassir Akram et al.

ICLR 2025 • poster • arXiv:2406.05816

#2845

ADBA: Approximation Decision Boundary Approach for Black-Box Adversarial Attacks

Feiyang Wang, Xingquan Zuo, Hai Huang et al.

AAAI 2025 • paper

#2846

A Watermark for Order-Agnostic Language Models

Ruibo Chen, Yihan Wu, Yanshuo Chen et al.

ICLR 2025 • poster • arXiv:2410.13805

#2847

SuperMat: Physically Consistent PBR Material Estimation at Interactive Rates

Yijia Hong, Yuan-Chen Guo, Ran Yi et al.

ICCV 2025 • poster • arXiv:2411.17515

#2848

FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation

Dong Zhao, Jinlong Li, Shuang Wang et al.

CVPR 2025 • poster • arXiv:2503.17940

#2849

Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks

Peng Xie, Yequan Bie, Jianda Mao et al.

CVPR 2025 • poster • arXiv:2411.15720

#2850

GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers

Shijie Ma, Yuying Ge, Teng Wang et al.

ICCV 2025 • poster • arXiv:2503.19480

#2851

Zero-Shot Styled Text Image Generation, but Make It Autoregressive

Vittorio Pippi, Fabio Quattrini, Silvia Cascianelli et al.

CVPR 2025 • poster • arXiv:2503.17074

#2852

SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments

Yue Cao, Yun Xing, Jie Zhang et al.

CVPR 2025 • poster • arXiv:2412.00114

#2853

RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation

Boyuan Cao, Jiaxin Ye, Yujie Wei et al.

NEURIPS 2025 • spotlight • arXiv:2410.06055

#2854

ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Models

Yeji Park, Deokyeong Lee, Junsuk Choe et al.

AAAI 2025 • paper • arXiv:2408.13906

#2855

Enhancing Time Series Forecasting through Selective Representation Spaces: A Patch Perspective

Xingjian Wu, Xiangfei Qiu, Hanyin Cheng et al.

NEURIPS 2025 • poster • arXiv:2510.14510

#2856

UnCommon Objects in 3D

Xingchen Liu, Piyush Tayal, Jianyuan Wang et al.

CVPR 2025 • poster • arXiv:2501.07574

#2857

Hierarchical Mixture of Experts: Generalizable Learning for High-Level Synthesis

Weikai Li, Ding Wang, Zijian Ding et al.

AAAI 2025 • paper • arXiv:2410.19225

#2858

JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

Yunlong Lin, Zixu Lin, Kunjie Lin et al.

NEURIPS 2025 • poster • arXiv:2506.17612

#2859

Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model

Dongki Kim, Wonbin Lee, Sung Ju Hwang

NEURIPS 2025 • poster • arXiv:2502.13449

#2860

From Poses to Identity: Training-Free Person Re-Identification via Feature Centralization

Chao Yuan, Guiwei Zhang, Changxiao Ma et al.

CVPR 2025 • poster • arXiv:2503.00938

#2861

NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields

Amandine Brunetto, Sascha Hornauer, Fabien Moutarde

ICLR 2025 • poster • arXiv:2405.18213

#2862

MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction

Gangjian Zhang, Nanjie Yao, Shunsi Zhang et al.

CVPR 2025 • poster • arXiv:2412.03103

#2863

Probability Density Geodesics in Image Diffusion Latent Space

Qingtao Yu, Jaskirat Singh, Zhaoyuan Yang et al.

CVPR 2025 • poster • arXiv:2504.06675

#2864

DriveEditor: A Unified 3D Information-Guided Framework for Controllable Object Editing in Driving Scenes

Yiyuan Liang, Zhiying Yan, Liqun Chen et al.

AAAI 2025 • paper • arXiv:2412.19458

#2865

Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval

Arun Reddy, Alexander Martin, Eugene Yang et al.

CVPR 2025 • poster • arXiv:2503.19009

#2866

EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge

Ruskin Raj Manku, Yuzhi Tang, Xingjian Shi et al.

NEURIPS 2025 • poster • arXiv:2505.23009

#2867

MUST: The First Dataset and Unified Framework for Multispectral UAV Single Object Tracking

Haolin Qin, Tingfa Xu, Tianhao Li et al.

CVPR 2025 • poster • arXiv:2503.17699

#2868

DIFFER: Disentangling Identity Features via Semantic Cues for Clothes-Changing Person Re-ID

Xin Liang, Yogesh S. Rawat

CVPR 2025 • poster • arXiv:2503.22912

#2869

PreciseCam: Precise Camera Control for Text-to-Image Generation

Edurne Bernal-Berdun, Ana Serrano, Belen Masia et al.

CVPR 2025 • poster • arXiv:2501.12910

#2870

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Weiming Ren, Huan Yang, Jie Min et al.

CVPR 2025 • poster • arXiv:2412.00927

#2871

Counterfactual Generative Modeling with Variational Causal Inference

Yulun Wu, Louis McConnell, Claudia Iriondo

ICLR 2025 • poster • arXiv:2410.12730

#2872

LoRA Subtraction for Drift-Resistant Space in Exemplar-Free Continual Learning

Xuan Liu, Xiaobin Chang

CVPR 2025 • poster • arXiv:2503.18985

#2873

ZoRI: Towards Discriminative Zero-Shot Remote Sensing Instance Segmentation

Shiqi Huang, Shuting He, Bihan Wen

AAAI 2025 • paper • arXiv:2412.12798

#2874

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Tianyu Fu, Yi Ge, Yichen You et al.

NEURIPS 2025 • poster • arXiv:2505.21600

#2875

LiDAR-RT: Gaussian-based Ray Tracing for Dynamic LiDAR Re-simulation

Chenxu Zhou, Lvchang Fu, Sida Peng et al.

CVPR 2025 • poster • arXiv:2412.15199

#2876

GUARDIAN: Safeguarding LLM Multi-Agent Collaborations with Temporal Graph Modeling

Jialong Zhou, Lichao Wang, Xiao Yang

NEURIPS 2025 • oral • arXiv:2505.19234

#2877

GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization

Yirui Chen, Xudong Huang, Quan Zhang et al.

AAAI 2025 • paper • arXiv:2406.16531

#2878

DepthCues: Evaluating Monocular Depth Perception in Large Vision Models

Duolikun Danier, Mehmet Aygun, Changjian Li et al.

CVPR 2025 • poster • arXiv:2411.17385

#2879

PMQ-VE: Progressive Multi-Frame Quantization for Video Enhancement

ZhanFeng Feng, Long Peng, Xin Di et al.

NEURIPS 2025 • oral • arXiv:2505.12266

#2880

Markov Persuasion Processes: Learning to Persuade From Scratch

Francesco Bacchiocchi, Francesco Emanuele Stradi, Matteo Castiglioni et al.

NEURIPS 2025 • poster • arXiv:2402.03077

#2881

Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy

You Li, Fan Ma, Yi Yang

CVPR 2025 • poster • arXiv:2411.16752

#2882

Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models

Dilxat Muhtar, Enzhuo Zhang, Zhenshi Li et al.

NEURIPS 2025 • poster • arXiv:2503.00743

#2883

MP-SfM: Monocular Surface Priors for Robust Structure-from-Motion

Zador Pataki, Paul-Edouard Sarlin, Johannes Schönberger et al.

CVPR 2025 • poster • arXiv:2504.20040

#2884

GIFT: Unlocking Full Potential of Labels in Distilled Dataset at Near-zero Cost

Xinyi Shang, Peng Sun, Tao Lin

ICLR 2025 • poster • arXiv:2405.14736

#2885

Bayesian Test-Time Adaptation for Vision-Language Models

Lihua Zhou, Mao Ye, Shuaifeng Li et al.

CVPR 2025 • poster • arXiv:2503.09248

#2886

HyperGS: Hyperspectral 3D Gaussian Splatting

Christopher Thirgood, Oscar Mendez, Erin Chao Ling et al.

CVPR 2025 • poster • arXiv:2412.12849

#2887

Empowering Users in Digital Privacy Management through Interactive LLM-Based Agents

Bolun Sun, Yifan Zhou, Haiyun Jiang

ICLR 2025 • poster • arXiv:2410.11906

#2888

Theoretically Grounded Framework for LLM Watermarking: A Distribution-Adaptive Approach

Haiyun He, Yepeng Liu, Ziqiao Wang et al.

NEURIPS 2025 • poster • arXiv:2410.02890

#2889

Universal Video Temporal Grounding with Generative Multi-modal Large Language Models

Zeqian Li, Shangzhe Di, Zhonghua Zhai et al.

NEURIPS 2025 • oral • arXiv:2506.18883

#2890

Aux-Think: Exploring Reasoning Strategies for Data-Efficient Vision-Language Navigation

Shuo Wang, Yongcai Wang, Wanting Li et al.

NEURIPS 2025 • poster • arXiv:2505.11886

#2891

ScribbleLight: Single Image Indoor Relighting with Scribbles

Jun Myeong Choi, Annie N. Wang, Pieter Peers et al.

CVPR 2025 • poster • arXiv:2411.17696

#2892

KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation

Antoni Bigata Casademunt, Michał Stypułkowski, Rodrigo Mira et al.

CVPR 2025 • poster • arXiv:2503.01715

#2893

Constrained Fair and Efficient Allocations

Benjamin Cookson, Soroush Ebadian, Nisarg Shah

AAAI 2025 • paper • arXiv:2411.00133

#2894

ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler

Serin Yang, Taesung Kwon, Jong Chul YE

ICLR 2025 • oral • arXiv:2410.05651

#2895

HiMoR: Monocular Deformable Gaussian Reconstruction with Hierarchical Motion Representation

Yiming Liang, Tianhan Xu, Yuta Kikuchi

CVPR 2025 • poster • arXiv:2504.06210

#2896

TraF-Align: Trajectory-aware Feature Alignment for Asynchronous Multi-agent Perception

Zhiying Song, Lei Yang, Fuxi Wen et al.

CVPR 2025 • poster • arXiv:2503.19391

#2897

Strategy Coopetition Explains the Emergence and Transience of In-Context Learning

Aaditya Singh, Ted Moskovitz, Sara Dragutinović et al.

ICML 2025 • oral • arXiv:2503.05631

#2898

Deep Nonlinear Sufficient Dimension Reduction

Yinfeng Chen, Yuling Jiao, Rui Qiu et al.

NEURIPS 2025 • poster

#2899

AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks

Fali Wang, Hui Liu, Zhenwei Dai et al.

NEURIPS 2025 • poster • arXiv:2508.00890

#2900

Update Your Transformer to the Latest Release: Re-Basin of Task Vectors

Filippo Rinaldi, Giacomo Capitani, Lorenzo Bonicelli et al.

ICML 2025 • poster • arXiv:2505.22697

#2901

$\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

Yaxin Luo, Gen Luo, Jiayi Ji et al.

ICLR 2025 • poster

#2902

Controllable Generation via Locally Constrained Resampling

Kareem Ahmed, Kai-Wei Chang, Guy Van den Broeck

ICLR 2025 • poster • arXiv:2410.13111

#2903

On Temperature Scaling and Conformal Prediction of Deep Classifiers

Lahav Dabah, Tom Tirer

ICML 2025 • poster • arXiv:2402.05806

#2904

WeatherGFM: Learning a Weather Generalist Foundation Model via In-context Learning

Xiangyu Zhao, Zhiwang Zhou, Wenlong Zhang et al.

ICLR 2025 • oral • arXiv:2411.05420

#2905

RSafe: Incentivizing proactive reasoning to build robust and adaptive LLM safeguards

Jingnan Zheng, Xiangtian Ji, Yijun Lu et al.

NEURIPS 2025 • poster • arXiv:2506.07736

#2906

Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics

Dongyoung Kim, Huiwon Jang, Sumin Park et al.

NEURIPS 2025 • poster • arXiv:2506.00070

#2907

PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes

Bin Tan, Rui Yu, Yujun Shen et al.

CVPR 2025 • highlight • arXiv:2412.03451

#2908

Circumventing Shortcuts in Audio-visual Deepfake Detection Datasets with Unsupervised Learning

Stefan Smeu, Dragos-Alexandru Boldisor, Dan Oneata et al.

CVPR 2025 • highlight • arXiv:2412.00175

#2909

MagCache: Fast Video Generation with Magnitude-Aware Cache

Zehong Ma, Longhui Wei, Feng Wang et al.

NEURIPS 2025 • poster • arXiv:2506.09045

#2910

ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models

Liyan Tang, Grace Kim, Xinyu Zhao et al.

NEURIPS 2025 • poster • arXiv:2505.13444

#2911

Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation

Zhao Song, Mingquan Ye, Junze Yin et al.

ICLR 2025 • poster • arXiv:2306.04169

#2912

BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation

Yulu Pan, Ce Zhang, Gedas Bertasius

CVPR 2025 • poster • arXiv:2503.20781

#2913

DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution

Zheng Chen, Zichen Zou, Kewei Zhang et al.

NEURIPS 2025 • poster • arXiv:2505.16239

#2914

Online Video Understanding: OVBench and VideoChat-Online

Zhenpeng Huang, Xinhao Li, Jiaqi Li et al.

CVPR 2025 • poster • arXiv:2501.00584

#2915

Solving Video Inverse Problems Using Image Diffusion Models

Taesung Kwon, Jong Chul YE

ICLR 2025 • oral • arXiv:2409.02574

#2916

LITA-GS: Illumination-Agnostic Novel View Synthesis via Reference-Free 3D Gaussian Splatting and Physical Priors

Han Zhou, Wei Dong, Jun Chen

CVPR 2025 • poster • arXiv:2504.00219

#2917

Language Guided Concept Bottleneck Models for Interpretable Continual Learning

Lu Yu, HaoYu Han, Zhe Tao et al.

CVPR 2025 • poster • arXiv:2503.23283

#2918

All-in-One: Transferring Vision Foundation Models into Stereo Matching

Jingyi Zhou, Haoyu Zhang, Jiakang Yuan et al.

AAAI 2025 • paper • arXiv:2412.09912

#2919

Scaling Laws for Differentially Private Language Models

Ryan McKenna, Yangsibo Huang, Amer Sinha et al.

ICML 2025 • poster • arXiv:2501.18914

#2920

LesionLocator: Zero-Shot Universal Tumor Segmentation and Tracking in 3D Whole-Body Imaging

Maximilian Rokuss, Yannick Kirchhoff, Seval Akbal et al.

CVPR 2025 • poster • arXiv:2502.20985

#2921

QuaDiM: A Conditional Diffusion Model For Quantum State Property Estimation

Yehui Tang, Mabiao Long, Junchi Yan

ICLR 2025 • poster

#2922

Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values

Hadi Hosseini, Samarth Khanna

NEURIPS 2025 • poster • arXiv:2502.00313

#2923

ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos

Tanveer Hannan, Md Mohaiminul Islam, Jindong Gu et al.

CVPR 2025 • poster • arXiv:2411.14901

#2924

MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments

MATTHIEU CORD, Antonin Vobecky, Oriane Siméoni et al.

ICLR 2025 • poster • arXiv:2307.09361

#2925

SuperPC: A Single Diffusion Model for Point Cloud Completion, Upsampling, Denoising, and Colorization

Yi Du, Zhipeng Zhao, Shaoshu Su et al.

CVPR 2025 • poster • arXiv:2503.14558

#2926

X-Dancer: Expressive Music to Human Dance Video Generation

Zeyuan Chen, Hongyi Xu, Guoxian Song et al.

ICCV 2025 • highlight • arXiv:2502.17414

#2927

HOPE for a Robust Parameterization of Long-memory State Space Models

Annan Yu, Michael W Mahoney, N. Benjamin Erichson

ICLR 2025 • poster • arXiv:2405.13975

#2928

Synthesizing Privacy-Preserving Text Data via Finetuning without Finetuning Billion-Scale LLMs

Bowen Tan, Zheng Xu, Eric Xing et al.

ICML 2025 • poster • arXiv:2503.12347

#2929

REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments

Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman et al.

ICLR 2025 • poster • arXiv:2412.04759

#2930

HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction

Jikai Wang, Qifan Zhang, Yu-Wei Chao et al.

NEURIPS 2025 • poster • arXiv:2406.06843

#2931

Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation

Gao Peng, Le Zhuo, Dongyang Liu et al.

ICLR 2025 • oral

#2932

DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image

Qingxuan Wu, Zhiyang Dou, Sirui Xu et al.

ICLR 2025 • poster • arXiv:2406.17988

#2933

Can Classic GNNs Be Strong Baselines for Graph-level Tasks? Simple Architectures Meet Excellence

Yuankai Luo, Lei Shi, Xiao-Ming Wu

ICML 2025 • poster • arXiv:2502.09263

#2934

Rapidly Adapting Policies to the Real-World via Simulation-Guided Fine-Tuning

Patrick Yin, Tyler Westenbroek, Ching-An Cheng et al.

ICLR 2025 • poster • arXiv:2502.02705

#2935

Knowledge-Aligned Counterfactual-Enhancement Diffusion Perception for Unsupervised Cross-Domain Visual Emotion Recognition

Wen Yin, Yong Wang, Guiduo Duan et al.

CVPR 2025 • poster • arXiv:2505.19694

#2936

Dual Prompting Image Restoration with Diffusion Transformers

Dehong Kong, Fan Li, Zhixin Wang et al.

CVPR 2025 • poster • arXiv:2504.17825

#2937

MegActor-Sigma: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer

Shurong Yang, Huadong Li, Juhao Wu et al.

AAAI 2025 • paper

#2938

When Do LLMs Help With Node Classification? A Comprehensive Analysis

Xixi Wu, Yifei Shen, Fangzhou Ge et al.

ICML 2025 • poster • arXiv:2502.00829

#2939

Solving Inequality Proofs with Large Language Models

Jiayi Sheng, Luna Lyu, Jikai Jin et al.

NEURIPS 2025 • spotlight • arXiv:2506.07927

#2940

Attention-Driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models Without Fine-Tuning

Hai-Ming Xu, Qi Chen, Lei Wang et al.

AAAI 2025 • paper • arXiv:2412.10840

#2941

Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization

Luca Masserano, Abdul Fatir Ansari, Boran Han et al.

ICML 2025 • oral • arXiv:2412.05244

#2942

Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control

Hejia Chen, Haoxian Zhang, Shoulong Zhang et al.

ICLR 2025 • oral • arXiv:2503.14517

#2943

ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation

Mengyang Wu, Yuzhi Zhao, Jialun Cao et al.

AAAI 2025 • paper • arXiv:2412.18216

#2944

Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?

Simon Park, Abhishek Panigrahi, Yun Cheng et al.

ICML 2025 • poster • arXiv:2501.02669

#2945

Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization

Wei Liu, Zhiying Deng, Zhongyu Niu et al.

ICLR 2025 • poster • arXiv:2503.06202

#2946

Decomposition Polyhedra of Piecewise Linear Functions

Marie-Charlotte Brandenburg, Moritz Grillo, Christoph Hertrich

ICLR 2025 • poster • arXiv:2410.04907

#2947

Physical Plausibility-aware Trajectory Prediction via Locomotion Embodiment

Hiromu Taketsugu, Takeru Oba, Takahiro Maeda et al.

CVPR 2025 • poster • arXiv:2503.17267

#2948

PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations

Benjamin Holzschuh, Qiang Liu, Georg Kohl et al.

ICML 2025 • oral • arXiv:2505.24717

#2949

HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud Encoder

Qi Yang, Le Yang, Geert Van der Auwera et al.

ICML 2025 • poster • arXiv:2505.01938

#2950

Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation

Shengqi Liu, Yuhao Cheng, Zhuo Chen et al.

ICCV 2025 • poster • arXiv:2412.14453

#2951

Diffusion Tree Sampling: Scalable inference‑time alignment of diffusion models

Vineet Jain, Kusha Sareen, Mohammad Pedramfar et al.

NEURIPS 2025 • poster • arXiv:2506.20701

#2952

Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks

Shikai Qiu, Lechao Xiao, Andrew Wilson et al.

ICML 2025 • oral • arXiv:2507.02119

#2953

Objective drives the consistency of representational similarity across datasets

Laure Ciernik, Lorenz Linhardt, Marco Morik et al.

ICML 2025 • poster • arXiv:2411.05561

#2954

ROADWork: A Dataset and Benchmark for Learning to Recognize, Observe, Analyze and Drive Through Work Zones

Anurag Ghosh, Shen Zheng, Robert Tamburo et al.

ICCV 2025 • poster • arXiv:2406.07661

#2955

Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Yifan Zhang, Ge Zhang, Yue Wu et al.

ICML 2025 • poster • arXiv:2410.02197

#2956

SILO: Solving Inverse Problems with Latent Operators

Ron Raphaeli, Sean Man, Michael Elad

ICCV 2025 • poster • arXiv:2501.11746

#2957

StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition

Xin Ding, Hao Wu, Yifan Yang et al.

ICCV 2025 • poster • arXiv:2503.06220

#2958

DM-Adapter: Domain-Aware Mixture-of-Adapters for Text-Based Person Retrieval

Yating Liu, Zimo Liu, Xiangyuan Lan et al.

AAAI 2025 • paper • arXiv:2503.04144

#2959

Diffusion Transformers as Open-World Spatiotemporal Foundation Models

Yuan Yuan, Chonghua Han, Jingtao Ding et al.

NEURIPS 2025 • oral • arXiv:2411.12164

#2960

OS-ATLAS: Foundation Action Model for Generalist GUI Agents

Zhiyong Wu, Zhenyu Wu, Fangzhi Xu et al.

ICLR 2025 • poster

#2961

Perspective-Invariant 3D Object Detection

Alan Liang, Lingdong Kong, Dongyue Lu et al.

ICCV 2025 • poster • arXiv:2507.17665

#2962

MP-GUI: Modality Perception with MLLMs for GUI Understanding

Ziwei Wang, Weizhi Chen, Leyang Yang et al.

CVPR 2025 • poster • arXiv:2503.14021

#2963

UAVScenes: A Multi-Modal Dataset for UAVs

Sijie Wang, Siqi Li, Yawei Zhang et al.

ICCV 2025 • poster • arXiv:2507.22412

#2964

How Transformers Learn Structured Data: Insights From Hierarchical Filtering

Jerome Garnier-Brun, Marc Mezard, Emanuele Moscato et al.

ICML 2025 • poster • arXiv:2408.15138

#2965

Make Me Happier: Evoking Emotions Through Image Diffusion Models

Qing Lin, Jingfeng Zhang, YEW-SOON ONG et al.

ICCV 2025 • poster • arXiv:2403.08255

#2966

SurFhead: Affine Rig Blending for Geometrically Accurate 2D Gaussian Surfel Head Avatars

Jaeseong Lee, Taewoong Kang, Marcel Buehler et al.

ICLR 2025 • poster • arXiv:2410.11682

#2967

Can Transformers Do Enumerative Geometry?

Baran Hashemi, Roderic Corominas, Alessandro Giacchetto

ICLR 2025 • poster • arXiv:2408.14915

#2968

Diffusion models for Gaussian distributions: Exact solutions and Wasserstein errors

Emile Pierret, Bruno Galerne

ICML 2025 • poster • arXiv:2405.14250

#2969

BrainUICL: An Unsupervised Individual Continual Learning Framework for EEG Applications

Yangxuan Zhou, Sha Zhao, Jiquan Wang et al.

ICLR 2025 • poster

#2970

Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution

Du Chen, Liyi Chen, Zhengqiang ZHANG et al.

ICCV 2025 • poster • arXiv:2501.06838

#2971

MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections

Da Xiao, Qingye Meng, Shengping Li et al.

ICML 2025 • poster • arXiv:2502.12170

#2972

Discrete Codebook World Models for Continuous Control

Aidan Scannell, Mohammadreza Nakhaeinezhadfard, Kalle Kujanpää et al.

ICLR 2025 • poster • arXiv:2503.00653

#2973

MINERVA: Evaluating Complex Video Reasoning

Arsha Nagrani, Sachit Menon, Ahmet Iscen et al.

ICCV 2025 • poster • arXiv:2505.00681

#2974

DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation

Xiankang He, Guangkai Xu, Bo Zhang et al.

AAAI 2025 • paper • arXiv:2405.15619

#2975

Towards Federated RLHF with Aggregated Client Preference for LLMs

Feijie Wu, Xiaoze Liu, Haoyu Wang et al.

ICLR 2025 • poster • arXiv:2407.03038

#2976

MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning

Ylli Sadikaj, Hongkuan Zhou, Lavdim Halilaj et al.

ICCV 2025 • poster • arXiv:2504.06740

#2977

ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation

Angxiao Yue, Zichong Wang, Hongteng Xu

ICML 2025 • poster • arXiv:2502.14637

#2978

MIB: A Mechanistic Interpretability Benchmark

Aaron Mueller, Atticus Geiger, Sarah Wiegreffe et al.

ICML 2025 • poster • arXiv:2504.13151

#2979

Learning to Adapt Frozen CLIP for Few-Shot Test-Time Domain Adaptation

Zhixiang Chi, Li Gu, Huan Liu et al.

ICLR 2025 • poster • arXiv:2506.17307

#2980

Graph Generative Pre-trained Transformer

Xiaohui Chen, Yinkai Wang, JIAXING HE et al.

ICML 2025 • poster • arXiv:2501.01073

#2981

Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering

Yibo Zhang, Lihong Wang, Changqing Zou et al.

ICLR 2025 • poster • arXiv:2405.15305

#2982

Advancing Expert Specialization for Better MoE

Hongcan Guo, Haolang Lu, Guoshun Nan et al.

NEURIPS 2025 • oral • arXiv:2505.22323

#2983

Multi-Pair Temporal Sentence Grounding via Multi-Thread Knowledge Transfer Network

Xiang Fang, Wanlong Fang, Changshuo Wang et al.

AAAI 2025 • paper • arXiv:2412.15678

#2984

Jailbreaking as a Reward Misspecification Problem

Zhihui Xie, Jiahui Gao, Lei Li et al.

ICLR 2025 • poster • arXiv:2406.14393

#2985

LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models

Hantao Zhang, Yuhe Liu, Jiancheng Yang et al.

ICLR 2025 • poster • arXiv:2403.14066

#2986

Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies

Nadav Timor, Jonathan Mamou, Daniel Korat et al.

ICML 2025 • oral • arXiv:2502.05202

#2987

Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition

Zhong Zheng, Haochen Zhang, Lingzhou Xue

ICLR 2025 • poster • arXiv:2410.07574

#2988

Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse

Arthur Jacot, Peter Súkeník, Zihan Wang et al.

ICLR 2025 • poster • arXiv:2410.04887

#2989

PENCIL: Long Thoughts with Short Memory

Chenxiao Yang, Nati Srebro, David McAllester et al.

ICML 2025 • poster • arXiv:2503.14337

#2990

Repo2Run: Automated Building Executable Environment for Code Repository at Scale

Ruida Hu, Chao Peng, Xinchen Wang et al.

NEURIPS 2025 • spotlight • arXiv:2502.13681

#2991

Reasoning as an Adaptive Defense for Safety

Taeyoun Kim, Fahim Tajwar, Aditi Raghunathan et al.

NEURIPS 2025 • poster • arXiv:2507.00971

#2992

HELM: Hierarchical Encoding for mRNA Language Modeling

Mehdi Yazdani-Jahromi, Mangal Prakash, Tommaso Mansi et al.

ICLR 2025 • poster • arXiv:2410.12459

#2993

BodyGen: Advancing Towards Efficient Embodiment Co-Design

Haofei Lu, Zhe Wu, Junliang Xing et al.

ICLR 2025 • oral • arXiv:2503.00533

#2994

InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Yunhong Lu, Qichao Wang, Hengyuan Cao et al.

CVPR 2025 • highlight • arXiv:2503.18454

#2995

EAP-GP: Mitigating Saturation Effect in Gradient-based Automated Circuit Identification

Lin Zhang, Wenshuo Dong, Zhuoran Zhang et al.

NEURIPS 2025 • poster • arXiv:2502.06852

#2996

Rethinking Evaluation of Sparse Autoencoders through the Representation of Polysemous Words

Gouki Gouki, Hiroki Furuta, Yusuke Iwasawa et al.

ICLR 2025 • poster • arXiv:2501.06254

#2997

CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward

Yandong Guan, Xilin Wang, XiMing Xing et al.

NEURIPS 2025 • poster • arXiv:2505.19713

#2998

On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures

Wei Shen, Ruida Zhou, Jing Yang et al.

ICML 2025 • poster • arXiv:2410.11778

#2999

Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference

Qining Zhang, Lei Ying

ICLR 2025 • poster • arXiv:2409.17401

#3000

The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?

Denis Sutter, Julian Minder, Thomas Hofmann et al.

NEURIPS 2025 • spotlight • arXiv:2507.08802
