Most Cited 2025 Poster Papers
GSBA$^K$: $top$-$K$ Geometric Score-based Black-box Attack
Md Farhamdur Reza, Richeng Jin, Tianfu Wu et al.
DreamUHD: Frequency Enhanced Variational Autoencoder for Ultra-High-Definition Image Restoration
Yidi Liu, Dong Li, Jie Xiao et al.
Improving Neural Optimal Transport via Displacement Interpolation
Jaemoo Choi, Yongxin Chen, Jaewoong Choi
Neural Control and Certificate Repair via Runtime Monitoring
Emily Yu, Đorđe Žikelić, Thomas A. Henzinger
Cut out and Replay: A Simple yet Versatile Strategy for Multi-Label Online Continual Learning
Xinrui Wang, Shao-Yuan Li, Jiaqiang Zhang et al.
Medical Manifestation-Aware De-Identification
Yuan Tian, Shuo Wang, Guangtao Zhai
Neural Entropy
Akhil Premkumar
Learning Gaussian DAG Models without Condition Number Bounds
Constantinos Daskalakis, Vardis Kandiros, Rui Yao
ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models
Veeramakali Vignesh Manivannan, Yasaman Jafari, Srikar Eranky et al.
DP-MemArc: Differential Privacy Transfer Learning for Memory Efficient Language Models
Yanming Liu, Xinyue Peng, Yuwei Zhang et al.
Exact Algorithms and Lower Bounds for Forming Coalitions of Constrained Maximum Size
Foivos Fioravantes, Harmender Gahlawat, Nikolaos Melissinos
MaxCutPool: differentiable feature-aware Maxcut for pooling in graph neural networks
Carlo Abate, Filippo Maria Bianchi
PBECount: Prompt-Before-Extract Paradigm for Class-Agnostic Counting
Canchen Yang, Tianyu Geng, Jian Peng et al.
Fair Division with Social Impact
Michele Flammini, Gianluigi Greco, Giovanna Varricchio
Enhancing Close-up Novel View Synthesis via Pseudo-labeling
Jiatong Xia, Libo Sun, Lingqiao Liu
IMPACT: Iterative Mask-based Parallel Decoding for Text-to-Audio Generation with Diffusion Modeling
Kuan Po Huang, Shu-wen Yang, Huy Phan et al.
Transfer Learning of Real Image Features with Soft Contrastive Loss for Fake Image Detection
Ziyou Liang, Weifeng Liu, Run Wang et al.
Faster Approximation Algorithms for k-Center via Data Reduction
Arnold Filtser, Shaofeng Jiang, Yi Li et al.
Beyond Modality Collapse: Representation Blending for Multimodal Dataset Distillation
Xin Zhang, Ziruo Zhang, Jiawei Du et al.
Advancing Audio-Based Text Generation with Imbalance Preference Optimization
Zhenghao Zhou, Yongjie Liu, Chen Cao
WardropNet: Traffic Flow Predictions via Equilibrium-Augmented Learning
Kai Jungel, Dario Paccagnan, Axel Parmentier et al.
On the Power of Strategic Corpus Enrichment in Content Creation Games
Haya Nachimovsky, Moshe Tennenholtz
Residual Diffusion Deblurring Model for Single Image Defocus Deblurring
Haoxuan Feng, Haohui Zhou, Tian Ye et al.
SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments
Simon Dahan, Gabriel Bénédict, Logan Williams et al.
Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks
Théophane Vallaeys, Matthew J Muckley, Jakob Verbeek et al.
FIG: Flow with Interpolant Guidance for Linear Inverse Problems
Yici Yan, Yichi Zhang, Xiangming Meng et al.
PROSAC: Provably Safe Certification for Machine Learning Models under Adversarial Attacks
Chen Feng, Ziquan Liu, Zhuo Zhi et al.
Improved Regret Bounds for Online Fair Division with Bandit Learning
Benjamin Schiffer, Shirley Zhang
PaCA: Partial Connection Adaptation for Efficient Fine-Tuning
Sunghyeon Woo, Sol Namkung, SunWoo Lee et al.
Gradient-Guided Credit Assignment and Joint Optimization for Dependency-Aware Spatial Crowdsourcing
Yafei Li, Wei Chen, Jinxing Yan et al.
Collaborative Mean Estimation Among Heterogeneous Strategic Agents: Individual Rationality, Fairness, and Truthful Contribution
Alex Clinton, Yiding Chen, Jerry Zhu et al.
Learning Dynamic Similarity by Bidirectional Hierarchical Sliding Semantic Probe for Efficient Text Video Retrieval
Yang Liu, Shudong Huang, Deng Xiong et al.
Generalized Debiased Semi-Supervised Hashing for Large-Scale Image Retrieval
Xingbo Liu, Xuening Zhang, Xiushan Nie et al.
MEPNet: Medical Entity-Balanced Prompting Network for Brain CT Report Generation
Xiaodan Zhang, Yanzhao Shi, Junzhong Ji et al.
ALBAR: Adversarial Learning approach to mitigate Biases in Action Recognition
Joseph Fioresi, Ishan Rajendrakumar Dave, Mubarak Shah
Complexity Lower Bounds of Adaptive Gradient Algorithms for Non-convex Stochastic Optimization under Relaxed Smoothness
Michael Crawshaw, Mingrui Liu
Feature-Based Online Bilateral Trade
Solenne Gaucher, Martino Bernasconi, Matteo Castiglioni et al.
Correcting Large Language Model Behavior via Influence Function
Han Zhang, Zhuo Zhang, Yi Zhang et al.
From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question-Answering
Nathaniel Weir, Bhavana Dalvi Mishra, Orion Weller et al.
Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis
Lin Yuan, Jun Xu, Honghao Gui et al.
SSLFusion: Scale and Space Aligned Latent Fusion Model for Multimodal 3D Object Detection
Bonan Ding, Jin Xie, Jing Nie et al.
HG-Adapter: Improving Pre-Trained Heterogeneous Graph Neural Networks with Dual Adapters
Yujie Mo, Runpeng Yu, Xiaofeng Zhu et al.
Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation
Huan Ren, Wenfei Yang, Xiang Liu et al.
Rethinking the generalization of drug target affinity prediction algorithms via similarity aware evaluation
Chenbin Zhang, Zhiqiang Hu, Jiang Chuchu et al.
Interacted Object Grounding in Spatio-Temporal Human-Object Interactions
Xiaoyang Liu, Boran Wen, Xinpeng Liu et al.
Doubly Optimal Policy Evaluation for Reinforcement Learning
Shuze Liu, Claire Chen, Shangtong Zhang
Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning
Jingyang Li, Jiachun Pan, Vincent Tan et al.
AI-Generated Video Detection via Perceptual Straightening
Christian Internò, Robert Geirhos, Markus Olhofer et al.
Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
Devon Jarvis, Richard Klein, Benjamin Rosman et al.
Exact Computation of Any-Order Shapley Interactions for Graph Neural Networks
Maximilian Muschalik, Fabian Fumagalli, Paolo Frazzetto et al.
Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias
Yuanzhe Hu, Kinshuk Goel, Vlad Killiakov et al.
Defending Against Sophisticated Poisoning Attacks with RL-based Aggregation in Federated Learning
Yujing Wang, Hainan Zhang, Sijia Wen et al.
Test-time Adaptation on Graphs via Adaptive Subgraph-based Selection and Regularized Prototypes
Ming Zhang, Qixin Zhang, Xiao Luo et al.
Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images
Hongyu Yan, Yadong Mu
Verification Learning: Make Unsupervised Neuro-Symbolic System Feasible
Lin-Han Jia, Wen-Chao Hu, Jie-Jing Shao et al.
Polyrating: A Cost-Effective and Bias-Aware Rating System for LLM Evaluation
Jasper Dekoninck, Maximilian Baader, Martin Vechev
Agent Skill Acquisition for Large Language Models via CycleQD
So Kuroki, Taishi Nakamura, Takuya Akiba et al.
Design Considerations in Offline Preference-based RL
Alekh Agarwal, Christoph Dann, Teodor Vanislavov Marinov
Language Models Are Implicitly Continuous
Samuele Marro, Davide Evangelista, X. Huang et al.
Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction
M. Eren Akbiyik, Nedko Savov, Danda Pani Paudel et al.
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens
Jaehyeon Kim, Taehong Moon, Keon Lee et al.
FakeDiffer: Distributional Disparity Learning on Differentiated Reconstruction for Face Forgery Detection
Bo Wang, Zhao Zhang, Suiyi Zhao et al.
Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models
Jeffrey Gu, Serena Yeung
Bridging Sign and Spoken Languages: Pseudo Gloss Generation for Sign Language Translation
Jianyuan Guo, Peike Li, Trevor Cohn
3DGS-Drag: Dragging Gaussians for Intuitive Point-Based 3D Editing
Jiahua Dong, Yu-Xiong Wang
NegMerge: Sign-Consensual Weight Merging for Machine Unlearning
Hyo Seo Kim, Dongyoon Han, Junsuk Choe
Multi-Grained Query-Guided Set Prediction Network for Grounded Multimodal Named Entity Recognition
Jielong Tang, Zhenxing Wang, ZiYang Gong et al.
BANGS: Game-theoretic Node Selection for Graph Self-Training
Fangxin Wang, Kay Liu, Sourav Medya et al.
Learning Color Equivariant Representations
Yulong Yang, Felix O'Mahony, Christine Allen-Blanchette
CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators
Harry Zhang, Luca Carlone
DIDiffGes: Decoupled Semi-Implicit Diffusion Models for Real-time Gesture Generation from Speech
Yongkang Cheng, Shaoli Huang, Xuelin Chen et al.
Selective Unlearning via Representation Erasure Using Domain Adversarial Training
Nazanin Sepahvand, Eleni Triantafillou, Hugo Larochelle et al.
Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions
Hidde Fokkema, Tim van Erven, Sara Magliacane
Predictive Uncertainty Quantification for Bird's Eye View Segmentation: A Benchmark and Novel Loss Function
Linlin Yu, Bowen Yang, Tianhao Wang et al.
econSG: Efficient and Multi-view Consistent Open-Vocabulary 3D Semantic Gaussians
Can Zhang, Gim H Lee
Adversarial-Inspired Backdoor Defense via Bridging Backdoor and Adversarial Attacks
Jia-Li Yin, Weijian Wang, Lyhwa et al.
SceneDecorator: Towards Scene-Oriented Story Generation with Scene Planning and Scene Consistency
Quanjian Song, Donghao Zhou, Jingyu Lin et al.
Neural Phylogeny: Fine-Tuning Relationship Detection among Neural Networks
Runpeng Yu, Xinchao Wang
DICE: Data Influence Cascade in Decentralized Learning
Tongtian Zhu, Wenhao Li, Can Wang et al.
Exploring The Loss Landscape Of Regularized Neural Networks Via Convex Duality
Sungyoon Kim, Aaron Mishkin, Mert Pilanci
Measuring Diversity: Axioms and Challenges
Mikhail Mironov, Liudmila Prokhorenkova
Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection
Yubin Wang, Zhikang Zou, Xiaoqing Ye et al.
LoCA: Location-Aware Cosine Adaptation for Parameter-Efficient Fine-Tuning
Zhekai Du, Yinjie Min, Jingjing Li et al.
Unified Coding for Both Human Perception and Generalized Machine Analytics with CLIP Supervision
Kangsheng Yin, Quan Liu, Xuelin Shen et al.
FedLWS: Federated Learning with Adaptive Layer-wise Weight Shrinking
Changlong Shi, Jinmeng Li, He Zhao et al.
Isolated Causal Effects of Natural Language
Victoria Lin, Louis-Philippe Morency, Eli Ben-Michael
ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding
Indraneil Paul, Haoyi Yang, Goran Glavaš et al.
Approximating Metric Magnitude of Point Sets
Rayna Andreeva, James Ward, Primoz Skraba et al.
WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment
Jiefu Ou, Arda Uzunoğlu, Benjamin Van Durme et al.
Efficient Discovery of Pareto Front for Multi-Objective Reinforcement Learning
Ruohong Liu, Yuxin Pan, Linjie Xu et al.
TensorRL-QAS: Reinforcement learning with tensor networks for improved quantum architecture search
Akash Kundu, Stefano Mangini
Spatial Annealing for Efficient Few-shot Neural Rendering
Yuru Xiao, Deming Zhai, Wenbo Zhao et al.
Neural Combinatorial Clustered Bandits for Recommendation Systems
Baran Atalar, Carlee Joe-Wong
Enhancing Generalized Few-Shot Semantic Segmentation via Effective Knowledge Transfer
Xinyue Chen, Miaojing Shi, Zijian Zhou et al.
PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches
Rana Muhammad Shahroz Khan, Pingzhi Li, Sukwon Yun et al.
When Every Millisecond Counts: Real-Time Anomaly Detection via the Multimodal Asynchronous Hybrid Network
Dong Xiao, Guangyao Chen, Peixi Peng et al.
Mufu: Multilingual Fused Learning for Low-Resource Translation with LLM
Zheng Wei Lim, Nitish Gupta, Honglin Yu et al.
DECT: Harnessing LLM-assisted Fine-Grained Linguistic Knowledge and Label-Switched and Label-Preserved Data Generation for Diagnosis of Alzheimer’s Disease
Tingyu Mo, Jacqueline C. K. Lam, Victor O. K. Li et al.
Towards a Unified Framework of Clustering-based Anomaly Detection
Zeyu Fang, Ming Gu, Sheng Zhou et al.
AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction
Jingnan Gao, Zhuo Chen, Xiaokang Yang et al.
FIRM: Flexible Interactive Reflection ReMoval
Xiao Chen, Xudong Jiang, Yunkang Tao et al.
ReGen: Generative Robot Simulation via Inverse Design
Peter (Phat) Nguyen, Johnson (Tsun-Hsuan) Wang, Zhang-Wei Hong et al.
Holographic Node Representations: Pre-training Task-Agnostic Node Embeddings
Beatrice Bevilacqua, Joshua Robinson, Jure Leskovec et al.
Generalization Guarantees for Representation Learning via Data-Dependent Gaussian Mixture Priors
Milad Sefidgaran, Abdellatif Zaidi, Piotr Krasnowski
DreamAlign: Dynamic Text-to-3D Optimization with Human Preference Alignment
Gaofeng Liu, Zhiyuan Ma, Tao Fang
Bridging Jensen Gap for Max-Min Group Fairness Optimization in Recommendation
Chen Xu, Yuxin Li, Wenjie Wang et al.
Diffusion Transformers for Tabular Data Time Series Generation
Fabrizio Garuti, Enver Sangineto, Simone Luetto et al.
Wasserstein Policy Optimization
David Pfau, Ian Davies, Diana Borsa et al.
Demystifying Online Clustering of Bandits: Enhanced Exploration Under Stochastic and Smoothed Adversarial Contexts
Zhuohua Li, Maoli Liu, Xiangxiang Dai et al.
Semi-Supervised CLIP Adaptation by Enforcing Semantic and Trapezoidal Consistency
Kai Gan, Bo Ye, Min-Ling Zhang et al.
Accelerated Methods with Compressed Communications for Distributed Optimization Problems Under Data Similarity
Dmitry Bylinkin, Aleksandr Beznosikov
DialogDraw: Image Generation and Editing System Based on Multi-Turn Dialogue
Shichao Ma, Xinfeng Zhang, Zeng Zhao et al.
ML-GOOD: Towards Multi-Label Graph Out-Of-Distribution Detection
Tingyi Cai, Yunliang Jiang, Ming Li et al.
Weighted Point Set Embedding for Multimodal Contrastive Learning Toward Optimal Similarity Metric
Toshimitsu Uesaka, Taiji Suzuki, Yuhta Takida et al.
Re-Aligning Language to Visual Objects with an Agentic Workflow
Yuming Chen, Jiangyan Feng, Haodong Zhang et al.
Exact Certification of (Graph) Neural Networks Against Label Poisoning
Mahalakshmi Sabanayagam, Lukas Gosch, Stephan Günnemann et al.
Overcoming Multi-step Complexity in Multimodal Theory-of-Mind Reasoning: A Scalable Bayesian Planner
Chunhui Zhang, Zhongyu Ouyang, Kwonjoon Lee et al.
CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection
Qibo Chen, Weizhong Jin, Jianyue Ge et al.
Linear Bandits with Memory
Pierre Laforgue, Giulia Clerici, Nicolò Cesa-Bianchi
IPDN: Image-enhanced Prompt Decoding Network for 3D Referring Expression Segmentation
Qi Chen, Changli Wu, Jiayi Ji et al.
Rethinking Classifier Re-Training in Long-Tailed Recognition: Label Over-Smooth Can Balance
Siyu Sun, Han Lu, Jiangtong Li et al.
Graph World Model
Tao Feng, Yexin Wu, Guanyu Lin et al.
Relation Also Knows: Rethinking the Recall and Editing of Factual Associations in Auto-Regressive Transformer Language Models
Xiyu Liu, Zhengxiao Liu, Naibin Gu et al.
LiteReality: Graphic-Ready 3D Scene Reconstruction from RGB-D Scans
Zhening Huang, Xiaoyang Wu, Fangcheng Zhong et al.
VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding
Minchao Jiang, Shunyu Jia, Jiaming Gu et al.
Taming generative video models for zero-shot optical flow extraction
Seungwoo Kim, Khai Loong Aw, Klemen Kotar et al.
Gyro-based Neural Single Image Deblurring
Heemin Yang, Jaesung Rim, Seungyong Lee et al.
Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction
Huiwon Jang, Sihyun Yu, Jinwoo Shin et al.
Kernel Density Steering: Inference-Time Scaling via Mode Seeking for Image Restoration
Yuyang Hu, Kangfu Mei, Mojtaba Ardakani et al.
CodeCrash: Exposing LLM Fragility to Misleading Natural Language in Code Reasoning
Man Ho Lam, Chaozheng Wang, Jen-Tse Huang et al.
Boosting Adversarial Transferability via Residual Perturbation Attack
Jinjia Peng, Zeze Tao, Huibing Wang et al.
LOD-GS: Achieving Levels of Detail using Scalable Gaussian Soup
Jianxiong Shen, Yue Qian, Xiaohang Zhan
GroomLight: Hybrid Inverse Rendering for Relightable Human Hair Appearance Modeling
Yang Zheng, Menglei Chai, Delio Vicini et al.
Reasoning in Visual Navigation of End-to-end Trained Agents: A Dynamical Systems Approach
Steeven Janny, Hervé Poirier, Leonid Antsfeld et al.
Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels
Yongshuo Zong, Qin Zhang, Dongsheng An et al.
EDCFlow: Exploring Temporally Dense Difference Maps for Event-based Optical Flow Estimation
Daikun Liu, Lei Cheng, Teng Wang et al.
Unleashing the Potential of Consistency Learning for Detecting and Grounding Multi-Modal Media Manipulation
Yiheng Li, Yang Yang, Zichang Tan et al.
On the Robustness of Transformers against Context Hijacking for Linear Classification
Tianle Li, Chenyang Zhang, Xingwu Chen et al.
GPO: Learning from Critical Steps to Improve LLM Reasoning
Jiahao Yu, Zelei Cheng, Xian Wu et al.
Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity
Susav Shrestha, Bradley Settlemyer, Nikoli Dryden et al.
Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits
Fan Chen, Zeyu Jia, Alexander Rakhlin et al.
$\Psi$-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models
Taehoon Yoon, Yunhong Min, Kyeongmin Yeo et al.
LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding
Yuchen Ma, Dennis Frauen, Jonas Schweisthal et al.
Latent Mixture of Symmetries for Sample-Efficient Dynamic Learning
Haoran Li, Chenhan Xiao, Muhao Guo et al.
Parameterized Blur Kernel Prior Learning for Local Motion Deblurring
Zhenxuan Fang, Fangfang Wu, Tao Huang et al.
On the Coexistence and Ensembling of Watermarks
Aleksandar Petrov, Shruti Agarwal, Philip Torr et al.
RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers
Ahmet Berke Gökmen, Yiğit Ekin, Bahri Batuhan Bilecen et al.
HoGS: Unified Near and Far Object Reconstruction via Homogeneous Gaussian Splatting
Xinpeng Liu, Zeyi Huang, Fumio Okura et al.
Image Referenced Sketch Colorization Based on Animation Creation Workflow
Dingkun Yan, Xinrui Wang, Zhuoru Li et al.
TimeTracker: Event-based Continuous Point Tracking for Video Frame Interpolation with Non-linear Motion
Haoyue Liu, Jinghan Xu, Yi Chang et al.
Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning
Jiuyang Dong, Junjun Jiang, Kui Jiang et al.
Synthesize Privacy-Preserving High-Resolution Images via Private Textual Intermediaries
Haoxiang Wang, Zinan Lin, Da Yu et al.
HalLoc: Token-level Localization of Hallucinations for Vision Language Models
Eunkyu Park, Minyeong Kim, Gunhee Kim
UNICL-SAM: Uncertainty-Driven In-Context Segmentation with Part Prototype Discovery
Dianmo Sheng, Dongdong Chen, Zhentao Tan et al.
Neurosymbolic Diffusion Models
Emile van Krieken, Pasquale Minervini, Edoardo Maria Ponti et al.
Antidote: A Unified Framework for Mitigating LVLM Hallucinations in Counterfactual Presupposition and Object Perception
Yuanchen Wu, Lu Zhang, Hang Yao et al.
ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices
Hao Yu, Tangyu Jiang, Shuning Jia et al.
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
Zhiyuan Liang, Dongwen Tang, Yuhao Zhou et al.
Signs as Tokens: A Retrieval-Enhanced Multilingual Sign Language Generator
Ronglai Zuo, Rolandos Alexandros Potamias, Evangelos Ververas et al.
Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
Lehan He, Zeren Chen, Zhelun Shi et al.
Probably Approximately Precision and Recall Learning
Lee Cohen, Yishay Mansour, Shay Moran et al.
4D-Fly: Fast 4D Reconstruction from a Single Monocular Video
Diankun Wu, Fangfu Liu, Yi-Hsin Hung et al.
Free-Lunch Color-Texture Disentanglement for Stylized Image Generation
Jiang Qin, Alexandra Gomez-Villa, Senmao Li et al.
Scaling Down Text Encoders of Text-to-Image Diffusion Models
Lifu Wang, Daqing Liu, Xinchen Liu et al.
SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought
Guanghao Li, Wenhao Jiang, Mingfeng Chen et al.
Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation
Hao Li, Ju Dai, Xin Zhao et al.
Simpler Diffusion: 1.5 FID on ImageNet512 with Pixel-space Diffusion
Emiel Hoogeboom, Thomas Mensink, Jonathan Heek et al.
From Image to Video: An Empirical Study of Diffusion Representations
Pedro Vélez, Luisa Polania Cabrera, Yi Yang et al.
Your Scale Factors are My Weapon: Targeted Bit-Flip Attacks on Vision Transformers via Scale Factor Manipulation
Jialai Wang, Yuxiao Wu, Weiye Xu et al.
Seeing in the Dark: Benchmarking Egocentric 3D Vision with the Oxford Day-and-Night Dataset
Zirui Wang, Wenjing Bian, Xinghui Li et al.
LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models
Qianyue Hao, Yiwen Song, Qingmin Liao et al.
MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs
Ke Wang, Yiming Qin, Nikolaos Dimitriadis et al.
FFR: Frequency Feature Rectification for Weakly Supervised Semantic Segmentation
Ziqian Yang, Xinqiao Zhao, Xiaolei Wang et al.
Action Detail Matters: Refining Video Recognition with Local Action Queries
Mengmeng Wang, Zeyi Huang, Xiangjie Kong et al.
Nearly Zero-Cost Protection Against Mimicry by Personalized Diffusion Models
Namhyuk Ahn, KiYoon Yoo, Wonhyuk Ahn et al.
Silencer: From Discovery to Mitigation of Self-Bias in LLM-as-Benchmark-Generator
Peiwen Yuan, Yiwei Li, Shaoxiong Feng et al.
LARGO: Latent Adversarial Reflection through Gradient Optimization for Jailbreaking LLMs
Ran Li, Hao Wang, Chengzhi Mao
A Unified Framework for the Transportability of Population-Level Causal Measures
Ahmed Boughdiri, Clément Berenfeld, Julie Josse et al.
DFM: Differentiable Feature Matching for Anomaly Detection
Wu Sheng, Yimi Wang, Xudong Liu et al.
ExCap3D: Expressive 3D Scene Understanding via Object Captioning with Varying Detail
Chandan Yeshwanth, David Rozenberszki, Angela Dai
Unveiling the Compositional Ability Gap in Vision-Language Reasoning Model
Tianle Li, Jihai Zhang, Yongming Rao et al.
Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets
Marianna Nezhurina, Tomer Porian, Giovanni Puccetti et al.
Hierarchical Features Matter: A Deep Exploration of Progressive Parameterization Method for Dataset Distillation
Xinhao Zhong, Hao Fang, Bin Chen et al.
LogoSP: Local-global Grouping of Superpoints for Unsupervised Semantic Segmentation of 3D Point Clouds
Zihui Zhang, Weisheng Dai, Hongtao Wen et al.
PS-Diffusion: Photorealistic Subject-Driven Image Editing with Disentangled Control and Attention
Weicheng Wang, Guoli Jia, Zhongqi Zhang et al.
VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
Yichao Shen, Fangyun Wei, Zhiying Du et al.
MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?
Yunxiang Zhang, Muhammad Khalifa, Shitanshu Bhushan et al.
Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation
Gianni Franchi, Nacim Belkhir, Dat Nguyen et al.
ArcPro: Architectural Programs for Structured 3D Abstraction of Sparse Points
Qirui Huang, Runze Zhang, Kangjun Liu et al.
Learning long range dependencies through time reversal symmetry breaking
Guillaume Pourcel, Maxence Ernoult
RLZero: Direct Policy Inference from Language Without In-Domain Supervision
Harshit Sushil Sikchi, Siddhant Agarwal, Pranaya Jajoo et al.
Leveraging the Power of MLLMs for Gloss-Free Sign Language Translation
Jungeun Kim, Hyeongwoo Jeon, Jongseong Bae et al.
FeedEdit: Text-Based Image Editing with Dynamic Feedback Regulation
Fengyi Fu, Lei Zhang, Mengqi Huang et al.
VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption
Tianxiong Zhong, Xingye Tian, Boyuan Jiang et al.
On the Generalization of Handwritten Text Recognition Models
Carlos Garrido-Munoz, Jorge Calvo-Zaragoza
LightFair: Towards an Efficient Alternative for Fair T2I Diffusion via Debiasing Pre-trained Text Encoders
Boyu Han, Qianqian Xu, Shilong Bao et al.
StickMotion: Generating 3D Human Motions by Drawing a Stickman
Tao Wang, Zhihua Wu, Qiaozhi He et al.
STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models
Narun Raman, Taylor Lundy, Thiago Amin et al.
Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization
Yu Huang, Zixin Wen, Aarti Singh et al.
MVGBench: a Comprehensive Benchmark for Multi-view Generation Models
Xianghui Xie, Jan Lenssen, Gerard Pons-Moll