Energy-Based Models
EBMs and contrastive divergence training
Top Papers
VBench: Comprehensive Benchmark Suite for Video Generative Models
Ziqi Huang, Yinan He, Jiashuo Yu et al.
WorldSimBench: Towards Video Generation Models as World Simulators
Yiran Qin, Zhelun Shi, Jiwen Yu et al.
From Crowdsourced Data to High-quality Benchmarks: Arena-Hard and Benchbuilder Pipeline
Tianle Li, Wei-Lin Chiang, Evan Frick et al.
Think before you speak: Training Language Models With Pause Tokens
Sachin Goyal, Ziwei Ji, Ankit Singh Rawat et al.
StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On
Jeongho Kim, Gyojung Gu, Minho Park et al.
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu et al.
ToolACE: Winning the Points of LLM Function Calling
Weiwen Liu, Xu Huang, Xingshan Zeng et al.
Unpaired Image-to-Image Translation via Neural Schrödinger Bridge
Beomsu Kim, Gihyun Kwon, Kwanyoung Kim et al.
AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders
Zhengxuan Wu, Aryaman Arora, Atticus Geiger et al.
An Empirical Study of CLIP for Text-Based Person Search
Cao Min, Yang Bai, ziyin Zeng et al.
DEIM: DETR with Improved Matching for Fast Convergence
Shihua Huang, Zhichao Lu, Xiaodong Cun et al.
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng, Yuxin Cui, Haomiao Tang et al.
Towards Open-ended Visual Quality Comparison
Haoning Wu, Hanwei Zhu, Zicheng Zhang et al.
When Attention Sink Emerges in Language Models: An Empirical View
Xiangming Gu, Tianyu Pang, Chao Du et al.
Decoupled Contrastive Multi-View Clustering with High-Order Random Walks
Yiding Lu, Yijie Lin, Mouxing Yang et al.
Reliable Conflictive Multi-View Learning
Cai Xu, Jiajun Si, Ziyu Guan et al.
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World
Yifei Huang, Guo Chen, Jilan Xu et al.
Consistency Models Made Easy
Zhengyang Geng, Ashwini Pokle, Weijian Luo et al.
A Benchmark for Learning to Translate a New Language from One Grammar Book
Garrett Tanzer, Mirac Suzgun, Eline Visser et al.
CCEdit: Creative and Controllable Video Editing via Diffusion Models
Ruoyu Feng, Wenming Weng, Yanhui Wang et al.
MMTEB: Massive Multilingual Text Embedding Benchmark
Kenneth Enevoldsen, Isaac Chung, Imene Kerboua et al.
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Hritik Bansal, Arian Hosseini, Rishabh Agarwal et al.
Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models
Fei Shen, Hu Ye, Sibo Liu et al.
Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery
Sukrut Rao, Sweta Mahajan, Moritz Böhle et al.
BEND: Benchmarking DNA Language Models on Biologically Meaningful Tasks
Frederikke Marin, Felix Teufel, Marc Horlacher et al.
LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time
Sensitive Test Construction - Yucheng Li, Frank Guerin, Chenghua Lin
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
Adam Karvonen, Can Rager, Johnny Lin et al.
Energy-Based Diffusion Language Models for Text Generation
Minkai Xu, Tomas Geffner, Karsten Kreis et al.
SocialCounterfactuals: Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples
Phillip Howard, Avinash Madasu, Tiep Le et al.
EraseAnything: Enabling Concept Erasure in Rectified Flow Transformers
Daiheng Gao, Shilin Lu, Wenbo Zhou et al.
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
Yifei Ming, Senthil Purushwalkam, Shrey Pandit et al.
RRM: Robust Reward Model Training Mitigates Reward Hacking
Tianqi Liu, Wei Xiong, Jie Ren et al.
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
Kai Chen, Yunhao Gou, Runhui Huang et al.
Debiasing Multimodal Sarcasm Detection with Contrastive Learning
Mengzhao Jia, Can Xie, Liqiang Jing
Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion
Kiran Chhatre, Radek Danecek, Nikos Athanasiou et al.
CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy
Zhibo Yang, Jun Tang, Zhaohai Li et al.
Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking
Heli Ben-Hamu, Itai Gat, Daniel Severo et al.
A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames
Pinelopi Papalampidi, Skanda Koppula, Shreya Pathak et al.
HSEvo: Elevating Automatic Heuristic Design with Diversity-Driven Harmony Search and Genetic Algorithm Using LLMs
Pham Vu Tuan Dat, Long Doan, Huynh Thi Thanh Binh
Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models
Xin Li, Yunfei Wu, Xinghua Jiang et al.
ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation
Moayed Haji Ali, Guha Balakrishnan, Vicente Ordonez
Variational Best-of-N Alignment
Afra Amini, Tim Vieira, Elliott Ash et al.
MBR and QE Finetuning: Training-time Distillation of the Best and Most Expensive Decoding Methods
Mara Finkelstein, Markus Freitag
Amodal Completion via Progressive Mixed Context Diffusion
Katherine Xu, Lingzhi Zhang, Jianbo Shi
Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models
Shweta Mahajan, Tanzila Rahman, Kwang Moo Yi et al.
Spurious Feature Diversification Improves Out-of-distribution Generalization
LIN Yong, Lu Tan, Yifan HAO et al.
Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirection Guidance
Wenhao Sun, Xue-Mei Dong, Benlei Cui et al.
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Yiming Wang, Pei Zhang, Baosong Yang et al.
REEF: Representation Encoding Fingerprints for Large Language Models
Jie Zhang, Dongrui Liu, Chen Qian et al.
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
Lorenzo Baraldi, Federico Cocchi, Marcella Cornia et al.
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling
Rui Liu, Yifan Hu, Yi Ren et al.
The dark side of the forces: assessing non-conservative force models for atomistic machine learning
Filippo Bigi, Marcel Langer, Michele Ceriotti
UMIE: Unified Multimodal Information Extraction with Instruction Tuning
Lin Sun, Kai Zhang, Qingyuan Li et al.
DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding
Geng Li, Jinglin Xu, Yunzhen Zhao et al.
Contextrast: Contextual Contrastive Learning for Semantic Segmentation
Changki Sung, Wanhee Kim, Jungho An et al.
Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
Thomas Fel, Ekdeep Singh Lubana, Jacob Prince et al.
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
Junkang Wu, Yuexiang Xie, Zhengyi Yang et al.
Energy-guided Entropic Neural Optimal Transport
Petr Mokrov, Alexander Korotin, Alexander Kolesov et al.
EventBind: Learning a Unified Representation to Bind Them All for Event-based Open-world Understanding
jiazhou zhou, Xu Zheng, Yuanhuiyi Lyu et al.
Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking
Benjamin Feuer, Micah Goldblum, Teresa Datta et al.
Text-Based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning
Xinyi Wu, Wentao Ma, Dan Guo et al.
Enhancing Diffusion Models with Text-Encoder Reinforcement Learning
Chaofeng Chen, Annan Wang, Haoning Wu et al.
UMBRAE: Unified Multimodal Brain Decoding
Weihao Xia, Raoul de Charette, Cengiz Oztireli et al.
Improved baselines for vision-language pre-training
Jakob Verbeek, Enrico Fini, Michal Drozdzal et al.
Efficient Inference of Vision Instruction-Following Models with Elastic Cache
ZUYAN LIU, Benlin Liu, Jiahui Wang et al.
SCD-Net: Spatiotemporal Clues Disentanglement Network for Self-Supervised Skeleton-Based Action Recognition
Cong Wu, Xiao-Jun Wu, Josef Kittler et al.
Specialized Foundation Models Struggle to Beat Supervised Baselines
Zongzhe Xu, Ritvik Gupta, Wenduo Cheng et al.
HELMET: How to Evaluate Long-context Models Effectively and Thoroughly
Howard Yen, Tianyu Gao, Minmin Hou et al.
Facial Affective Behavior Analysis with Instruction Tuning
Yifan Li, Anh Dao, Wentao Bao et al.
Reward Guided Latent Consistency Distillation
William Wang, Jiachen Li, Weixi Feng et al.
The AdEMAMix Optimizer: Better, Faster, Older
Matteo Pagliardini, Pierre Ablin, David Grangier
Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors
Weixuan Wang, JINGYUAN YANG, Wei Peng
$\text{D}_{2}\text{O}$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
Zhongwei Wan, Xinjian Wu, Yu Zhang et al.
DIM: Dyadic Interaction Modeling for Social Behavior Generation
Minh Tran, Di Chang, Maksim Siniukov et al.
Improving Medical Multi-modal Contrastive Learning with Expert Annotations
Yogesh Kumar, Pekka Marttinen
A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide Generation
Yongkang Wang, Xuan Liu, Feng Huang et al.
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget
Johannes Lehner, Benedikt Alkin, Andreas Fürst et al.
Improving Semantic Understanding in Speech Language Models via Brain-tuning
Omer Moussa, Dietrich Klakow, Mariya Toneva
Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples
chengqian gao, Haonan Li, Liu Liu et al.
ConR: Contrastive Regularizer for Deep Imbalanced Regression
Mahsa Keramati, Lili Meng, R. Evans
Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection
Songmin Dai, Yifan Wu, Xiaoqiang Li et al.
Investigating Non-Transitivity in LLM-as-a-Judge
Yi Xu, Laura Ruis, Tim Rocktäschel et al.
Customizing Language Model Responses with Contrastive In-Context Learning
Xiang Gao, Kamalika Das
Progress or Regress? Self-Improvement Reversal in Post-training
Ting Wu, Xuefeng Li, Pengfei Liu
Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs
Qizhe Zhang, Mengzhen Liu, Lichen Li et al.
A New Mechanism for Eliminating Implicit Conflict in Graph Contrastive Learning
Dongxiao He, Jitao Zhao, Cuiying Huo et al.
MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing
Haoyu Zhao, Tianyi Lu, Jiaxi Gu et al.
Implicit Concept Removal of Diffusion Models
Zhili LIU, Kai Chen, Yifan Zhang et al.
A Dual-Way Enhanced Framework from Text Matching Point of View for Multimodal Entity Linking
Shezheng Song, Shan Zhao, ChengYu Wang et al.
EVEv2: Improved Baselines for Encoder-Free Vision-Language Models
Haiwen Diao, Xiaotong Li, Yufeng Cui et al.
Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts
Andong Tan, Fengtao Zhou, Hao Chen
BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning
Jianyang Gu, Sam Stevens, Elizabeth Campolongo et al.
BaCon: Boosting Imbalanced Semi-supervised Learning via Balanced Feature-Level Contrastive Learning
Qianhan Feng, Lujing Xie, Shijie Fang et al.
DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation
Changdae Oh, Yixuan Li, Kyungwoo Song et al.
RocketEval: Efficient automated LLM evaluation via grading checklist
Tianjun Wei, Wei Wen, Ruizhi Qiao et al.
ET-SEED: EFFICIENT TRAJECTORY-LEVEL SE(3) EQUIVARIANT DIFFUSION POLICY
Chenrui Tie, Yue Chen, Ruihai Wu et al.
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini, Shikhar Murty, Christopher Manning et al.
Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model
Wenhong Zhu, Zhiwei He, Xiaofeng Wang et al.
Explore In-Context Segmentation via Latent Diffusion Models
Chaoyang Wang, Xiangtai Li, Henghui Ding et al.
Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging
Jinluan Yang, Dingnan Jin, Anke Tang et al.