Most Cited 2024 "jacobian matrix analysis" Papers
Frozen CLIP Transformer Is an Efficient Point Cloud Encoder
Xiaoshui Huang, Zhou Huang, Sheng Li et al.
Trackastra: Transformer-based cell tracking for live-cell microscopy
Benjamin Gallusser, Martin Weigert
LaneCPP: Continuous 3D Lane Detection using Physical Priors
Maximilian Pittner, Joel Janai, Alexandru Paul Condurache
DREAM: Dual Structured Exploration with Mixup for Open-set Graph Domain Adaption
Nan Yin, Mengzhu Wang et al.
Critic-Guided Decision Transformer for Offline Reinforcement Learning
Yuanfu Wang, Chao Yang, Ying Wen et al.
SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation
Heyuan Li, Ce Chen, Tianhao Shi et al.
DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly
Gianluca Scarpellini, Stefano Fiorini, Francesco Giuliari et al.
Harmonizing Generalization and Personalization in Federated Prompt Learning
Tianyu Cui, Hongxia Li, Jingya Wang et al.
Inherently Interpretable Time Series Classification via Multiple Instance Learning
Joseph Early, Gavin Cheung, Kurt Cutajar et al.
Learning Domain-Independent Heuristics for Grounded and Lifted Planning
AccDiffusion: An Accurate Method for Higher-Resolution Image Generation
Zhihang Lin, Mingbao Lin, Meng Zhao et al.
Label Propagation for Zero-shot Classification with Vision-Language Models
Vladan Stojnić, Yannis Kalantidis, Giorgos Tolias
SeqGPT: An Out-of-the-Box Large Language Model for Open Domain Sequence Understanding
Tianyu Yu, Chengyue Jiang, Chao Lou et al.
Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?
Tokio Kajitsuka, Issei Sato
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation
Yanhui Wang, Jianmin Bao, Wenming Weng et al.
Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views
Yabo Chen, Jiemin Fang, Yuyang Huang et al.
LangCell: Language-Cell Pre-training for Cell Identity Understanding
Suyuan Zhao, Jiahuan Zhang, Yushuai Wu et al.
TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP without Training
Yuqi Lin, Minghao Chen, Kaipeng Zhang et al.
TextCraftor: Your Text Encoder Can be Image Quality Controller
Yanyu Li, Xian Liu, Anil Kag et al.
Navigating Open Set Scenarios for Skeleton-Based Action Recognition
Kunyu Peng, Cheng Yin, Junwei Zheng et al.
Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection
Liren He, Zhengkai Jiang, Jinlong Peng et al.
TEA: Test-time Energy Adaptation
Yige Yuan, Bingbing Xu, Liang Hou et al.
Say Anything with Any Style
Shuai Tan, Bin Ji, Yu Ding et al.
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models
Justin Chih-Yao Chen, Swarnadeep Saha, Elias Stengel-Eskin et al.
Fast Context-Based Low-Light Image Enhancement via Neural Implicit Representations
Tomáš Chobola, Yu Liu, Hanyi Zhang et al.
AMD: Autoregressive Motion Diffusion
Bo Han, Hao Peng, Minjing Dong et al.
Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints
Jian Chen, Ruiyi Zhang, Yufan Zhou et al.
Idempotence and Perceptual Image Compression
Tongda Xu, Ziran Zhu, Dailan He et al.
Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior
Chen Cheng, Xiaofeng Yang, Fan Yang et al.
Zero-Shot Robustification of Zero-Shot Models
Dyah Adila, Changho Shin, Linrong Cai et al.
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification
Wenhui Zhu, Xiwen Chen, Peijie Qiu et al.
Cooper: Coordinating Specialized Agents towards a Complex Dialogue Goal
Yi Cheng, Wenge Liu, Jian Wang et al.
Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning
Xinshun Wang, Zhongbin Fang, Xia Li et al.
Generalized Schrödinger Bridge Matching
Guan-Horng Liu, Yaron Lipman, Maximilian Nickel et al.
GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features
Luc Sträter, Mohammadreza Salehi, Efstratios Gavves et al.
SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms
Xingrun Xing, Zheng Zhang, Ziyi Ni et al.
Auto-Prox: Training-Free Vision Transformer Architecture Search via Automatic Proxy Discovery
Zimian Wei, Peijie Dong, Zheng Hui et al.
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
Yunxin Li, Baotian Hu, Haoyuan Shi et al.
Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation
Ziyang Chen, Yongsheng Pan, Yiwen Ye et al.
Occupancy as Set of Points
Yiang Shi, Tianheng Cheng, Qian Zhang et al.
OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding
Ming Hu, Peng Xia, Lin Wang et al.
Learning Modality-agnostic Representation for Semantic Segmentation from Any Modalities
Xu Zheng, Yuanhuiyi Lyu, Lin Wang
BEV-MAE: Bird’s Eye View Masked Autoencoders for Point Cloud Pre-training in Autonomous Driving Scenarios
ZhiWei Lin, Yongtao Wang, Shengxiang Qi et al.
HIG: Hierarchical Interlacement Graph Approach to Scene Graph Generation in Video Understanding
Trong-Thuan Nguyen, Pha Nguyen, Khoa Luu
Understanding Finetuning for Factual Knowledge Extraction
Gaurav Ghosal, Tatsunori Hashimoto, Aditi Raghunathan
Zero-shot Object Counting with Good Exemplars
Huilin Zhu, Jingling Yuan, Zhengwei Yang et al.
Energy-guided Entropic Neural Optimal Transport
Petr Mokrov, Alexander Korotin, Alexander Kolesov et al.
Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors
Lihe Ding, Shaocong Dong, Zhanpeng Huang et al.
DC-NAS: Divide-and-Conquer Neural Architecture Search for Multi-Modal Classification
Xinyan Liang, Pinhan Fu, Qian Guo et al.
HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data
Mengqi Zhang, Yang Fu, Zheng Ding et al.
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
Zanlin Ni, Yulin Wang, Renping Zhou et al.
Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps
Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi et al.
Unified Generative Modeling of 3D Molecules with Bayesian Flow Networks
Yuxuan Song, Jingjing Gong, Hao Zhou et al.
Masked Angle-Aware Autoencoder for Remote Sensing Images
Zhihao Li, Biao Hou, Siteng Ma et al.
The Nerfect Match: Exploring NeRF Features for Visual Localization
Qunjie Zhou, Maxim Maximov, Or Litany et al.
Attention Prompting on Image for Large Vision-Language Models
Runpeng Yu, Weihao Yu, Xinchao Wang
Generative Human Motion Stylization in Latent Space
Chuan Guo, Yuxuan Mu, Xinxin Zuo et al.
Automatic Radiology Reports Generation via Memory Alignment Network
Hongyu Shen, Mingtao Pei, Juncai Liu et al.
MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos
Yushuo Chen, Zerong Zheng, Zhe Li et al.
Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting
Ri-Zhao Qiu, Ge Yang, Weijia Zeng et al.
Compressed Context Memory for Online Language Model Interaction
Jang-Hyun Kim, Junyoung Yeom, Sangdoo Yun et al.
Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion
Zuoyue Li, Zhenqiang Li, Zhaopeng Cui et al.
DreamMotion: Space-Time Self-Similar Score Distillation for Zero-Shot Video Editing
Hyeonho Jeong, Jinho Chang, Geon Yeong Park et al.
RLIF: Interactive Imitation Learning as Reinforcement Learning
Jianlan Luo, Perry Dong, Yuexiang Zhai et al.
Tuning-Free Image Customization with Image and Text Guidance
Pengzhi Li, Qiang Nie, Ying Chen et al.
AutoVP: An Automated Visual Prompting Framework and Benchmark
Hsi-Ai Tsao, Lei Hsiung, Pin-Yu Chen et al.
Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization
Kun Lei, Zhengmao He, Chenhao Lu et al.
ProMark: Proactive Diffusion Watermarking for Causal Attribution
Vishal Asnani, John Collomosse, Tu Bui et al.
Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages
Guozheng Ma, Lu Li, Sen Zhang et al.
Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization
Deng Li, Aming Wu, Yaowei Wang et al.
ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers
Jinke Li, Xiao He, Chonghua Zhou et al.
Text-Conditioned Generative Model of 3D Strand-based Human Hairstyles
Vanessa Sklyarova, Egor Zakharov, Otmar Hilliges et al.
Adapting and Evaluating Influence-Estimation Methods for Gradient-Boosted Decision Trees
Jonathan Brophy, Zayd Hammoudeh, Daniel Lowd
Structuring Representation Geometry with Rotationally Equivariant Contrastive Learning
Sharut Gupta, Joshua Robinson, Derek Lim et al.
Rethinking Interactive Image Segmentation with Low Latency High Quality and Diverse Prompts
Qin Liu, Jaemin Cho, Mohit Bansal et al.
Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning
Jinxin Liu, Ziqi Zhang, Zhenyu Wei et al.
Generalization Analysis of Machine Learning Algorithms via the Worst-Case Data-Generating Probability Measure
Xinying Zou, Samir Perlaza, Inaki Esnaola et al.
Dual Self-Paced Cross-Modal Hashing
Yuan Sun, Jian Dai, Zhenwen Ren et al.
Scaling Down Deep Learning with MNIST-1D
Sam Greydanus, Dmitry Kobak
Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering
Zhangbin Li, Jinxing Zhou, Dan Guo et al.
Unmasking and Improving Data Credibility: A Study with Datasets for Training Harmless Language Models
Zhaowei Zhu, Jialu Wang, Hao Cheng et al.
BioBridge: Bridging Biomedical Foundation Models via Knowledge Graphs
Zifeng Wang, Zichen Wang, Balasubramaniam Srinivasan et al.
Entity-Centric Reinforcement Learning for Object Manipulation from Pixels
Dan Haramati, Tal Daniel, Aviv Tamar
Learning Correlation Structures for Vision Transformers
Manjin Kim, Paul Hongsuck Seo, Cordelia Schmid et al.
TIM: A Time Interval Machine for Audio-Visual Action Recognition
Jacob Chalk, Jaesung Huh, Evangelos Kazakos et al.
Predicting Emergent Abilities with Infinite Resolution Evaluation
Shengding Hu, Xin Liu, Xu Han et al.
Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization
Yinbin Han, Meisam Razaviyayn, Renyuan Xu
Mean Teacher DETR with Masked Feature Alignment: A Robust Domain Adaptive Detection Transformer Framework
Weixi Weng, Chun Yuan
Dispel Darkness for Better Fusion: A Controllable Visual Enhancer based on Cross-modal Conditional Adversarial Learning
Hao Zhang, Linfeng Tang, Xinyu Xiang et al.
Structured Chemistry Reasoning with Large Language Models
Siru Ouyang, Zhuosheng Zhang, Bing Yan et al.
A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide Generation
Yongkang Wang, Xuan Liu, Feng Huang et al.
A Sublinear Adversarial Training Algorithm
Yeqi Gao, Lianke Qin, Zhao Song et al.
Pseudo-Generalized Dynamic View Synthesis from a Video
Xiaoming Zhao, R Colburn, Fangchang Ma et al.
Code as Reward: Empowering Reinforcement Learning with VLMs
David Venuto, Mohammad Sami Nur Islam, Martin Klissarov et al.
SF(DA)$^2$: Source-free Domain Adaptation Through the Lens of Data Augmentation
Uiwon Hwang, Jonghyun Lee, Juhyeon Shin et al.
HyperFast: Instant Classification for Tabular Data
David Bonet, Daniel Mas Montserrat, Xavier Giró-i-Nieto et al.
Multi-modal Learning for Geospatial Vegetation Forecasting
Vitus Benson, Claire Robin, Christian Requena-Mesa et al.
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay
Natasha Butt, Blazej Manczak, Auke Wiggers et al.
Fusion Is Not Enough: Single Modal Attacks on Fusion Models for 3D Object Detection
Zhiyuan Cheng, Hongjun Choi, Shiwei Feng et al.
Distribution-aware Knowledge Prototyping for Non-exemplar Lifelong Person Re-identification
Kunlun Xu, Xu Zou, Yuxin Peng et al.
Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods
Zijian Liu, Zhengyuan Zhou
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation
Xiangyang Zhu, Renrui Zhang, Bowei He et al.
Motif-Aware Riemannian Graph Neural Network with Generative-Contrastive Learning
Li Sun, Zhenhao Huang, Zixi Wang et al.
Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning
Zhengwei Fang, Rui Wang, Tao Huang et al.
Adaptive Multi-Modal Cross-Entropy Loss for Stereo Matching
Peng Xu, Zhiyu Xiang, Chengyu Qiao et al.
Fool Your (Vision and) Language Model with Embarrassingly Simple Permutations
Yongshuo Zong, Tingyang Yu, Ruchika Chavhan et al.
Differentiable Weightless Neural Networks
Alan Bacellar, Zachary Susskind, Mauricio Breternitz Jr et al.
R&B: Region and Boundary Aware Zero-shot Grounded Text-to-image Generation
Jiayu Xiao, Henglei Lv et al.
Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation
Ming Xu, Stephen Gould
Safe-Sim: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries
Wei-Jer Chang, Francesco Pittaluga, Masayoshi Tomizuka et al.
PICLe: Eliciting Diverse Behaviors from Large Language Models with Persona In-Context Learning
Hyeong Kyu Choi, Sharon Li
Efficient and Scalable Graph Generation through Iterative Local Expansion
Andreas Bergmeister, Karolis Martinkus, Nathanaël Perraudin et al.
Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking
Wei Cao, Chang Luo, Biao Zhang et al.
Working Memory Capacity of ChatGPT: An Empirical Study
Dongyu Gong, Xingchen Wan, Dingmin Wang
Blind Image Quality Assessment Based on Geometric Order Learning
Nyeong-Ho Shin, Seon-Ho Lee, Chang-Su Kim
Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation
Jiaming Liu, Ran Xu, Senqiao Yang et al.
How Realistic Is Your Synthetic Data? Constraining Deep Generative Models for Tabular Data
Mihaela Stoian, Salijona Dyrmishi, Maxime Cordy et al.
HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments
Qinhong Zhou, Sunli Chen, Yisong Wang et al.
Audio-Visual Segmentation via Unlabeled Frame Exploitation
Jinxiang Liu, Yikun Liu, Ferenas et al.
UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory
Haiwen Diao, Bo Wan, Ying Zhang et al.
FreeReg: Image-to-Point Cloud Registration Leveraging Pretrained Diffusion Models and Monocular Depth Estimators
Haiping Wang, Yuan Liu, Bing Wang et al.
KPConvX: Modernizing Kernel Point Convolution with Kernel Attention
Hugues Thomas, Yao-Hung Hubert Tsai, Timothy Barfoot et al.
Predictive Dynamic Fusion
Bing Cao, Yinan Xia, Yi Ding et al.
Code Representation Learning at Scale
Dejiao Zhang, Wasi Ahmad, Ming Tan et al.
eTag: Class-Incremental Learning via Embedding Distillation and Task-Oriented Generation
Libo Huang, Yan Zeng, Chuanguang Yang et al.
MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction
Seongju Lee, Junseok Lee, Yeonguk Yu et al.
Small Model Can Self-Correct
Haixia Han, Jiaqing Liang, Jie Shi et al.
Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View
Raj Ghugare, Matthieu Geist, Glen Berseth et al.
Position: Explain to Question not to Justify
Przemyslaw Biecek, Wojciech Samek
M&M VTO: Multi-Garment Virtual Try-On and Editing
Luyang Zhu, Yingwei Li, Nan Liu et al.
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation
Junjie Zhang, Chenjia Bai, Haoran He et al.
FedDiv: Collaborative Noise Filtering for Federated Learning with Noisy Labels
Jichang Li, Guanbin Li, Hui Cheng et al.
Simulation of Graph Algorithms with Looped Transformers
Artur Back de Luca, Kimon Fountoulakis
Out-of-Distribution Detection in Long-Tailed Recognition with Calibrated Outlier Class Learning
Wenjun Miao, Guansong Pang, Xiao Bai et al.
FairTune: Optimizing Parameter Efficient Fine Tuning for Fairness in Medical Image Analysis
Raman Dutt, Ondrej Bohdal, Sotirios Tsaftaris et al.
Few-shot Class Incremental Learning with Attention-Aware Self-Adaptive Prompt
Chenxi Liu, Zhenyi Wang, Tianyi Xiong et al.
Sparse Global Matching for Video Frame Interpolation with Large Motion
Chunxu Liu, Guozhen Zhang, Rui Zhao et al.
In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization
Herilalaina Rakotoarison, Steven Adriaensen, Neeratyoy Mallik et al.
Graph Neural Networks Use Graphs When They Shouldn't
Maya Bechler-Speicher, Ido Amos, Ran Gilad-Bachrach et al.
Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models
Mingrui Wu, Jiayi Ji, Oucheng Huang et al.
Do text-free diffusion models learn discriminative visual representations?
Soumik Mukhopadhyay, Matthew Gwilliam, Yosuke Yamaguchi et al.
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Cheng Zhang, Jianyi Cheng, George Constantinides et al.
Generative Region-Language Pretraining for Open-Ended Object Detection
Chuang Lin, Yi Jiang, Lizhen Qu et al.
SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection
Huafeng Chen, Pengxu Wei, Guangqian Guo et al.
ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation
Zhiyuan Ma, Yuxiang Wei, Yabin Zhang et al.
Synthesize Step-by-Step: Tools Templates and LLMs as Data Generators for Reasoning-Based Chart VQA
Zhuowan Li, Bhavan Jasani, Peng Tang et al.
Multistain Pretraining for Slide Representation Learning in Pathology
Guillaume Jaume, Anurag J Vaidya, Andrew Zhang et al.
Towards a statistical theory of data selection under weak supervision
Germain Kolossov, Andrea Montanari, Pulkit Tandon
Contrastive Difference Predictive Coding
Chongyi Zheng, Ruslan Salakhutdinov, Benjamin Eysenbach
StableDrag: Stable Dragging for Point-based Image Editing
Yutao Cui, Xiaotong Zhao, Guozhen Zhang et al.
DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis
Yuming Gu, Hongyi Xu, You Xie et al.
FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification
Yu Tian, Congcong Wen, Min Shi et al.
Analysis of Learning a Flow-based Generative Model from Limited Sample Complexity
Hugo Cui, Florent Krzakala, Eric Vanden-Eijnden et al.
Progressive Pretext Task Learning for Human Trajectory Prediction
Xiaotong Lin, Tianming Liang, Jian-Huang Lai et al.
Time-Series Forecasting for Out-of-Distribution Generalization Using Invariant Learning
Haoxin Liu, Harshavardhan Kamarthi, Lingkai Kong et al.
Delving into the Trajectory Long-tail Distribution for Muti-object Tracking
Sijia Chen, En Yu, Jinyang Li et al.
SYMBOL: Generating Flexible Black-Box Optimizers through Symbolic Equation Learning
Jiacheng Chen, Zeyuan Ma, Hongshu Guo et al.
Benchmarking Object Detectors with COCO: A New Path Forward
Shweta Singh, Aayan Yadav, Jitesh Jain et al.
MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation
Petru-Daniel Tudosiu, Yongxin Yang, Shifeng Zhang et al.
Mirasol3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities
AJ Piergiovanni, Isaac Noble, Dahun Kim et al.
Supervised Anomaly Detection for Complex Industrial Images
Aimira Baitieva, David Hurych, Victor Besnier et al.
SWinGS: Sliding Windows for Dynamic 3D Gaussian Splatting
Richard Shaw, Michal Nazarczuk, Jifei Song et al.
Learning CNN on ViT: A Hybrid Model to Explicitly Class-specific Boundaries for Domain Adaptation
Ba Hung Ngo, Nhat-Tuong Do-Tran, Tuan-Ngoc Nguyen et al.
Seeing the Unseen: A Frequency Prompt Guided Transformer for Image Restoration
Shihao Zhou, Jinshan Pan, Jinglei Shi et al.
Improved baselines for vision-language pre-training
Jakob Verbeek, Enrico Fini, Michal Drozdzal et al.
PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs
Charlie Hou, Akshat Shrivastava, Hongyuan Zhan et al.
Link-Context Learning for Multimodal LLMs
Yan Tai, Weichen Fan, Zhao Zhang et al.
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy
Kirill Vishniakov, Zhiqiang Shen, Zhuang Liu
Instructive Decoding: Instruction-Tuned Large Language Models are Self-Refiner from Noisy Instructions
Taehyeon Kim, Joonkee Kim, Gihun Lee et al.
Zero Bubble (Almost) Pipeline Parallelism
Penghui Qi, Xinyi Wan, Guangxing Huang et al.
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
Yuwei Fu, Haichao Zhang, Di Wu et al.
Dolfin: Diffusion Layout Transformers without Autoencoder
Yilin Wang, Zeyuan Chen, Liangjun Zhong et al.
Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts
Huy Nguyen, Pedram Akbarian Saravi, Fanqi Yan et al.
ZeroRF: Fast Sparse View 360° Reconstruction with Zero Pretraining
Ruoxi Shi, Xinyue Wei, Cheng Wang et al.
Selective Visual Representations Improve Convergence and Generalization for Embodied AI
Ainaz Eftekhar, Kuo-Hao Zeng, Jiafei Duan et al.
Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers
Xiaoqiang Lin, Zhaoxuan Wu, Zhongxiang Dai et al.
BadRL: Sparse Targeted Backdoor Attack against Reinforcement Learning
Jing Cui, Yufei Han, Yuzhe Ma et al.
Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation
Fangfu Liu, Hanyang Wang, Weiliang Chen et al.
Equity-Transformer: Solving NP-Hard Min-Max Routing Problems as Sequential Generation with Equity Context
Jiwoo Son, Minsu Kim, Sanghyeok Choi et al.
Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition
Sihyun Yu, Weili Nie, De-An Huang et al.
CPR: Retrieval Augmented Generation for Copyright Protection
Aditya Golatkar, Alessandro Achille, Luca Zancato et al.
Emergence of In-Context Reinforcement Learning from Noise Distillation
Ilya Zisman, Vladislav Kurenkov, Alexander Nikulin et al.
Decomposed Linear Dynamical Systems (dLDS) for learning the latent components of neural dynamics
Noga Mudrik, Yenho Chen, Eva Yezerets et al.
SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation
Chengyou Jia, Minnan Luo, Zhuohang Dang et al.
Probabilistically Rewired Message-Passing Neural Networks
Chendi Qian, Andrei Manolache, Kareem Ahmed et al.
Boosting Spike Camera Image Reconstruction from a Perspective of Dealing with Spike Fluctuations
Rui Zhao, Ruiqin Xiong, Jing Zhao et al.
Beyond task performance: evaluating and reducing the flaws of large multimodal models with in-context-learning
Mustafa Shukor, Alexandre Rame, Corentin Dancette et al.
Towards Robust Offline Reinforcement Learning under Diverse Data Corruption
Rui Yang, Han Zhong, Jiawei Xu et al.
SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark
Zhengdi Yu, Shaoli Huang, Yongkang Cheng et al.
DS-NeRV: Implicit Neural Video Representation with Decomposed Static and Dynamic Codes
Hao Yan, Zhihui Ke, Xiaobo Zhou et al.
FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection
Chanho Lee, Jinsu Son, Hyounguk Shon et al.
DENEVIL: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning
Shitong Duan, Xiaoyuan Yi, Peng Zhang et al.
ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning
Chen-Xiao Gao, Chenyang Wu, Mingjun Cao et al.
Outlier-robust Kalman Filtering through Generalised Bayes
Gerardo Duran-Martin, Matias Altamirano, Alex Shestopaloff et al.
Getting it Right: Improving Spatial Consistency in Text-to-Image Models
Agneet Chatterjee, Gabriela Ben Melech Stan, Estelle Guez Aflalo et al.
Learning to design protein-protein interactions with enhanced generalization
Anton Bushuiev, Roman Bushuiev, Petr Kouba et al.
MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding
Chun-Peng Chang, Shaoxiang Wang, Alain Pagani et al.
Multi-Class Support Vector Machine with Maximizing Minimum Margin
Feiping Nie, Zhezheng Hao, Rong Wang
Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation
Qi Yang, Xing Nie, Tong Li et al.
Text-Based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning
Xinyi Wu, Wentao Ma, Dan Guo et al.