Most Cited 2024 "dependent couplings" Papers
12,324 papers found
Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration
Chu Jie Qin, Ruiqi Wu, Zikun Liu et al.
Question Calibration and Multi-Hop Modeling for Temporal Question Answering
Chao Xue, Di Liang, Pengfei Wang et al.
EDA: Evolving and Distinct Anchors for Multimodal Motion Prediction
Longzhong Lin, Xuewu Lin, Tianwei Lin et al.
Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-Loop and Hessian-Free Solution Strategy
Risheng Liu, Zhu Liu, Wei Yao et al.
Quantifying and Analyzing Entity-Level Memorization in Large Language Models
Zhenhong Zhou, Jiuyang Xiang, Chaomeng Chen et al.
Navigation Instruction Generation with BEV Perception and Large Language Models
Sheng Fan, Rui Liu, Wenguan Wang et al.
Spectral Meets Spatial: Harmonising 3D Shape Matching and Interpolation
Dongliang Cao, Marvin Eisenberger, Nafie El Amrani et al.
SAVSR: Arbitrary-Scale Video Super-resolution via a Learned Scale-Adaptive Network
Zekun Li, Hongying Liu, Fanhua Shang et al.
WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding
Quan Kong, Yuki Kawana, Rajat Saini et al.
Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind
Mo Yu, Qiujing Wang, Shunchi Zhang et al.
Learning to Adapt SAM for Segmenting Cross-domain Point Clouds
Xidong Peng, Runnan Chen, Feng Qiao et al.
Implicit regularization of deep residual networks towards neural ODEs
Pierre Marion, Yu-Han Wu, Michael Sander et al.
Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders
Yi Yu, Yufei Wang, Song Xia et al.
SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos
Yulei Niu, Wenliang Guo, Long Chen et al.
You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval
Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain et al.
CadVLM: Bridging Language and Vision in the Generation of Parametric CAD Sketches
Sifan Wu, Amir Hosein Khasahmadi, Mor Katz et al.
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen, Puyuan Peng, Ami Baid et al.
IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation
Kai Li, Runxuan Yang, Fuchun Sun et al.
Long-Tailed Anomaly Detection with Learnable Class Names
Chih-Hui Ho, Kuan-Chuan Peng, Nuno Vasconcelos
Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement
Jing Wang, Jiangyun Li, Chen Chen et al.
Unmixing Diffusion for Self-Supervised Hyperspectral Image Denoising
Haijin Zeng, Jiezhang Cao, Yongyong Chen et al.
STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models
Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung et al.
HR-Pro: Point-Supervised Temporal Action Localization via Hierarchical Reliability Propagation
Huaxin Zhang, Xiang Wang, Xiaohao Xu et al.
3D Feature Prediction for Masked-AutoEncoder-Based Point Cloud Pretraining
Siming Yan, Yuqi Yang, Yu-Xiao Guo et al.
Aligning Geometric Spatial Layout in Cross-View Geo-Localization via Feature Recombination
Qingwang Zhang, Yingying Zhu
Pre-Training and Fine-Tuning Generative Flow Networks
Ling Pan, Moksh Jain, Kanika Madan et al.
ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection
Bo Peng, Yadan Luo, Yonggang Zhang et al.
Continuous Field Reconstruction from Sparse Observations with Implicit Neural Networks
Xihaier Luo, Wei Xu, Balasubramanya T. Nadiga et al.
Steerers: A Framework for Rotation Equivariant Keypoint Descriptors
Georg Bökman, Johan Edstedt, Michael Felsberg et al.
GEARS: Local Geometry-aware Hand-object Interaction Synthesis
Keyang Zhou, Bharat Lal Bhatnagar, Jan Lenssen et al.
Self-Supervised Multi-Object Tracking with Path Consistency
Zijia Lu, Bing Shuai, Yanbei Chen et al.
Composed Video Retrieval via Enriched Context and Discriminative Embeddings
Omkar Thawakar, Muzammal Naseer, Rao Anwer et al.
360Loc: A Dataset and Benchmark for Omnidirectional Visual Localization with Cross-device Queries
Huajian Huang, Changkun Liu, Yipeng Zhu et al.
LAN: Learning to Adapt Noise for Image Denoising
Changjin Kim, Tae Hyun Kim, Sungyong Baik
How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
Nuoya Xiong, Lijun Ding, Simon Du
Instrumental Variable Estimation for Causal Inference in Longitudinal Data with Time-Dependent Latent Confounders
Debo Cheng, Ziqi Xu, Jiuyong Li et al.
Neural Spline Fields for Burst Image Fusion and Layer Separation
Ilya Chugunov, David Shustin, Ruyu Yan et al.
Understanding the Effects of Iterative Prompting on Truthfulness
Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju
EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models
Ruoxi Chen, Haibo Jin, Yixin Liu et al.
InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models
Jiun Tian Hoe, Xudong Jiang, Chee Seng Chan et al.
HoloNets: Spectral Convolutions do extend to Directed Graphs
Christian Koke, Daniel Cremers
PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation
Ruining Deng, Quan Liu, Can Cui et al.
Dual Prototype Attention for Unsupervised Video Object Segmentation
Suhwan Cho, Minhyeok Lee, Seunghoon Lee et al.
Guided Slot Attention for Unsupervised Video Object Segmentation
Minhyeok Lee, Suhwan Cho, Dogyoon Lee et al.
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot
Zixuan Wang, Stanley Wei, Daniel Hsu et al.
Boosting Neural Cognitive Diagnosis with Student’s Affective State Modeling
Shanshan Wang, Zhen Zeng, Xun Yang et al.
LayoutFlow: Flow Matching for Layout Generation
Julian Jorge Andrade Guerreiro, Naoto Inoue, Kento Masui et al.
Is Retain Set All You Need in Machine Unlearning? Restoring Performance of Unlearned Models with Out-Of-Distribution Images
Jacopo Bonato, Marco Cotogni, Luigi Sabetta
Elucidating the design space of classifier-guided diffusion generation
Jiajun Ma, Tianyang Hu, Wenjia Wang et al.
SAGS: Structure-Aware 3D Gaussian Splatting
Evangelos Ververas, Rolandos Alexandros Potamias, Song Jifei et al.
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Tianying Ji, Yu Luo, Fuchun Sun et al.
Clustering Propagation for Universal Medical Image Segmentation
Yuhang Ding, Liulei Li, Wenguan Wang et al.
Memorization Through the Lens of Curvature of Loss Function Around Samples
Isha Garg, Deepak Ravikumar, Kaushik Roy
Match-Stereo-Videos: Bidirectional Alignment for Consistent Dynamic Stereo Matching
Junpeng Jing, Ye Mao, Krystian Mikolajczyk
Visible and Clear: Finding Tiny Objects in Difference Map
Bing Cao, Haiyu Yao, Pengfei Zhu et al.
Residual Quantization with Implicit Neural Codebooks
Iris Huijben, Matthijs Douze, Matthew Muckley et al.
FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients
Shangchao Su, Bin Li, Xiangyang Xue
PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees
Chulin Xie, De-An Huang, Wenda Chu et al.
ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis
Kensen Shi, Joey Hong, Yinlin Deng et al.
WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models
Zijian He, Peixin Chen, Guangrun Wang et al.
Protecting NeRFs' Copyright via Plug-And-Play Watermarking Base Model
Qi Song, Ziyuan Luo, Ka Chun Cheung et al.
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset
Yi Zhang, Wang Zeng, Sheng Jin et al.
Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
Byeonghyun Pak, Byeongju Woo, Sunghwan Kim et al.
A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars
Ronglai Zuo, Fangyun Wei, Zenggui Chen et al.
UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement
Yaofeng Xie, Lingwei Kong, Kai Chen et al.
Surface Reconstruction for 3D Gaussian Splatting via Local Structural Hints
Qianyi Wu, Jianmin Zheng, Jianfei Cai
Unsupervised Evaluation of Code LLMs with Round-Trip Correctness
Miltiadis Allamanis, Sheena Panthaplackel, Pengcheng Yin
BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model
Song Yiran, Qianyu Zhou, Xiangtai Li et al.
BEVSpread: Spread Voxel Pooling for Bird’s-Eye-View Representation in Vision-based Roadside 3D Object Detection
Wenjie Wang, Yehao Lu, Guangcong Zheng et al.
Exploring Diverse Representations for Open Set Recognition
Yu Wang, Junxian Mu, Pengfei Zhu et al.
HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud
Wencan Cheng, Hao Tang, Luc Van Gool et al.
Investigating the Benefits of Projection Head for Representation Learning
Yihao Xue, Eric Gan, Jiayi Ni et al.
ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval
Fang Kaipeng, Jingkuan Song, Lianli Gao et al.
Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis
Mingyang Zhao, Jiang Jingen, Lei Ma et al.
Making Vision Transformers Truly Shift-Equivariant
Renan A. Rojas-Gomez, Teck-Yian Lim, Minh Do et al.
SmartControl: Enhancing ControlNet for Handling Rough Visual Conditions
Xiaoyu Liu, Yuxiang Wei, Ming Liu et al.
Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights
Yan Hao, Florent Forest, Olga Fink
Adapting to Distribution Shift by Visual Domain Prompt Generation
Zhixiang Chi, Li Gu, Tao Zhong et al.
ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement
Muhammad Atif Butt, Kai Wang, Javier Vazquez-Corral et al.
Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution
Rui Wang, Elyssa Hofgard, Han Gao et al.
Improving Transferable Targeted Adversarial Attacks with Model Self-Enhancement
Han Wu, Guanyan Ou, Weibin Wu et al.
Dexterous Grasp Transformer
Guo-Hao Xu, Yi-Lin Wei, Dian Zheng et al.
On the Variance of Neural Network Training with respect to Test Sets and Distributions
Keller Jordan
FLAME: A Small Language Model for Spreadsheet Formulas
Harshit Joshi, José Cambronero Sanchez, Abishai Ebenezer et al.
MESED: A Multi-Modal Entity Set Expansion Dataset with Fine-Grained Semantic Classes and Hard Negative Entities
Li Yangning, Tingwei Lu, Hai-Tao Zheng et al.
Dual Prior Unfolding for Snapshot Compressive Imaging
Jiancheng Zhang, Haijin Zeng, Jiezhang Cao et al.
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation
Xingqun Qi, Jiahao Pan, Peng Li et al.
Unsupervised Semantic Segmentation Through Depth-Guided Feature Correlation and Sampling
Leon Sick, Dominik Engel, Pedro Hermosilla et al.
MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations
Hanlei Zhang, Xin Wang, Hua Xu et al.
PORF: Pose Residual Field for Accurate Neural Surface Reconstruction
Jia-Wang Bian, Wenjing Bian, Victor Prisacariu et al.
Open-Set Domain Adaptation for Semantic Segmentation
Seun-An Choe, Ah-Hyung Shin, Keon Hee Park et al.
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
Kevin Frans, Seohong Park, Pieter Abbeel et al.
Interpretability Illusions in the Generalization of Simplified Models
Dan Friedman, Andrew Lampinen, Lucas Dixon et al.
Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use
Imad Eddine Toubal, Aditya Avinash, Neil Alldrin et al.
iHuman: Instant Animatable Digital Humans From Monocular Videos
Pramish Paudel, Anubhav Khanal, Danda Pani Paudel et al.
FedA3I: Annotation Quality-Aware Aggregation for Federated Medical Image Segmentation against Heterogeneous Annotation Noise
Nannan Wu, Zhaobin Sun, Zengqiang Yan et al.
Pre-training Sequence, Structure, and Surface Features for Comprehensive Protein Representation Learning
Youhan Lee, Hasun Yu, Jaemyung Lee et al.
HGE: Embedding Temporal Knowledge Graphs in a Product Space of Heterogeneous Geometric Subspaces
Jiaxin Pan, Mojtaba Nayyeri, Yinan Li et al.
WordRobe: Text-Guided Generation of Textured 3D Garments
Astitva Srivastava, Pranav Manu, Amit Raj et al.
Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy
Simon Ging, Maria A. Bravo, Thomas Brox
Underwater Organism Color Fine-Tuning via Decomposition and Guidance
Xiaofeng Cong, Jie Gui, Junming Hou
One-Class Face Anti-spoofing via Spoof Cue Map-Guided Feature Learning
Pei-Kai Huang, Cheng-Hsuan Chiang, Tzu-Hsien Chen et al.
ED-NeRF: Efficient Text-Guided Editing of 3D Scene With Latent Space NeRF
Jangho Park, Gihyun Kwon, Jong Chul Ye
Improved Self-Training for Test-Time Adaptation
Jing Ma
Clifford Group Equivariant Simplicial Message Passing Networks
Cong Liu, David Ruhe, Floor Eijkelboom et al.
AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis
Tao Tang, Guangrun Wang, Yixing Lao et al.
Theoretically Achieving Continuous Representation of Oriented Bounding Boxes
Zikai Xiao, Guo-Ye Yang, Xue Yang et al.
Improving Video Segmentation via Dynamic Anchor Queries
Yikang Zhou, Tao Zhang, Xiangtai Li et al.
Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences
Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban et al.
Improved Graph Contrastive Learning for Short Text Classification
Yonghao Liu, Lan Huang, Fausto Giunchiglia et al.
Robust Image Denoising through Adversarial Frequency Mixup
Donghun Ryou, Inju Ha, Hyewon Yoo et al.
SPIRE: Semantic Prompt-Driven Image Restoration
Chenyang Qi, Zhengzhong Tu, Keren Ye et al.
Embarrassingly Simple Dataset Distillation
Yunzhen Feng, Shanmukha Ramakrishna Vedantam, Julia Kempe
Revisit Human-Scene Interaction via Space Occupancy
Xinpeng Liu, Haowen Hou, Yanchao Yang et al.
Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices
Huancheng Chen, Haris Vikalo
Unsupervised Occupancy Learning from Sparse Point Cloud
Amine Ouasfi, Adnane Boukhayma
TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling
Dong Huo, Zixin Guo, Xinxin Zuo et al.
Federated Q-Learning: Linear Regret Speedup with Low Communication Cost
Zhong Zheng, Fengyu Gao, Lingzhou Xue et al.
Zero-Shot Aerial Object Detection with Visual Description Regularization
Chenyu Lin, Zhengqing Zang, Chenwei Tang et al.
Any-Shift Prompting for Generalization over Distributions
Zehao Xiao, Jiayi Shen, Mohammad Mahdi Derakhshani et al.
Leaving the Nest: Going beyond Local Loss Functions for Predict-Then-Optimize
Sanket Shah, Bryan Wilder, Andrew Perrault et al.
CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes
Peter Mikhael, Itamar Chinn, Regina Barzilay
Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP
Zixiang Chen, Yihe Deng, Yuanzhi Li et al.
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling
Ivan Marisca, Cesare Alippi, Filippo Maria Bianchi
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
Marco Mistretta, Alberto Baldrati, Marco Bertini et al.
Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
Min Yang, gaohuan, Ping Guo et al.
BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation
Daeun Lee, Jaehong Yoon, Sung Ju Hwang
Offline Model-Based Optimization via Policy-Guided Gradient Search
Yassine Chemingui, Aryan Deshwal, Nghia Hoang et al.
TaskLAMA: Probing the Complex Task Understanding of Language Models
Quan Yuan, Mehran Kazemi, Xin Xu et al.
Improving Cross-Modal Alignment with Synthetic Pairs for Text-Only Image Captioning
Zhiyue Liu, Jinyuan Liu, Fanrong Ma
Deep Orthogonal Hypersphere Compression for Anomaly Detection
Yunhe Zhang, Yan Sun, Jinyu Cai et al.
Faces that Speak: Jointly Synthesising Talking Face and Speech from Text
Youngjoon Jang, Jihoon Kim, Junseok Ahn et al.
CL2CM: Improving Cross-Lingual Cross-Modal Retrieval via Cross-Lingual Knowledge Transfer
Yabing Wang, Fan Wang, Jianfeng Dong et al.
SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control
Jaskirat Singh, Jianming Zhang, Qing Liu et al.
DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement
Hao Wu, Huabin Liu, Yu Qiao et al.
Open-World Semantic Segmentation Including Class Similarity
Matteo Sodano, Federico Magistri, Lucas Nunes et al.
SelfPose3d: Self-Supervised Multi-Person Multi-View 3d Pose Estimation
Keqi Chen, Vinkle Srivastav, Nicolas Padoy
ODIN: A Single Model for 2D and 3D Segmentation
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios et al.
MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception
Mohammad Mahbubur Rahman, Ryoma Yataka, Sorachi Kato et al.
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
Xingwu Chen, Difan Zou
HypeBoy: Generative Self-Supervised Representation Learning on Hypergraphs
Sunwoo Kim, Shinhwan Kang, Fanchen Bu et al.
Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation
Ji-Jia Wu, Andy Chia-Hao Chang, Chieh-Yu Chuang et al.
Loose Inertial Poser: Motion Capture with IMU-attached Loose-Wear Jacket
Chengxu Zuo, Yiming Wang, Lishuang Zhan et al.
Hierarchical Patch Diffusion Models for High-Resolution Video Generation
Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin et al.
ChAda-ViT: Channel Adaptive Attention for Joint Representation Learning of Heterogeneous Microscopy Images
Nicolas Bourriez, Ihab Bendidi, Cohen Ethan et al.
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
Junyi Chen, Longteng Guo, Jia Sun et al.
Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting
Andrea Cini, Danilo Mandic, Cesare Alippi
FedFixer: Mitigating Heterogeneous Label Noise in Federated Learning
Xinyuan Ji, Zhaowei Zhu, Wei Xi et al.
ImageNet-OOD: Deciphering Modern Out-of-Distribution Detection Algorithms
William Yang, Byron Zhang, Olga Russakovsky
A General Framework for User-Guided Bayesian Optimization
Carl Hvarfner, Frank Hutter, Luigi Nardi
Regressor-Segmenter Mutual Prompt Learning for Crowd Counting
Mingyue Guo, Li Yuan, Zhaoyi Yan et al.
Neural Networks Learn Statistics of Increasing Complexity
Nora Belrose, Quintin Pope, Lucia Quirke et al.
Genuine Knowledge from Practice: Diffusion Test-Time Adaptation for Video Adverse Weather Removal
Yijun Yang, Hongtao Wu, Angelica I. Aviles-Rivero et al.
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
Zhangchen Ye, Tao Jiang, Chenfeng Xu et al.
Personalized LoRA for Human-Centered Text Understanding
You Zhang, Jin Wang, Liang-Chih Yu et al.
MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections
Jiayue Liu, Tang Xiao, Freeman Cheng et al.
CLiC: Concept Learning in Context
Mehdi Safaee, Aryan Mikaeili, Or Patashnik et al.
Learning to Predict Activity Progress by Self-Supervised Video Alignment
Gerard Donahue, Ehsan Elhamifar
ConR: Contrastive Regularizer for Deep Imbalanced Regression
Mahsa Keramati, Lili Meng, R. Evans
Upper Bounding Barlow Twins: A Novel Filter for Multi-Relational Clustering
Xiaowei Qian, Bingheng Li, Zhao Kang
GNNX-BENCH: Unravelling the Utility of Perturbation-based GNN Explainers through In-depth Benchmarking
Mert Kosan, Samidha Verma, Burouj Armgaan et al.
Viewpoint-Aware Visual Grounding in 3D Scenes
Xiangxi Shi, Zhonghua Wu, Stefan Lee
Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection
Songmin Dai, Yifan Wu, Xiaoqiang Li et al.
Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer
Hyeongjin Nam, Daniel Jung, Gyeongsik Moon et al.
Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks
Siyu Zou, Jiji Tang, Yiyi Zhou et al.
Gaussian Processes on Cellular Complexes
Mathieu Alain, So Takao, Brooks Paige et al.
Exploring the Promise and Limits of Real-Time Recurrent Learning
Kazuki Irie, Anand Gopalakrishnan, Jürgen Schmidhuber
You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation
Mehdi Noroozi, Isma Hadji, Brais Martinez et al.
Discovering modular solutions that generalize compositionally
Simon Schug, Seijin Kobayashi, Yassir Akram et al.
Conformal Autoregressive Generation: Beam Search with Coverage Guarantees
Nicolas Deutschmann, Marvin Alberts, María Rodríguez Martínez
TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding
Zhihao Zhang, Shengcao Cao, Yu-Xiong Wang
FlowTrack: Revisiting Optical Flow for Long-Range Dense Tracking
Seokju Cho, Gabriel Huang, Seungryong Kim et al.
TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression
Ho-Joong Kim, Jung-Ho Hong, Heejo Kong et al.
Symmetry Induces Structure and Constraint of Learning
Liu Ziyin
ExpCLIP: Bridging Text and Facial Expressions via Semantic Alignment
Yicheng Zhong, Huawei Wei, Peiji Yang et al.
Layer-wise linear mode connectivity
Linara Adilova, Maksym Andriushchenko, Michael Kamp et al.
StableMask: Refining Causal Masking in Decoder-only Transformer
Qingyu Yin, Xuzheng He, Xiang Zhuang et al.
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
Yunheng Li, Zhong-Yu Li, Quan-Sheng Zeng et al.
Homophily-Related: Adaptive Hybrid Graph Filter for Multi-View Graph Clustering
Zichen Wen, Yawen Ling, Yazhou Ren et al.
Towards Squeezing-Averse Virtual Try-On via Sequential Deformation
Sang-Heon Shim, Jiwoo Chung, Jae-Pil Heo
From Latent Graph to Latent Topology Inference: Differentiable Cell Complex Module
Claudio Battiloro, Indro Spinelli, Lev Telyatinkov et al.
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
Huy Nguyen, Pedram Akbarian, Nhat Ho
OED: Towards One-stage End-to-End Dynamic Scene Graph Generation
Guan Wang, Zhimin Li, Qingchao Chen et al.
RMem: Restricted Memory Banks Improve Video Object Segmentation
Junbao Zhou, Ziqi Pang, Yu-Xiong Wang
Nevis'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research
Jorg Bornschein, Alexandre Galashov, Ross Hemsley et al.
Customizing Language Model Responses with Contrastive In-Context Learning
Xiang Gao, Kamalika Das
Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering
Antoine Guedon, Vincent Lepetit
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
Kang You, Zekai Xu, Chen Nie et al.
Towards Open Domain Text-Driven Synthesis of Multi-Person Motions
Shan Mengyi, Lu Dong, Yutao Han et al.
NeRF-HuGS: Improved Neural Radiance Fields in Non-static Scenes Using Heuristics-Guided Segmentation
Jiahao Chen, Yipeng Qin, Lingjie Liu et al.
LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition
Youbing Hu, Yun Cheng, Anqi Lu et al.
Leveraging Partial Symmetry for Multi-Agent Reinforcement Learning
Xin Yu, Rongye Shi, Pu Feng et al.
A Graph-Based Approach for Category-Agnostic Pose Estimation
Or Hirschorn, Shai Avidan
MoMo: Momentum Models for Adaptive Learning Rates
Fabian Schaipp, Ruben Ohana, Michael Eickenberg et al.
Fast View Synthesis of Casual Videos with Soup-of-Planes
Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen et al.
Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation
Zikai Huang, Xuemiao Xu, Cheng Xu et al.
MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation
Minh-Quan Le, Tam Nguyen, Trung-Nghia Le et al.
Searching for High-Value Molecules Using Reinforcement Learning and Transformers
Raj Ghugare, Santiago Miret, Adriana Hugessen et al.
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
Zhenting Wang, Vikash Sehwag, Chen Chen et al.
Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection
Jiacheng Zhang, Jiaming Li, Xiangru Lin et al.