Most Cited 2024 "product logics" Papers
12,324 papers found • Page 11 of 62
Conference
CSL: Class-Agnostic Structure-Constrained Learning for Segmentation including the Unseen
Hao Zhang, Fang Li, Lu Qi et al.
A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing
Li Maomao, Yu Li, Tianyu Yang et al.
Faceptor: A Generalist Model for Face Perception
Lixiong Qin, Mei Wang, Xuannan Liu et al.
Accelerating Image Generation with Sub-path Linear Approximation Model
Chen Xu, Tianhui Song, Weixin Feng et al.
CO2: Efficient Distributed Training with Full Communication-Computation Overlap
Weigao Sun, Qin Zhen, Weixuan Sun et al.
VSFormer: Visual-Spatial Fusion Transformer for Correspondence Pruning
Tangfei Liao, Xiaoqin Zhang, Li Zhao et al.
DiffHuman: Probabilistic Photorealistic 3D Reconstruction of Humans
Akash Sengupta, Thiemo Alldieck, NIKOS KOLOTOUROS et al.
ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization
Yixin Yang, Jiangxin Dong, Jinhui Tang et al.
Large-Scale Multi-Hypotheses Cell Tracking Using Ultrametric Contours Maps
Jordao Bragantini, Merlin Lange, Loïc A Royer
Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
Brian DuSell, David Chiang
Dynamic Retraining-Updating Mean Teacher for Source-Free Object Detection
BA KHANH TRINH LE, Huy-Hung Nguyen, Long Hoang Pham et al.
VEON: Vocabulary-Enhanced Occupancy Prediction
Jilai Zheng, Pin Tang, Zhongdao Wang et al.
Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability
Ivan Lee, Nan Jiang, Taylor Berg-Kirkpatrick
ScanFormer: Referring Expression Comprehension by Iteratively Scanning
Wei Su, Peihan Miao, Huanzhang Dou et al.
Efficient Vision-Language Pre-training by Cluster Masking
Zihao Wei, Zixuan Pan, Andrew Owens
One-Shot Structure-Aware Stylized Image Synthesis
Hansam Cho, Jonghyun Lee, Seunggyu Chang et al.
Towards Robust Fidelity for Evaluating Explainability of Graph Neural Networks
Xu Zheng, Farhad Shirani, Tianchun Wang et al.
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni, Yulin Wang, Renping Zhou et al.
Kandinsky Conformal Prediction: Efficient Calibration of Image Segmentation Algorithms
Joren Brunekreef, Eric Marcus, Ray Sheombarsing et al.
Combating Data Imbalances in Federated Semi-supervised Learning with Dual Regulators
Sikai Bai, Shuaicheng Li, Weiming Zhuang et al.
UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection
Yingsen Zeng, Yujie Zhong, Chengjian Feng et al.
DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding
Jincen Jiang, Qianyu Zhou, Yuhang Li et al.
Compositional Generative Inverse Design
Tailin Wu, Takashi Maruyama, Long Wei et al.
Bidirectional Autoregessive Diffusion Model for Dance Generation
Canyu Zhang, Youbao Tang, NING Zhang et al.
Predicated Diffusion: Predicate Logic-Based Attention Guidance for Text-to-Image Diffusion Models
Kota Sueyoshi, Takashi Matsubara
Diffusion Bridges for 3D Point Cloud Denoising
Mathias Vogel, Keisuke Tateno, Marc Pollefeys et al.
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective
Wenqi Jia, Miao Liu, Hao Jiang et al.
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke, Bert De Brabandere
Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation
Xiaoyang Chen, Hao Zheng, Yuemeng LI et al.
Learning Camouflaged Object Detection from Noisy Pseudo Label
Jin Zhang, Ruiheng Zhang, Yanjiao Shi et al.
The Need for Speed: Pruning Transformers with One Recipe
Samir Khaki, Konstantinos Plataniotis
Slice3D: Multi-Slice Occlusion-Revealing Single View 3D Reconstruction
Yizhi Wang, Wallace Lira, Wenqi Wang et al.
Sync-NeRF: Generalizing Dynamic NeRFs to Unsynchronized Videos
Seoha Kim, Jeongmin Bae, Youngsik Yun et al.
DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
Junwen Xiong, Peng Zhang, Tao You et al.
TetraDiffusion: Tetrahedral Diffusion Models for 3D Shape Generation
Nikolai Kalischek, Torben Peters, Jan Dirk Wegner et al.
Pre-training with Random Orthogonal Projection Image Modeling
Maryam Haghighat, Peyman Moghadam, Shaheer Mohamed et al.
Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning
Fan-Ming Luo, Tian Xu, Xingchen Cao et al.
Tri-Modal Motion Retrieval by Learning a Joint Embedding Space
Kangning Yin, Shihao Zou, Yuxuan Ge et al.
Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction
Jianping Jiang, xinyu zhou, Bingxuan Wang et al.
NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation
Ruikai Cui, Weizhe Liu, Weixuan Sun et al.
MoVideo: Motion-Aware Video Generation with Diffusion Models
Jingyun Liang, Yuchen Fan, Kai Zhang et al.
Vision-Language Action Knowledge Learning for Semantic-Aware Action Quality Assessment
Huangbiao Xu, Xiao Ke, Yuezhou Li et al.
Bootstrapping Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models
Weiwei Cao, Jianpeng Zhang, Yingda Xia et al.
CoBIT: A Contrastive Bi-directional Image-Text Generation Model
Haoxuan You, Xiaoyue Guo, Zhecan Wang et al.
Finding Visual Task Vectors
Alberto Hojel, Yutong Bai, Trevor Darrell et al.
Customization Assistant for Text-to-Image Generation
Yufan Zhou, Ruiyi Zhang, Jiuxiang Gu et al.
Improving Text-guided Object Inpainting with Semantic Pre-inpainting
Yifu Chen, Jingwen Chen, Yingwei Pan et al.
Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations
Renzhe Zhou, Chen-Xiao Gao, Zongzhang Zhang et al.
Non-Exemplar Domain Incremental Learning via Cross-Domain Concept Integration
Qiang Wang, Yuhang He, Songlin Dong et al.
Learning to Learn Better Visual Prompts
Fengxiang Wang, Wanrong Huang, Shaowu Yang et al.
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
Shihao Zhao, Shaozhe Hao, Bojia Zi et al.
Event Camera Data Dense Pre-training
Yan Yang, Liyuan Pan, Liu liu
The Effect of Intrinsic Dataset Properties on Generalization: Unraveling Learning Differences Between Natural and Medical Images
Nicholas Konz, Maciej Mazurowski
AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han, Qifan Wang, Sohail A Dianat et al.
ProTeCt: Prompt Tuning for Taxonomic Open Set Classification
Tz-Ying Wu, Chih-Hui Ho, Nuno Vasconcelos
MERGE: Fast Private Text Generation
Zi Liang, Pinghui Wang, Ruofei Zhang et al.
Spiking NeRF: Representing the Real-World Geometry by a Discontinuous Representation
Zhanfeng Liao, Yan Liu, Qian Zheng et al.
EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification
Suorong Yang, Furao Shen, Jian Zhao
Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection
QIJIE MO, Yipeng Gao, Shenghao Fu et al.
Can We Evaluate Domain Adaptation Models Without Target-Domain Labels?
JIANFEI YANG, Hanjie Qian, Yuecong Xu et al.
Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments
Djamahl Etchegaray, Zi Helen Huang, Tatsuya Harada et al.
MFABA: A More Faithful and Accelerated Boundary-Based Attribution Method for Deep Neural Networks
Zhiyu Zhu, Huaming Chen, Jiayu Zhang et al.
Deep Structural Knowledge Exploitation and Synergy for Estimating Node Importance Value on Heterogeneous Information Networks
Yankai Chen, Yixiang Fang, Qiongyan Wang et al.
RoDUS: Robust Decomposition of Static and Dynamic Elements in Urban Scenes
Thang-Anh-Quan Nguyen, Luis G Roldao Jimenez, Nathan Piasco et al.
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
Yunhao Ge, Yihe Tang, Jiashu Xu et al.
Referring Atomic Video Action Recognition
Kunyu Peng, Jia Fu, Kailun Yang et al.
CAMIL: Context-Aware Multiple Instance Learning for Cancer Detection and Subtyping in Whole Slide Images
olga fourkioti, Matt De Vries, Chris Bakal
SURE: SUrvey REcipes for building reliable and robust deep networks
Yuting Li, Yingyi Chen, Xuanlong Yu et al.
A Good Learner can Teach Better: Teacher-Student Collaborative Knowledge Distillation
Ayan Sengupta, Shantanu Dixit, Md Shad Akhtar et al.
Hyperbolic Learning with Synthetic Captions for Open-World Detection
Fanjie Kong, Yanbei Chen, Jiarui Cai et al.
Binarized Low-light Raw Video Enhancement
Gengchen Zhang, Yulun Zhang, Xin Yuan et al.
Learning Instance-Aware Correspondences for Robust Multi-Instance Point Cloud Registration in Cluttered Scenes
Zhiyuan Yu, Zheng Qin, lintao zheng et al.
LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation
Archana Swaminathan, Anubhav Anubhav, Kamal Gupta et al.
Iterative Ensemble Training with Anti-Gradient Control for Mitigating Memorization in Diffusion Models
Xiao Liu, Xiaoliu Guan, Yu Wu et al.
On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving
Kaituo Feng, Changsheng Li, Dongchun Ren et al.
PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs
Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek et al.
3426 Regulating Intermediate 3D Features for Vision-Centric Autonomous Driving
Junkai Xu, Liang Peng, Haoran Cheng et al.
Free-Editor: Zero-shot Text-driven 3D Scene Editing
Md Nazmul Karim, Hasan Iqbal, Umar Khalid et al.
FoSp: Focus and Separation Network for Early Smoke Segmentation
Lujian Yao, Haitao Zhao, Jingchao Peng et al.
FoX: Formation-Aware Exploration in Multi-Agent Reinforcement Learning
Yonghyeon Jo, Sunwoo Lee, Junghyuk Yum et al.
Image Demoireing in RAW and sRGB Domains
Shuning Xu, Binbin Song, Xiangyu Chen et al.
NeRF Analogies: Example-Based Visual Attribute Transfer for NeRFs
Michael Fischer, Zhengqin Li, Thu Nguyen-Phuoc et al.
In2SET: Intra-Inter Similarity Exploiting Transformer for Dual-Camera Compressive Hyperspectral Imaging
Xin Wang, Lizhi Wang, Xiangtian Ma et al.
TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation
Xiaopei Wu, Yuenan Hou, Xiaoshui Huang et al.
En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data
Yifang Men, Biwen Lei, Yuan Yao et al.
Multi-Objective Bayesian Optimization with Active Preference Learning
Ryota Ozaki, Kazuki Ishikawa, Youhei Kanzaki et al.
Online GNN Evaluation Under Test-time Graph Distribution Shifts
Xin Zheng, Dongjin Song, Qingsong Wen et al.
Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks
Zhiying Jiang, Xingyuan Li, Jinyuan Liu et al.
Protein Multimer Structure Prediction via Prompt Learning
Ziqi Gao, Xiangguo SUN, Zijing Liu et al.
CNN Kernels Can Be the Best Shapelets
Eric Qu, Yansen Wang, Xufang Luo et al.
ProCC: Progressive Cross-Primitive Compatibility for Open-World Compositional Zero-Shot Learning
Fushuo Huo, Wenchao Xu, Song Guo et al.
Regroup Median Loss for Combating Label Noise
Authors: Fengpeng Li, Kemou Li, Jinyu Tian et al.
Unifying Automatic and Interactive Matting with Pretrained ViTs
Zixuan Ye, Wenze Liu, He Guo et al.
CDPNet: Cross-Modal Dual Phases Network for Point Cloud Completion
Zhenjiang Du, Jiale Dou, Zhitao Liu et al.
ReCoRe: Regularized Contrastive Representation Learning of World Model
Rudra P, K. Poudel, Harit Pandya et al.
CLIFF: Continual Latent Diffusion for Open-Vocabulary Object Detection
Wuyang Li, Xinyu Liu, Jiayi Ma et al.
Reinforcement Learning Meets Visual Odometry
Nico Messikommer, Giovanni Cioffi, Mathias Gehrig et al.
UniCode : Learning a Unified Codebook for Multimodal Large Language Models
Sipeng Zheng, Bohan Zhou, Yicheng Feng et al.
HUMOS: Human Motion Model Conditioned on Body Shape
Shashank Tripathi, Omid Taheri, Christoph Lassner et al.
Retrieval-Guided Reinforcement Learning for Boolean Circuit Minimization
Animesh Basak Chowdhury, Marco Romanelli, Benjamin Tan et al.
Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning
Rui Zhao, Bin Shi, Jianfei Ruan et al.
HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions
Hao Xu, Li Haipeng, Yinqiao Wang et al.
Neural Volumetric World Models for Autonomous Driving
Zanming Huang, Jimuyang Zhang, Eshed Ohn-Bar
Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning
Kaibin Tian, Yanhua Cheng, Yi Liu et al.
Robust Distillation via Untargeted and Targeted Intermediate Adversarial Samples
Junhao Dong, Piotr Koniusz, Junxi Chen et al.
Region-Based Representations Revisited
Michal Shlapentokh-Rothman, Ansel Blume, Yao Xiao et al.
A Restoration Network as an Implicit Prior
Yuyang Hu, Mauricio Delbracio, Peyman Milanfar et al.
Long-term Temporal Context Gathering for Neural Video Compression
Linfeng Qi, Zhaoyang Jia, Jiahao Li et al.
Foster Adaptivity and Balance in Learning with Noisy Labels
Mengmeng Sheng, Zeren Sun, Tao Chen et al.
Towards Fair Graph Federated Learning via Incentive Mechanisms
12794 Chenglu Pan, Jiarong Xu, Yue Yu et al.
Free-VSC: Free Semantics from Visual Foundation Models for Unsupervised Video Semantic Compression
Yuan Tian, Guo Lu, Guangtao Zhai
EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head
Qianyun He, Xinya Ji, Yicheng Gong et al.
HEAL-SWIN: A Vision Transformer On The Sphere
Oscar Carlsson, Jan E. Gerken, Hampus Linander et al.
GenN2N: Generative NeRF2NeRF Translation
Xiangyue Liu, Han Xue, Kunming Luo et al.
PointNeRF++: A multi-scale, point-based Neural Radiance Field
Weiwei Sun, Eduard Trulls, Yang-Che Tseng et al.
Neural-Symbolic Recursive Machine for Systematic Generalization
Qing Li, Yixin Zhu, Yitao Liang et al.
One Forward is Enough for Neural Network Training via Likelihood Ratio Method
Jinyang Jiang, Zeliang Zhang, Chenliang Xu et al.
Norface: Improving Facial Expression Analysis by Identity Normalization
Hanwei Liu, Rudong An, Zhimeng Zhang et al.
What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs
Alex Trevithick, Matthew Chan, Towaki Takikawa et al.
Adversarial Backdoor Attack by Naturalistic Data Poisoning on Trajectory Prediction in Autonomous Driving
Mozhgan Pourkeshavarz, Mohammad Sabokrou, Amir Rasouli
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction
Penghui Du, Yu Wang, Yifan Sun et al.
SPADE: Semi-supervised Anomaly Detection under Distribution Mismatch
Chun-Liang Li, Tomas Pfister, Kihyuk Sohn et al.
FineMatch: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction
Hang Hua, Jing Shi, Kushal Kafle et al.
Event-Adapted Video Super-Resolution
Zeyu Xiao, Dachun Kai, Yueyi Zhang et al.
Dynamic Feature Pruning and Consolidation for Occluded Person Re-identification
YuTeng Ye, Hang Zhou, Jiale Cai et al.
Learning Explicit Contact for Implicit Reconstruction of Hand-Held Objects from Monocular Images
Junxing Hu, Hongwen Zhang, Zerui Chen et al.
Reinforcement Learning Friendly Vision-Language Model for Minecraft
Haobin Jiang, Junpeng Yue, Hao Luo et al.
Tri^{2}-plane: Thinking Head Avatar via Feature Pyramid
Luchuan Song, Pinxin Liu, Lele Chen et al.
X-Pose: Detecting Any Keypoints
Jie Yang, AILING ZENG, Ruimao Zhang et al.
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer
Ning Yu, Chia-Chih Chen, Zeyuan Chen et al.
Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching
Rui Gong, Weide Liu, ZAIWANG GU et al.
DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model
Zhenghao Pan, Haijin Zeng, Jiezhang Cao et al.
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
Guian Fang, Wenbiao Yan, Yuanfan Guo et al.
M2Doc: A Multi-Modal Fusion Approach for Document Layout Analysis
Ning Zhang, Hiuyi Cheng, Jiayu Chen et al.
Accelerating Data Generation for Neural Operators via Krylov Subspace Recycling
Hong Wang, Zhongkai Hao, Jie Wang et al.
AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval
Qi Yan, Raihan Seraj, Jiawei He et al.
Exploiting Auxiliary Caption for Video Grounding
Hongxiang Li, Meng Cao, Xuxin Cheng et al.
HONGAT: Graph Attention Networks in the Presence of High-Order Neighbors
Heng-Kai Zhang, Yi-Ge Zhang, Zhi Zhou et al.
Temporal Event Stereo via Joint Learning with Stereoscopic Flow
Hoonhee Cho, Jae-young Kang, Kuk-Jin Yoon
GRIDS: Grouped Multiple-Degradation Restoration with Image Degradation Similarity
Shuo Cao, Yihao Liu, Wenlong Zhang et al.
Rating-Based Reinforcement Learning
Devin White, Mingkang Wu, Ellen Novoseller et al.
Differentiable Euler Characteristic Transforms for Shape Classification
Ernst Roell, Bastian Rieck
Brain-ID: Learning Contrast-agnostic Anatomical Representations for Brain Imaging
Peirong Liu, Oula Puonti, Xiaoling Hu et al.
Compound Text-Guided Prompt Tuning via Image-Adaptive Cues
Hao Tan, Jun Li, Yizhuang Zhou et al.
ScanTalk: 3D Talking Heads from Unregistered Scans
Federico Nocentini, Thomas Besnier, Claudio Ferrari et al.
UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding
Bowen Shi, Peisen Zhao, Zichen Wang et al.
3DGazeNet: Generalizing Gaze Estimation with Weak Supervision from Synthetic Views
Evangelos Ververas, Polydefkis Gkagkos, Jiankang Deng et al.
Kalman-Inspired Feature Propagation for Video Face Super-Resolution
Ruicheng Feng, Chongyi Li, Chen Change Loy
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
Peng Jin, Hao Li, Zesen Cheng et al.
Rephrase, Augment, Reason: Visual Grounding of Questions for Vision-Language Models
Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal
DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation
Yueru Luo, Shuguang Cui, Zhen Li
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention
Chenhang He, Ruihuang Li, Guowen Zhang et al.
InstructGIE: Towards Generalizable Image Editing
Zichong Meng, Changdi Yang, Jun Liu et al.
Harnessing Text-to-Image Diffusion Models for Category-Agnostic Pose Estimation
Duo Peng, Zhengbo Zhang, Ping Hu et al.
Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL
Fangwei Zhong, Kui Wu, Hai Ci et al.
Fed-QSSL: A Framework for Personalized Federated Learning under Bitwidth and Data Heterogeneity
Yiyue Chen, Haris Vikalo, Chianing Wang
GDA: Generalized Diffusion for Robust Test-time Adaptation
Yun-Yun Tsai, Fu-Chen Chen, Albert Chen et al.
Distilling Reliable Knowledge for Instance-Dependent Partial Label Learning
Dong-Dong Wu, Deng-Bao Wang, Min-Ling Zhang
USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation
Xiaoqi Wang, Wenbin He, Xiwei Xuan et al.
Boosting 3D Single Object Tracking with 2D Matching Distillation and 3D Pre-training
qiangqiang wu, Yan Xia, Jia Wan et al.
STAMP: Outlier-Aware Test-Time Adaptation with Stable Memory Replay
Yu Yongcan, Lijun Sheng, Ran He et al.
ISP-Teacher: Image Signal Process with Disentanglement Regularization for Unsupervised Domain Adaptive Dark Object Detection
Yin Zhang, Yongqiang Zhang, Zian Zhang et al.
Lift3D: Zero-Shot Lifting of Any 2D Vision Model to 3D
Mukund Varma T, Peihao Wang, Zhiwen Fan et al.
DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation
Xiaobin Hu, Xu Peng, Donghao Luo et al.
Full Bayesian Significance Testing via Neural Networks
Zehua Liu, Zimeng Li, Jingyuan Wang et al.
Semi-supervised Segmentation of Histopathology Images with Noise-Aware Topological Consistency
Meilong Xu, Xiaoling Hu, Saumya Gupta et al.
DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion
Liao Shen, Tianqi Liu, Huiqiang Sun et al.
CardiacNet: Learning to Reconstruct Abnormalities for Cardiac Disease Assessment from Echocardiogram Videos
JIEWEN YANG, Yiqun Lin, Bin Pu et al.
Light Schrödinger Bridge
Alexander Korotin, Nikita Gushchin, Evgeny Burnaev
RoDLA: Benchmarking the Robustness of Document Layout Analysis Models
Yufan Chen, Jiaming Zhang, Kunyu Peng et al.
Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance
Reyhane Askari Hemmat, Melissa Hall, Alicia Yi Sun et al.
MultiDelete for Multimodal Machine Unlearning
Jiali Cheng, Hadi Amiri
Federated Causality Learning with Explainable Adaptive Optimization
Dezhi Yang, Xintong He, Jun Wang et al.
Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation
Xinliang Zhang, Lei Zhu, Hangzhou He et al.
Rethinking the Paradigm of Content Constraints in Unpaired Image-to-Image Translation
Xiuding Cai, Yaoyao Zhu, Dong Miao et al.
FashionERN: Enhance-and-Refine Network for Composed Fashion Image Retrieval
Yanzhe Chen, Huasong Zhong, Xiangteng He et al.
MCPNet: An Interpretable Classifier via Multi-Level Concept Prototypes
Bor Shiun Wang, Chien-Yi Wang, Wei-Chen Chiu
UniHuman: A Unified Model For Editing Human Images in the Wild
Nannan Li, Qing Liu, Krishna Kumar Singh et al.
Adaptive Compressed Sensing with Diffusion-Based Posterior Sampling
Noam Elata, Tomer Michaeli, Michael Elad
Multi-Scene Generalized Trajectory Global Graph Solver with Composite Nodes for Multiple Object Tracking
Yan Gao, Haojun Xu, Jie Li et al.
Dual-Window Multiscale Transformer for Hyperspectral Snapshot Compressive Imaging
Fulin Luo, Xi Chen, Xiuwen Gong et al.
OTSeg: Multi-prompt Sinkhorn Attention for Zero-Shot Semantic Segmentation
Kwanyoung Kim, Yujin Oh, Jong Chul Ye
Partial-to-Partial Shape Matching with Geometric Consistency
Viktoria Ehm, Maolin Gao, Paul Roetzer et al.
BENO: Boundary-embedded Neural Operators for Elliptic PDEs
Haixin Wang, Jiaxin Li, Anubhav Dwivedi et al.
Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems
Juno Kim, Kakei Yamamoto, Kazusato Oko et al.
Unsupervised Cross-Domain Image Retrieval via Prototypical Optimal Transport
Bin Li, Ye Shi, Qian Yu et al.
Robust Multimodal Learning via Representation Decoupling
Shicai Wei, Yang Luo, Yuji Wang et al.
3D Multi-frame Fusion for Video Stabilization
Zhan Peng, Xinyi Ye, Weiyue Zhao et al.
Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks
Mingqing Xiao, Qingyan Meng, Zongpeng Zhang et al.
Single-View Scene Point Cloud Human Grasp Generation
Yan-Kang Wang, Chengyi Xing, Yi-Lin Wei et al.
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang, Jiajun Deng, Mingbo Jia
Self-Guided Generation of Minority Samples Using Diffusion Models
Soobin Um, Jong Chul Ye
Task-Driven Causal Feature Distillation: Towards Trustworthy Risk Prediction
Zhixuan Chu, Mengxuan Hu, Qing Cui et al.
Fine-Grained Knowledge Selection and Restoration for Non-exemplar Class Incremental Learning
Authors: Jiang-Tian Zhai, Xialei Liu, Lu Yu et al.
3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation
Zihao Xiao, Longlong Jing, Shangxuan Wu et al.
Move Anything with Layered Scene Diffusion
Jiawei Ren, Mengmeng Xu, Jui-Chieh Wu et al.
ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments
Taewoong Kim, Cheolhong Min, Byeonghwi Kim et al.
SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric Positive Definite Space
Yunchen Li, Zhou Yu, Gaoqi He et al.
IMMA: Immunizing text-to-image Models against Malicious Adaptation
Amber Yijia Zheng, Raymond Yeh
How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?
Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain et al.