Most Cited 2024 "boltzmann influence functions" Papers
12,324 papers found • Page 11 of 62
Conference
Revealing the Proximate Long-Tail Distribution in Compositional Zero-Shot Learning
Chenyi Jiang, Haofeng Zhang
Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold
Jun Chen, Haishan Ye, Mengmeng Wang et al.
TCLC-GS: Tightly Coupled LiDAR-Camera Gaussian Splatting for Autonomous Driving
Cheng Zhao, su sun, Ruoyu Wang et al.
CADTalk: An Algorithm and Benchmark for Semantic Commenting of CAD Programs
Haocheng Yuan, Jing Xu, Hao Pan et al.
CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis
Xiaoxiao Sun, Xingjian Leng, Zijian Wang et al.
Osmosis: RGBD Diffusion Prior for Underwater Image Restoration
Opher Bar Nathan, Deborah Steinberger-Levy, Tali Treibitz et al.
CORN: Contact-based Object Representation for Nonprehensile Manipulation of General Unseen Objects
Yoonyoung Cho, Junhyek Han, Yoontae Cho et al.
TransCAD: A Hierarchical Transformer for CAD Sequence Inference from Point Clouds
Dupont Elona, Kseniya Cherenkova, Dimitrios Mallis et al.
Mirage: Model-agnostic Graph Distillation for Graph Classification
Mridul Gupta, Sahil Manchanda, HARIPRASAD KODAMANA et al.
Transformer-Based Selective Super-resolution for Efficient Image Refinement
Tianyi Zhang, Kishore Kasichainula, Yaoxin Zhuo et al.
Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models
Matthew Kowal, Richard P. Wildes, Kosta Derpanis
MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes
Casper van Engelenburg, Fatemeh Mostafavi, Emanuel Kuhn et al.
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective
Ming Zhong, Chenxin An, Weizhu Chen et al.
Semi-supervised Active Learning for Video Action Detection
Ayush Singh, Aayush J Rana, Akash Kumar et al.
Grounded Object-Centric Learning
Avinash Kori, Francesco Locatello, Fabio De Sousa Ribeiro et al.
Learning Hierarchical Image Segmentation For Recognition and By Recognition
Tsung-Wei Ke, Sangwoo Mo, Stella Yu
Unleashing the Potential of Fractional Calculus in Graph Neural Networks with FROND
Qiyu Kang, Kai Zhao, Qinxu Ding et al.
Align Before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition
Yifei Chen, Dapeng Chen, Ruijin Liu et al.
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis
Hanrong Ye, Jason Wen Yong Kuen, Qing Liu et al.
C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction
Yiqun Lin, Jiewen Yang, hualiang wang et al.
Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses
Inhee Lee, Byungjun Kim, Hanbyul Joo
Iterated Learning Improves Compositionality in Large Vision-Language Models
Chenhao Zheng, Jieyu Zhang, Aniruddha Kembhavi et al.
Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling
Baoquan Zhang, Huaibin Wang, Luo Chuyao et al.
Reinforcement Learning Friendly Vision-Language Model for Minecraft
Haobin Jiang, Junpeng Yue, Hao Luo et al.
Tri^{2}-plane: Thinking Head Avatar via Feature Pyramid
Luchuan Song, Pinxin Liu, Lele Chen et al.
Cyclic Learning for Binaural Audio Generation and Localization
Zhaojian Li, Bin Zhao, Yuan Yuan
ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization
Yixin Yang, Jiangxin Dong, Jinhui Tang et al.
CO2: Efficient Distributed Training with Full Communication-Computation Overlap
Weigao Sun, Qin Zhen, Weixuan Sun et al.
SAFNet: Selective Alignment Fusion Network for Efficient HDR Imaging
Lingtong Kong, Bo Li, Yike Xiong et al.
What How and When Should Object Detectors Update in Continually Changing Test Domains?
Jayeon Yoo, Dongkwan Lee, Inseop Chung et al.
Gaussian Shadow Casting for Neural Characters
Luis Bolanos, Shih-Yang Su, Helge Rhodin
Multiple View Geometry Transformers for 3D Human Pose Estimation
Ziwei Liao, jialiang zhu, Chunyu Wang et al.
History Matters: Temporal Knowledge Editing in Large Language Model
Xunjian Yin, Jin Jiang, Liming Yang et al.
Learning Camouflaged Object Detection from Noisy Pseudo Label
Jin Zhang, Ruiheng Zhang, Yanjiao Shi et al.
Adversarial Score Distillation: When score distillation meets GAN
Min Wei, Jingkai Zhou, Junyao Sun et al.
GenesisTex: Adapting Image Denoising Diffusion to Texture Space
Chenjian Gao, Boyan Jiang, Xinghui Li et al.
Signed Graph Neural Ordinary Differential Equation for Modeling Continuous-Time Dynamics
Lanlan Chen, Kai Wu, Jian Lou et al.
Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
Brian DuSell, David Chiang
Binarized Low-light Raw Video Enhancement
Gengchen Zhang, Yulun Zhang, Xin Yuan et al.
VEON: Vocabulary-Enhanced Occupancy Prediction
Jilai Zheng, Pin Tang, Zhongdao Wang et al.
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke, Bert De Brabandere
Accelerating Image Generation with Sub-path Linear Approximation Model
Chen Xu, Tianhui Song, Weixin Feng et al.
MeshSegmenter: Zero-Shot Mesh Segmentation via Texture Synthesis
ziming zhong, Yanyu Xu, Jing Li et al.
Towards Robust Fidelity for Evaluating Explainability of Graph Neural Networks
Xu Zheng, Farhad Shirani, Tianchun Wang et al.
PCE-Palm: Palm Crease Energy Based Two-Stage Realistic Pseudo-Palmprint Generation
Lei Shen, Jianlong Jin, Ruixin Zhang et al.
CSL: Class-Agnostic Structure-Constrained Learning for Segmentation including the Unseen
Hao Zhang, Fang Li, Lu Qi et al.
Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI
Chong Wang, Lanqing Guo, Yufei Wang et al.
OmniMotionGPT: Animal Motion Generation with Limited Data
Zhangsihao Yang, Mingyuan Zhou, Mengyi Shan et al.
BaCon: Boosting Imbalanced Semi-supervised Learning via Balanced Feature-Level Contrastive Learning
Qianhan Feng, Lujing Xie, Shijie Fang et al.
VSFormer: Visual-Spatial Fusion Transformer for Correspondence Pruning
Tangfei Liao, Xiaoqin Zhang, Li Zhao et al.
The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
Qinyu Zhao, Ming Xu, Kartik Gupta et al.
Compositional Generative Inverse Design
Tailin Wu, Takashi Maruyama, Long Wei et al.
GenN2N: Generative NeRF2NeRF Translation
Xiangyue Liu, Han Xue, Kunming Luo et al.
LookupViT: Compressing visual information to a limited number of tokens
Rajat Koner, Gagan Jain, Sujoy Paul et al.
ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis
Xiangjun Gao, Xiaoyu Li, Chaopeng Zhang et al.
Bridging the Synthetic-to-Authentic Gap: Distortion-Guided Unsupervised Domain Adaptation for Blind Image Quality Assessment
Aobo Li, Jinjian Wu, Yongxu Liu et al.
LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation
Ruida Zhang, Ziqin Huang, Gu Wang et al.
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao, Kunyu Shi, Pengkai Zhu et al.
DiffHuman: Probabilistic Photorealistic 3D Reconstruction of Humans
Akash Sengupta, Thiemo Alldieck, NIKOS KOLOTOUROS et al.
ScanFormer: Referring Expression Comprehension by Iteratively Scanning
Wei Su, Peihan Miao, Huanzhang Dou et al.
One-Shot Structure-Aware Stylized Image Synthesis
Hansam Cho, Jonghyun Lee, Seunggyu Chang et al.
Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments
Liyuan Zhu, Shengyu Huang, Konrad Schindler et al.
Tensorized Label Learning on Anchor Graph
Jing Li, Quanxue Gao, Qianqian Wang et al.
Enhancing Source-Free Domain Adaptive Object Detection with Low-confidence Pseudo Label Distillation
Ilhoon Yoon, Hyeongjun Kwon, Jin Kim et al.
Kill Two Birds with One Stone: Rethinking Data Augmentation for Deep Long-tailed Learning
Binwu Wang, Pengkun Wang, Wei Xu et al.
NeRF Director: Revisiting View Selection in Neural Volume Rendering
Wenhui Xiao, Rodrigo Santa Cruz, David Ahmedt-Aristizabal et al.
DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding
Jincen Jiang, Qianyu Zhou, Yuhang Li et al.
UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection
Yingsen Zeng, Yujie Zhong, Chengjian Feng et al.
CardiacNet: Learning to Reconstruct Abnormalities for Cardiac Disease Assessment from Echocardiogram Videos
JIEWEN YANG, Yiqun Lin, Bin Pu et al.
Animal Avatars: Reconstructing Animatable 3D Animals from Casual Videos
Remy Sabathier, David Novotny, Niloy Mitra
Adversarial Training Should Be Cast as a Non-Zero-Sum Game
Alex Robey, Fabian Latorre, George Pappas et al.
Learning Optimal Advantage from Preferences and Mistaking It for Reward
W Bradley Knox, Stephane Hatgis-Kessell, Sigurdur Orn Adalgeirsson et al.
Get an A in Math: Progressive Rectification Prompting
Zhenyu Wu, Meng Jiang, Chao Shen
Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
Xiyi Chen, Marko Mihajlovic, Shaofei Wang et al.
OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation
Ganlong Zhao, Guanbin Li, Weikai Chen et al.
Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation
Qiushi Zhu, Jie Zhang, Yu Gu et al.
Hybrid Proposal Refiner: Revisiting DETR Series from the Faster R-CNN Perspective
Jinjing Zhao, Fangyun Wei, Chang Xu
Table of Contents
Pengfei Hu, Zhenrong Zhang, Jianshu Zhang et al.
Quad Bayer Joint Demosaicing and Denoising Based on Dual Encoder Network with Joint Residual Learning
Bolun Zheng, Li Haoran, Quan Chen et al.
Diffusion Bridges for 3D Point Cloud Denoising
Mathias Vogel, Keisuke Tateno, Marc Pollefeys et al.
Negative Pre-aware for Noisy Cross-Modal Matching
Xu Zhang, Hao Li, Mang Ye
FRDiff : Feature Reuse for Universal Training-free Acceleration of Diffusion Models
Junhyuk So, Jungwon Lee, Eunhyeok Park
Tri-Modal Motion Retrieval by Learning a Joint Embedding Space
Kangning Yin, Shihao Zou, Yuxuan Ge et al.
TeMO: Towards Text-Driven 3D Stylization for Multi-Object Meshes
Xuying Zhang, Bo-Wen Yin, yuming chen et al.
Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation
Ruicong Liu, Takehiko Ohkawa, Mingfang Zhang et al.
Kandinsky Conformal Prediction: Efficient Calibration of Image Segmentation Algorithms
Joren Brunekreef, Eric Marcus, Ray Sheombarsing et al.
Learning MDL Logic Programs from Noisy Data
Céline Hocquette, Andreas Niskanen, Matti Järvisalo et al.
Learning to Visually Localize Sound Sources from Mixtures without Prior Source Knowledge
Dongjin Kim, Sung Jin Um, Sangmin Lee et al.
Improving Spectral Snapshot Reconstruction with Spectral-Spatial Rectification
Jiancheng Zhang, Haijin Zeng, Yongyong Chen et al.
HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High-and Low-Frequency Information of Parametric Models
Yifan Yang, Dong Liu, Shuhai Zhang et al.
Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models
Gianni Franchi, Olivier Laurent, Maxence Leguéry et al.
A Noisy Elephant in the Room: Is Your Out-of-Distribution Detector Robust to Label Noise?
Galadrielle Humblot-Renaux, Sergio Escalera, Thomas B. Moeslund
A Simple Background Augmentation Method for Object Detection with Diffusion Model
YUHANG LI, Xin Dong, Chen Chen et al.
Multimarginal Generative Modeling with Stochastic Interpolants
Michael Albergo, Nicholas Boffi, Michael Lindsey et al.
Efficient Vision-Language Pre-training by Cluster Masking
Zihao Wei, Zixuan Pan, Andrew Owens
Semi-supervised Open-World Object Detection
Sahal Shaji Mullappilly, Abhishek Singh Gehlot, Rao Muhammad Anwer et al.
Open Panoramic Segmentation
Junwei Zheng, Ruiping Liu, Yufan Chen et al.
ColNeRF: Collaboration for Generalizable Sparse Input Neural Radiance Field
Zhangkai Ni, Peiqi Yang, Wenhan Yang et al.
Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation
Ruoyu Wang, Yongqi Yang, Zhihao Qian et al.
AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation
Yangchao Wu, Tian Yu Liu, Hyoungseob Park et al.
SOAC: Spatio-Temporal Overlap-Aware Multi-Sensor Calibration using Neural Radiance Fields
Quentin HERAU, Nathan Piasco, Moussab Bennehar et al.
Are Human-generated Demonstrations Necessary for In-context Learning?
Rui Li, Guoyin Wang, Jiwei Li
Hypergraph Joint Representation Learning for Hypervertices and Hyperedges via Cross Expansion
Yuguang Yan, Yuanlin Chen, Shibo Wang et al.
Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation
Xiaoyang Chen, Hao Zheng, Yuemeng LI et al.
Bidirectional Autoregessive Diffusion Model for Dance Generation
Canyu Zhang, Youbao Tang, NING Zhang et al.
Predicated Diffusion: Predicate Logic-Based Attention Guidance for Text-to-Image Diffusion Models
Kota Sueyoshi, Takashi Matsubara
Instant 3D Human Avatar Generation using Image Diffusion Models
Nikos Kolotouros, Thiemo Alldieck, Enric Corona et al.
PixOOD: Pixel-Level Out-of-Distribution Detection
Tomas Vojir, Jan Sochman, Jiri Matas
Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI
Nabeel Seedat, Fergus Imrie, Mihaela van der Schaar
Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Models
Hao Cheng, Erjia Xiao, Jindong Gu et al.
Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction
Jianping Jiang, xinyu zhou, Bingxuan Wang et al.
Instance-Aware Group Quantization for Vision Transformers
Jaehyeon Moon, Dohyung Kim, Jun Yong Cheon et al.
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective
Wenqi Jia, Miao Liu, Hao Jiang et al.
Adapters Strike Back
Jan-Martin Steitz, Stefan Roth
ConsistDreamer: 3D-Consistent 2D Diffusion for High-Fidelity Scene Editing
Jun-Kun Chen, Samuel Rota Bulò, Norman Müller et al.
CREAD: A Classification-Restoration Framework with Error Adaptive Discretization for Watch Time Prediction in Video Recommender Systems
Jie Sun, Zhao Ying Ding, Xiaoshuang Chen et al.
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni, Yulin Wang, Renping Zhou et al.
Dynamic Retraining-Updating Mean Teacher for Source-Free Object Detection
BA KHANH TRINH LE, Huy-Hung Nguyen, Long Hoang Pham et al.
Large-Scale Multi-Hypotheses Cell Tracking Using Ultrametric Contours Maps
Jordao Bragantini, Merlin Lange, Loïc A Royer
Faceptor: A Generalist Model for Face Perception
Lixiong Qin, Mei Wang, Xuannan Liu et al.
Aligner$^2$: Enhancing Joint Multiple Intent Detection and Slot Filling via Adjustive and Forced Cross-Task Alignment
Zhihong Zhu, Xuxin Cheng, Yaowei Li et al.
A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing
Li Maomao, Yu Li, Tianyu Yang et al.
Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability
Ivan Lee, Nan Jiang, Taylor Berg-Kirkpatrick
Rate-Distortion-Cognition Controllable Versatile Neural Image Compression
Jinming Liu, Ruoyu Feng, Yunpeng Qi et al.
Event-Adapted Video Super-Resolution
Zeyu Xiao, Dachun Kai, Yueyi Zhang et al.
Multi-Objective Bayesian Optimization with Active Preference Learning
Ryota Ozaki, Kazuki Ishikawa, Youhei Kanzaki et al.
One Forward is Enough for Neural Network Training via Likelihood Ratio Method
Jinyang Jiang, Zeliang Zhang, Chenliang Xu et al.
MFABA: A More Faithful and Accelerated Boundary-Based Attribution Method for Deep Neural Networks
Zhiyu Zhu, Huaming Chen, Jiayu Zhang et al.
CLIFF: Continual Latent Diffusion for Open-Vocabulary Object Detection
Wuyang Li, Xinyu Liu, Jiayi Ma et al.
SPADE: Semi-supervised Anomaly Detection under Distribution Mismatch
Chun-Liang Li, Tomas Pfister, Kihyuk Sohn et al.
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang, Jiajun Deng, Mingbo Jia
EvSign: Sign Language Recognition and Translation with Streaming Events
Pengyu Zhang, Hao Yin, Zeren Wang et al.
The Effect of Intrinsic Dataset Properties on Generalization: Unraveling Learning Differences Between Natural and Medical Images
Nicholas Konz, Maciej Mazurowski
ProCC: Progressive Cross-Primitive Compatibility for Open-World Compositional Zero-Shot Learning
Fushuo Huo, Wenchao Xu, Song Guo et al.
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction
Penghui Du, Yu Wang, Yifan Sun et al.
En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data
Yifang Men, Biwen Lei, Yuan Yao et al.
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
Yunhao Ge, Yihe Tang, Jiashu Xu et al.
Bootstrapping Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models
Weiwei Cao, Jianpeng Zhang, Yingda Xia et al.
Unifying Automatic and Interactive Matting with Pretrained ViTs
Zixuan Ye, Wenze Liu, He Guo et al.
NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation
Ruikai Cui, Weizhe Liu, Weixuan Sun et al.
Sequential Fusion Based Multi-Granularity Consistency for Space-Time Transformer Tracking
Kun Hu, Wenjing Yang, Wanrong Huang et al.
TetraDiffusion: Tetrahedral Diffusion Models for 3D Shape Generation
Nikolai Kalischek, Torben Peters, Jan Dirk Wegner et al.
Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning
Fan-Ming Luo, Tian Xu, Xingchen Cao et al.
Non-Exemplar Domain Incremental Learning via Cross-Domain Concept Integration
Qiang Wang, Yuhang He, Songlin Dong et al.
FoX: Formation-Aware Exploration in Multi-Agent Reinforcement Learning
Yonghyeon Jo, Sunwoo Lee, Junghyuk Yum et al.
CNN Kernels Can Be the Best Shapelets
Eric Qu, Yansen Wang, Xufang Luo et al.
HEAL-SWIN: A Vision Transformer On The Sphere
Oscar Carlsson, Jan E. Gerken, Hampus Linander et al.
Learning Explicit Contact for Implicit Reconstruction of Hand-Held Objects from Monocular Images
Junxing Hu, Hongwen Zhang, Zerui Chen et al.
On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving
Kaituo Feng, Changsheng Li, Dongchun Ren et al.
Hyperbolic Learning with Synthetic Captions for Open-World Detection
Fanjie Kong, Yanbei Chen, Jiarui Cai et al.
Towards Fair Graph Federated Learning via Incentive Mechanisms
12794 Chenglu Pan, Jiarong Xu, Yue Yu et al.
GRIDS: Grouped Multiple-Degradation Restoration with Image Degradation Similarity
Shuo Cao, Yihao Liu, Wenlong Zhang et al.
Improving Text-guided Object Inpainting with Semantic Pre-inpainting
Yifu Chen, Jingwen Chen, Yingwei Pan et al.
CAMIL: Context-Aware Multiple Instance Learning for Cancer Detection and Subtyping in Whole Slide Images
olga fourkioti, Matt De Vries, Chris Bakal
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
Guian Fang, Wenbiao Yan, Yuanfan Guo et al.
Pre-training with Random Orthogonal Projection Image Modeling
Maryam Haghighat, Peyman Moghadam, Shaheer Mohamed et al.
Customization Assistant for Text-to-Image Generation
Yufan Zhou, Ruiyi Zhang, Jiuxiang Gu et al.
DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition
Qi Wang, Zhou Xu, Yuming Lin et al.
SyFormer: Structure-Guided Synergism Transformer for Large-Portion Image Inpainting
Jie Wu, Yuchao Feng, Honghui Xu et al.
Can We Evaluate Domain Adaptation Models Without Target-Domain Labels?
JIANFEI YANG, Hanjie Qian, Yuecong Xu et al.
Dynamic Feature Pruning and Consolidation for Occluded Person Re-identification
YuTeng Ye, Hang Zhou, Jiale Cai et al.
MERGE: Fast Private Text Generation
Zi Liang, Pinghui Wang, Ruofei Zhang et al.
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
Shihao Zhao, Shaozhe Hao, Bojia Zi et al.
X-Pose: Detecting Any Keypoints
Jie Yang, AILING ZENG, Ruimao Zhang et al.
Free-Editor: Zero-shot Text-driven 3D Scene Editing
Md Nazmul Karim, Hasan Iqbal, Umar Khalid et al.
In2SET: Intra-Inter Similarity Exploiting Transformer for Dual-Camera Compressive Hyperspectral Imaging
Xin Wang, Lizhi Wang, Xiangtian Ma et al.
DexFuncGrasp: A Robotic Dexterous Functional Grasp Dataset Constructed from a Cost-Effective Real-Simulation Annotation System
Jinglue Hang, Xiangbo Lin, Tianqiang Zhu et al.
What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs
Alex Trevithick, Matthew Chan, Towaki Takikawa et al.
HONGAT: Graph Attention Networks in the Presence of High-Order Neighbors
Heng-Kai Zhang, Yi-Ge Zhang, Zhi Zhou et al.
Exploiting Auxiliary Caption for Video Grounding
Hongxiang Li, Meng Cao, Xuxin Cheng et al.
Generative 3D Part Assembly via Part-Whole-Hierarchy Message Passing
Bi'an Du, Xiang Gao, Wei Hu et al.
M2Doc: A Multi-Modal Fusion Approach for Document Layout Analysis
Ning Zhang, Hiuyi Cheng, Jiayu Chen et al.
AttnZero: Efficient Attention Discovery for Vision Transformers
Lujun Li, Zimian Wei, Peijie Dong et al.
Referring Atomic Video Action Recognition
Kunyu Peng, Jia Fu, Kailun Yang et al.
MoVideo: Motion-Aware Video Generation with Diffusion Models
Jingyun Liang, Yuchen Fan, Kai Zhang et al.
PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs
Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek et al.
Event Camera Data Dense Pre-training
Yan Yang, Liyuan Pan, Liu liu
Finding Visual Task Vectors
Alberto Hojel, Yutong Bai, Trevor Darrell et al.
Vision-Language Action Knowledge Learning for Semantic-Aware Action Quality Assessment
Huangbiao Xu, Xiao Ke, Yuezhou Li et al.
DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
Junwen Xiong, Peng Zhang, Tao You et al.
Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations
Renzhe Zhou, Chen-Xiao Gao, Zongzhang Zhang et al.
TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation
Xiaopei Wu, Yuenan Hou, Xiaoshui Huang et al.
RoDUS: Robust Decomposition of Static and Dynamic Elements in Urban Scenes
Thang-Anh-Quan Nguyen, Luis G Roldao Jimenez, Nathan Piasco et al.
CoBIT: A Contrastive Bi-directional Image-Text Generation Model
Haoxuan You, Xiaoyue Guo, Zhecan Wang et al.
Learning to Learn Better Visual Prompts
Fengxiang Wang, Wanrong Huang, Shaowu Yang et al.
Spiking NeRF: Representing the Real-World Geometry by a Discontinuous Representation
Zhanfeng Liao, Yan Liu, Qian Zheng et al.
Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments
Djamahl Etchegaray, Zi Helen Huang, Tatsuya Harada et al.
Inspecting Prediction Confidence for Detecting Black-Box Backdoor Attacks
Tong Wang, Yuan Yao, Feng Xu et al.
Deep Structural Knowledge Exploitation and Synergy for Estimating Node Importance Value on Heterogeneous Information Networks
Yankai Chen, Yixiang Fang, Qiongyan Wang et al.
EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification
Suorong Yang, Furao Shen, Jian Zhao
A Restoration Network as an Implicit Prior
Yuyang Hu, Mauricio Delbracio, Peyman Milanfar et al.
AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han, Qifan Wang, Sohail A Dianat et al.
Retrieval-Guided Reinforcement Learning for Boolean Circuit Minimization
Animesh Basak Chowdhury, Marco Romanelli, Benjamin Tan et al.
LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation
Archana Swaminathan, Anubhav Anubhav, Kamal Gupta et al.
Adversarial Backdoor Attack by Naturalistic Data Poisoning on Trajectory Prediction in Autonomous Driving
Mozhgan Pourkeshavarz, Mohammad Sabokrou, Amir Rasouli
FoSp: Focus and Separation Network for Early Smoke Segmentation
Lujian Yao, Haitao Zhao, Jingchao Peng et al.
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer
Ning Yu, Chia-Chih Chen, Zeyuan Chen et al.
Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks
Zhiying Jiang, Xingyuan Li, Jinyuan Liu et al.
Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning
Rui Zhao, Bin Shi, Jianfei Ruan et al.
HUMOS: Human Motion Model Conditioned on Body Shape
Shashank Tripathi, Omid Taheri, Christoph Lassner et al.