Most Cited 2024 "parameterized environment configurations" Papers
Image Captioning with Multi-Context Synthetic Data
Feipeng Ma, Y. Zhou, Fengyun Rao et al.
Text-Enhanced Data-free Approach for Federated Class-Incremental Learning
Minh-Tuan Tran, Trung Le, Xuan-May Le et al.
Unmixing Diffusion for Self-Supervised Hyperspectral Image Denoising
Haijin Zeng, Jiezhang Cao, Yongyong Chen et al.
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang Zhao
Shadow Generation for Composite Image Using Diffusion Model
Qingyang Liu, Junqi You, Jian-Ting Wang et al.
FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio
Chao Xu, Yang Liu, Jiazheng Xing et al.
Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction
Alexander Timans, Christoph-Nikolas Straehle, Kaspar Sakmann et al.
Improving Virtual Try-On with Garment-focused Diffusion Models
Siqi Wan, Yehao Li, Jingwen Chen et al.
MESA: Matching Everything by Segmenting Anything
Yesheng Zhang, Xu Zhao
Dataset Enhancement with Instance-Level Augmentations
Orest Kupyn, Christian Rupprecht
FreePoint: Unsupervised Point Cloud Instance Segmentation
Zhikai Zhang, Jian Ding, Li Jiang et al.
AssistGUI: Task-Oriented PC Graphical User Interface Automation
Difei Gao, Lei Ji, Zechen Bai et al.
GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction
Yuxuan Mu, Xinxin Zuo, Chuan Guo et al.
Text Image Inpainting via Global Structure-Guided Diffusion Models
Shipeng Zhu, Pengfei Fang, Chenjie Zhu et al.
SemiReward: A General Reward Model for Semi-supervised Learning
Siyuan Li, Weiyang Jin, Zedong Wang et al.
A Simple and Effective Point-based Network for Event Camera 6-DOFs Pose Relocalization
Hongwei Ren, Jiadong Zhu, Yue Zhou et al.
MoST: Motion Style Transformer Between Diverse Action Contents
Boeun Kim, Jungho Kim, Hyung Jin Chang et al.
CC-SAM: Enhancing SAM with Cross-feature Attention and Context for Ultrasound Image Segmentation
Shreyank Narayana Gowda, David A Clifton
CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning
Ziyang Gong, FuHao Li, Yupeng Deng et al.
SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis
Teng Hu, Ran Yi, Baihong Qian et al.
STDiff: Spatio-Temporal Diffusion for Continuous Stochastic Video Prediction
Xi Ye, Guillaume-Alexandre Bilodeau
Implicit Concept Removal of Diffusion Models
Zhili Liu, Kai Chen, Yifan Zhang et al.
DVSAI: Diverse View-Shared Anchors Based Incomplete Multi-View Clustering
Shengju Yu, Siwei Wang, Pei Zhang et al.
Good Teachers Explain: Explanation-Enhanced Knowledge Distillation
Amin Parchami, Moritz Böhle, Sukrut Rao et al.
MM-Point: Multi-View Information-Enhanced Multi-Modal Self-Supervised 3D Point Cloud Understanding
HaiTao Yu, Mofei Song
Traffic Flow Optimisation for Lifelong Multi-Agent Path Finding
Zhe Chen, Daniel Harabor, Jiaoyang Li et al.
Synchronization is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs
Camillo Quattrocchi, Antonino Furnari, Daniele Di Mauro et al.
Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring
Huicong Zhang, Haozhe Xie, Hongxun Yao
FedMef: Towards Memory-efficient Federated Dynamic Pruning
Hong Huang, Weiming Zhuang, Chen Chen et al.
Open-Set Domain Adaptation for Semantic Segmentation
Seun-An Choe, Ah-Hyung Shin, Keon Hee Park et al.
Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt
Bin-Bin Gao
Temporally and Distributionally Robust Optimization for Cold-Start Recommendation
Xinyu Lin, Wenjie Wang, Jujia Zhao et al.
Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes
Zhi Cai, Yingjie Gao, Yaoyan Zheng et al.
FAR: Flexible Accurate and Robust 6DoF Relative Camera Pose Estimation
Chris Rockwell, Nilesh Kulkarni, Linyi Jin et al.
Diverse Person: Customize Your Own Dataset for Text-Based Person Search
Zifan Song, Guosheng Hu, Cairong Zhao
Rotation-Agnostic Image Representation Learning for Digital Pathology
Saghir Alfasly, Abubakr Shafique, Peyman Nejat et al.
Relightable and Animatable Neural Avatars from Videos
Wenbin Lin, Chengwei Zheng, Jun-hai Yong et al.
Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark
Mengxi Ya, Yiming Li, Tao Dai et al.
Code-Style In-Context Learning for Knowledge-Based Question Answering
Zhijie Nie, Richong Zhang, Zhongyuan Wang et al.
HiPose: Hierarchical Binary Surface Encoding and Correspondence Pruning for RGB-D 6DoF Object Pose Estimation
Yongliang Lin, Yongzhi Su, Praveen Nathan et al.
Connecting Consistency Distillation to Score Distillation for Text-to-3D Generation
Zongrui Li, Minghui Hu, Qian Zheng et al.
Beta-Tuned Timestep Diffusion Model
Tianyi Zheng, Peng-Tao Jiang, Ben Wan et al.
Geometric-Facilitated Denoising Diffusion Model for 3D Molecule Generation
Can Xu, Haosen Wang, Weigang Wang et al.
A Dual-Way Enhanced Framework from Text Matching Point of View for Multimodal Entity Linking
Shezheng Song, Shan Zhao, ChengYu Wang et al.
Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation
Zikai Huang, Xuemiao Xu, Cheng Xu et al.
Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset
Yiming Li, Zhiheng Li, Nuo Chen et al.
Class Incremental Learning via Likelihood Ratio Based Task Prediction
Haowei Lin, Yijia Shao, Weinan Qian et al.
EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
Thomas Hummel, Shyamgopal Karthik, Mariana-Iuliana Georgescu et al.
BCLNet: Bilateral Consensus Learning for Two-View Correspondence Pruning
Xiangyang Miao, Guobao Xiao, Shiping Wang et al.
InfMAE: A Foundation Model in The Infrared Modality
Fangcen Liu, Chenqiang Gao, Yaming Zhang et al.
Spectral-Based Graph Neural Networks for Complementary Item Recommendation
Haitong Luo, Xuying Meng, Suhang Wang et al.
Cloud-Device Collaborative Learning for Multimodal Large Language Models
Guanqun Wang, Jiaming Liu, Chenxuan Li et al.
EAT: Towards Long-Tailed Out-of-Distribution Detection
Tong Wei, Bo-Lin Wang, Min-Ling Zhang
MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing
Haoyu Zhao, Tianyi Lu, Jiaxi Gu et al.
Exploring the Transferability of Visual Prompting for Multimodal Large Language Models
Yichi Zhang, Yinpeng Dong, Siyuan Zhang et al.
DriveTrack: A Benchmark for Long-Range Point Tracking in Real-World Videos
Arjun Balasingam, Joseph Chandler, Chenning Li et al.
MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation
Min Zhang, Haoxuan Li, Fei Wu et al.
PhysPT: Physics-aware Pretrained Transformer for Estimating Human Dynamics from Monocular Videos
Yufei Zhang, Jeffrey Kephart, Zijun Cui et al.
T4P: Test-Time Training of Trajectory Prediction via Masked Autoencoder and Actor-specific Token Memory
Daehee Park, Jaeseok Jeong, Sung-Hoon Yoon et al.
Composing Object Relations and Attributes for Image-Text Matching
Khoi Pham, Chuong Huynh, Ser-Nam Lim et al.
Unprocessing Seven Years of Algorithmic Fairness
André F. Cruz, Moritz Hardt
UNIC: Universal Classification Models via Multi-teacher Distillation
Yannis Kalantidis, Diane Larlus, Mert Bulent Sariyildiz et al.
COMBAT: Alternated Training for Effective Clean-Label Backdoor Attacks
Tran Huynh, Dang Nguyen, Tung Pham et al.
TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding
Zhihao Zhang, Shengcao Cao, Yu-Xiong Wang
PartSTAD: 2D-to-3D Part Segmentation Task Adaptation
Hyunjin Kim, Minhyuk Sung
RMem: Restricted Memory Banks Improve Video Object Segmentation
Junbao Zhou, Ziqi Pang, Yu-Xiong Wang
Diffusion Model is a Good Pose Estimator from 3D RF-Vision
Junqiao Fan, Jianfei Yang, Yuecong Xu et al.
Revisiting Adversarial Training Under Long-Tailed Distributions
Xinli Yue, Ningping Mou, Qian Wang et al.
Stitching Sub-trajectories with Conditional Diffusion Model for Goal-Conditioned Offline RL
Sungyoon Kim, Yunseon Choi, Daiki Matsunaga et al.
Unsupervised Layer-Wise Score Aggregation for Textual OOD Detection
Maxime Darrin, Guillaume Staerman, Eduardo Dadalto Camara Gomes et al.
Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
Mingfang Zhang, Yifei Huang, Ruicong Liu et al.
Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
Min Yang, Huan Gao, Ping Guo et al.
Adaptive VIO: Deep Visual-Inertial Odometry with Online Continual Learning
Youqi Pan, Wugen Zhou, Yingdian Cao et al.
LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units
Zeyu Liu, Gourav Datta, Anni Li et al.
Weakly Supervised Semantic Segmentation for Driving Scenes
Dongseob Kim, Seungho Lee, Junsuk Choe et al.
Music Style Transfer with Time-Varying Inversion of Diffusion Models
Sifei Li, Yuxin Zhang, Fan Tang et al.
Visual Alignment Pre-training for Sign Language Translation
Peiqi Jiao, Yuecong Min, Xilin Chen
SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration
Kezheng Xiong, Maoji Zheng, Qingshan Xu et al.
LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation
Yuchen Su, Zhineng Chen, Zhiwen Shao et al.
Deep Diffusion Image Prior for Efficient OOD Adaptation in 3D Inverse Problems
Hyungjin Chung, Jong Chul Ye
Condition-Aware Neural Network for Controlled Image Generation
Han Cai, Muyang Li, Qinsheng Zhang et al.
Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation
Fahimeh Hosseini Noohdani, Parsa Hosseini, Aryan Yazdan Parast et al.
Every Node Is Different: Dynamically Fusing Self-Supervised Tasks for Attributed Graph Clustering
Pengfei Zhu, Qian Wang, Yu Wang et al.
SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds
Yanbo Wang, Wentao Zhao, Cao Chuan et al.
CLOSER: Towards Better Representation Learning for Few-Shot Class-Incremental Learning
Junghun Oh, Sungyong Baik, Kyoung Mu Lee
Label-Agnostic Forgetting: A Supervision-Free Unlearning in Deep Models
Shaofei Shen, Chenhao Zhang, Yawen Zhao et al.
HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images
Xihe Yang, Xingyu Chen, Daiheng Gao et al.
Understanding Video Transformers via Universal Concept Discovery
Matthew Kowal, Achal Dave, Rares Andrei Ambrus et al.
LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors
Saksham Suri, Matthew Walmer, Kamal Gupta et al.
CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data
Wei Fang, Yuxing Tang, Heng Guo et al.
Emergent Visual-Semantic Hierarchies in Image-Text Representations
Morris Alper, Hadar Averbuch-Elor
Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation
Kai Huang, Hanyun Yin, Heng Huang et al.
Revisiting Document-Level Relation Extraction with Context-Guided Link Prediction
Monika Jain, Raghava Mutharaju, Ramakanth Kavuluru et al.
HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud
Wencan Cheng, Hao Tang, Luc Van Gool et al.
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data
Qihao Liu, Yi Zhang, Song Bai et al.
Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation
Junyan Wang, Zhenhong Sun, Stewart Tan et al.
Image Inpainting via Iteratively Decoupled Probabilistic Modeling
Wenbo Li, Xin Yu, Kun Zhou et al.
Decomposing Semantic Shifts for Composed Image Retrieval
Xingyu Yang, Daqing Liu, Heng Zhang et al.
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
Runtian Zhai, Bingbin Liu, Andrej Risteski et al.
Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts
Andong Tan, Fengtao Zhou, Hao Chen
Towards Understanding Factual Knowledge of Large Language Models
Xuming Hu, Junzhe Chen, Xiaochuan Li et al.
De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts
Yuzheng Wang, Dingkang Yang, Zhaoyu Chen et al.
PetFace: A Large-Scale Dataset and Benchmark for Animal Identification
Risa Shinoda, Kaede Shiohara
City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web
Kaiwen Song, Xiaoyi Zeng, Chenqu Ren et al.
Neural Visibility Field for Uncertainty-Driven Active Mapping
Shangjie Xue, Jesse Dill, Pranay Mathur et al.
Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors
Haoxuanye Ji, Pengpeng Liang, Erkang Cheng
RAW-Adapter: Adapting Pretrained Visual Model to Camera RAW Images
Ziteng Cui, Tatsuya Harada
PreRoutGNN for Timing Prediction with Order Preserving Partition: Global Circuit Pre-training, Local Delay Learning and Attentional Cell Modeling
Ruizhe Zhong, Junjie Ye, Zhentao Tang et al.
Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal
Yeying Jin, Xin Li, Jiadong Wang et al.
COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL
Xiyao Wang, Ruijie Zheng, Yanchao Sun et al.
Unlocking the Potential of Federated Learning: The Symphony of Dataset Distillation via Deep Generative Latents
Yuqi Jia, Saeed Vahidian, Jingwei Sun et al.
Data Valuation and Detections in Federated Learning
Wenqian Li, Shuran Fu, Fengrui Zhang et al.
Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World
Rujie Wu, Xiaojian Ma, Zhenliang Zhang et al.
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
Xuelu Feng, Dongdong Chen, Junsong Yuan et al.
PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer
Tongkun Guan, Chengyu Lin, Wei Shen et al.
Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation
Tao Chen, Xiruo Jiang, Gensheng Pei et al.
Self-Supervised Video Desmoking for Laparoscopic Surgery
Renlong Wu, Zhilu Zhang, Shuohao Zhang et al.
Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding
Tatsunori Taniai, Ryo Igarashi, Yuta Suzuki et al.
Keypoint Promptable Re-Identification
Vladimir Somers, Alexandre Alahi, Christophe De Vleeschouwer
One-stage Prompt-based Continual Learning
Youngeun Kim, Yuhang Li, Priyadarshini Panda
PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor
Vidit Goel, Elia Peruzzo, Yifan Jiang et al.
De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei, Chenxi Liu, Siyuan Qiao et al.
RGMComm: Return Gap Minimization via Discrete Communications in Multi-Agent Reinforcement Learning
Jingdi Chen, Tian Lan, Carlee Joe-Wong
Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation
Friedhelm Hamann, Ziyun Wang, Ioannis Asmanis et al.
Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance
Tien Toan Nguyen, Minh Nhat Vu, Baoru Huang et al.
UniM2AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving
Jian Zou, Tianyu Huang, Guanglei Yang et al.
CoReS: Orchestrating the Dance of Reasoning and Segmentation
Xiaoyi Bao, Siyang Sun, Shuailei Ma et al.
SuperNormal: Neural Surface Reconstruction via Multi-View Normal Integration
Xu Cao, Takafumi Taketomi
Differentiable Information Bottleneck for Deterministic Multi-view Clustering
Xiaoqiang Yan, Zhixiang Jin, Fengshou Han et al.
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
Bolin Lai, Fiona Ryan, Wenqi Jia et al.
M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models
Seunggeun Chi, Hyung-gun Chi, Hengbo Ma et al.
Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
Xiyi Chen, Marko Mihajlovic, Shaofei Wang et al.
What Makes a Good Prune? Maximal Unstructured Pruning for Maximal Cosine Similarity
Gabryel Mason-Williams, Fredrik Dahlqvist
Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment
Yongxu Liu, Yinghui Quan, Guoyao Xiao et al.
Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision
Hao Dong, Eleni Chatzi, Olga Fink
DeiT-LT: Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets
Harsh Rangwani, Pradipto Mondal, Mayank Mishra et al.
Latent Diffusion Prior Enhanced Deep Unfolding for Snapshot Spectral Compressive Imaging
Zongliang Wu, Ruiying Lu, Ying Fu et al.
PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts
Zewen Chen, Haina Qin, Juan Wang et al.
SfmCAD: Unsupervised CAD Reconstruction by Learning Sketch-based Feature Modeling Operations
Pu Li, Jianwei Guo, Huibin Li et al.
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Brian Gordon, Yonatan Bitton, Yonatan Shafir et al.
What Effects the Generalization in Visual Reinforcement Learning: Policy Consistency with Truncated Return Prediction
Shuo Wang, Zhihao Wu, X. Hu et al.
ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
Ziqian Zeng, Yihuai Hong, Hongliang Dai et al.
Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning
Yibing Wei, Abhinav Gupta, Pedro Morgado
Quadratic models for understanding catapult dynamics of neural networks
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan et al.
Taming Latent Diffusion Model for Neural Radiance Field Inpainting
Chieh Lin, Changil Kim, Jia-Bin Huang et al.
Inverse Rendering of Glossy Objects via the Neural Plenoptic Function and Radiance Fields
Haoyuan Wang, Wenbo Hu, Lei Zhu et al.
FRIH: Fine-Grained Region-Aware Image Harmonization
Jinlong Peng, Zekun Luo, Liang Liu et al.
Day-Night Cross-domain Vehicle Re-identification
Hongchao Li, Jingong Chen, Aihua Zheng et al.
Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer
Junyi Wu, Bin Duan, Weitai Kang et al.
Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches
Qing Yu, Mikihiro Tanaka, Kent Fujiwara
PARE-Net: Position-Aware Rotation-Equivariant Networks for Robust Point Cloud Registration
Runzhao Yao, Shaoyi Du, Wenting Cui et al.
SURER: Structure-Adaptive Unified Graph Neural Network for Multi-View Clustering
Jing Wang, Songhe Feng, Gengyu Lyu et al.
Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation
Yixiao Wang, Chen Tang, Lingfeng Sun et al.
VividDreamer: Invariant Score Distillation for Hyper-Realistic Text-to-3D Generation
Wenjie Zhuo, Fan Ma, Hehe Fan et al.
SuperGaussian: Repurposing Video Models for 3D Super Resolution
Yuan Shen, Duygu Ceylan, Paul Guerrero et al.
Align Before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition
Yifei Chen, Dapeng Chen, Ruijin Liu et al.
Combating Data Imbalances in Federated Semi-supervised Learning with Dual Regulators
Sikai Bai, Shuaicheng Li, Weiming Zhuang et al.
Controllable Navigation Instruction Generation with Chain of Thought Prompting
Xianghao Kong, Jinyu Chen, Wenguan Wang et al.
Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses
Inhee Lee, Byungjun Kim, Hanbyul Joo
CADTalk: An Algorithm and Benchmark for Semantic Commenting of CAD Programs
Haocheng Yuan, Jing Xu, Hao Pan et al.
LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model
Dongkai Wang, Shiyu Xuan, Shiliang Zhang
DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery
Yixuan Zhu, Ao Li, Yansong Tang et al.
Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception
Lei Fan, Mingfu Liang, Yunxuan Li et al.
Learning Encodings for Constructive Neural Combinatorial Optimization Needs to Regret
Rui Sun, Zhi Zheng, Zhenkun Wang
Comparing the Robustness of Modern No-Reference Image- and Video-Quality Metrics to Adversarial Attacks
Anastasia Antsiferova, Khaled Abud, Aleksandr Gushchin et al.
Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables
Haisong Gong, Weizhi Xu, Shu Wu et al.
Self-Supervised Multi-Modal Knowledge Graph Contrastive Hashing for Cross-Modal Search
Meiyu Liang, Junping Du, Zhengyang Liang et al.
SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS
Yameng Peng, Andy Song, Haytham Fayek et al.
A Comprehensive Augmentation Framework for Anomaly Detection
Lin Jiang, Yaping Yan
Progressive Poisoned Data Isolation for Training-Time Backdoor Defense
Yiming Chen, Haiwei Wu, Jiantao Zhou
Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis
Qian Chen, Shihao Shu, Xiangzhi Bai
Beyond MOT: Semantic Multi-Object Tracking
Yunhao Li, Qin Li, Hao Wang et al.
ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open Vocabulary Object Detection
Joonhyun Jeong, Geondo Park, Jayeon Yoo et al.
IS-DARTS: Stabilizing DARTS through Precise Measurement on Candidate Importance
Hongyi He, Longjun Liu, Haonan Zhang et al.
DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models
Sohyun An, Hayeon Lee, Jaehyeong Jo et al.
Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models
Matthew Kowal, Richard P. Wildes, Kosta Derpanis
Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen et al.
The Hard Positive Truth about Vision-Language Compositionality
Amita Kamath, Cheng-Yu Hsieh, Kai-Wei Chang et al.
R-MAE: Regions Meet Masked Autoencoders
Duy-Kien Nguyen, Yanghao Li, Vaibhav Aggarwal et al.
Transformer-Based Selective Super-resolution for Efficient Image Refinement
Tianyi Zhang, Kishore Kasichainula, Yaoxin Zhuo et al.
Iterated Learning Improves Compositionality in Large Vision-Language Models
Chenhao Zheng, Jieyu Zhang, Aniruddha Kembhavi et al.
Stable Unlearnable Example: Enhancing the Robustness of Unlearnable Examples via Stable Error-Minimizing Noise
Yixin Liu, Kaidi Xu, Xun Chen et al.
Efficient Meshflow and Optical Flow Estimation from Event Cameras
Xinglong Luo, Ao Luo, Zhengning Wang et al.
Learning to Optimize Permutation Flow Shop Scheduling via Graph-Based Imitation Learning
Longkang Li, Siyuan Liang, Zihao Zhu et al.
TimeLens-XL: Real-time Event-based Video Frame Interpolation with Large Motion
Shi Guo, Yutian Chen, Tianfan Xue et al.
C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction
Yiqun Lin, Jiewen Yang, Hualiang Wang et al.
Semi-supervised Active Learning for Video Action Detection
Ayush Singh, Aayush J Rana, Akash Kumar et al.
Object Pose Estimation via the Aggregation of Diffusion Features
Tianfu Wang, Guosheng Hu, Hongguang Wang
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
Andreas Engelhardt, Amit Raj, Mark Boss et al.
Joint Demosaicing and Denoising for Spike Camera
Yanchen Dong, Ruiqin Xiong, Jing Zhao et al.
Lazy Diffusion Transformer for Interactive Image Editing
Yotam Nitzan, Zongze Wu, Richard Zhang et al.
Review-Enhanced Hierarchical Contrastive Learning for Recommendation
Ke Wang, Yanmin Zhu, Tianzi Zang et al.
Tackling Structural Hallucination in Image Translation with Local Diffusion
Seunghoi Kim, Chen Jin, Tom Diethe et al.
Image Demoireing in RAW and sRGB Domains
Shuning Xu, Binbin Song, Xiangyu Chen et al.
Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders
Alexandre Eymaël, Renaud Vandeghen, Anthony Cioppa et al.
Revealing the Proximate Long-Tail Distribution in Compositional Zero-Shot Learning
Chenyi Jiang, Haofeng Zhang
Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes
Gaurav Shrivastava, Abhinav Shrivastava
Auto-GAS: Automated Proxy Discovery for Training-free Generative Architecture Search
Lujun Li, Haosen Sun, Shiwen Li et al.
An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding
Wei Chen, Long Chen, Yu Wu
Context Diffusion: In-Context Aware Image Generation
Ivona Najdenkoska, Animesh Sinha, Abhimanyu Dubey et al.