Most Cited 2024 "video transmission optimization" Papers
Learning to Pivot as a Smart Expert
Tianhao Liu, Shanwen Pu, Dongdong Ge et al.
Non-parametric Representation Learning with Kernels
Hebaixu Wang, Meiqi Gong, Xiaoguang Mei et al.
Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video
Zhaobo Qi, Yibo Yuan, Xiaowen Ruan et al.
TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models
Jeongho Kim, Min-Jung Kim, Junsoo Lee et al.
Efficient Few-Shot Action Recognition via Multi-Level Post-Reasoning
Cong Wu, Xiao-Jun Wu, Linze Li et al.
SAVE: Protagonist Diversification with Structure Agnostic Video Editing
Yeji Song, Wonsik Shin, Junsoo Lee et al.
DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation
Haibo Yang, Yang Chen, Yingwei Pan et al.
Privacy-Preserving Optics for Enhancing Protection in Face De-Identification
Jhon Lopez, Carlos Hinojosa, Henry Arguello et al.
CLAP: Isolating Content from Style through Contrastive Learning with Augmented Prompts
Yichao Cai, Yuhang Liu, Zhen Zhang et al.
SeGA: Preference-Aware Self-Contrastive Learning with Prompts for Anomalous User Detection on Twitter
Ying-Ying Chang, Wei-Yao Wang, Wen-Chih Peng
Monocular Occupancy Prediction for Scalable Indoor Scenes
Hongxiao Yu, Yuqi Wang, Yuntao Chen et al.
Unlocking Attributes' Contribution to Successful Camouflage: A Combined Textual and Visual Analysis Strategy
Hong Zhang, Yixuan Lyu, Qian Yu et al.
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Image Editing
Jing Gu, Nanxuan Zhao, Wei Xiong et al.
Global Counterfactual Directions
Bartlomiej Sobieski, Przemyslaw Biecek
How to Train the Teacher Model for Effective Knowledge Distillation
Shayan Mohajer Hamidi, Xizhen Deng, Renhao Tan et al.
MetaRLEC: Meta-Reinforcement Learning for Discovery of Brain Effective Connectivity
Zuozhen Zhang, Junzhong Ji, Jinduo Liu
OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework
Wanyun Li, Pinxue Guo, Xinyu Zhou et al.
Robust Depth Enhancement via Polarization Prompt Fusion Tuning
Kei Ikemura, Yiming Huang, Felix Heide et al.
StraightPCF: Straight Point Cloud Filtering
Dasith de Silva Edirimuni, Xuequan Lu, Gang Li et al.
Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors
Ruicheng Wang, Jianfeng Xiang, Jiaolong Yang et al.
Clockwork Diffusion: Efficient Generation With Model-Step Distillation
Amirhossein Habibian, Amir Ghodrati, Noor Fathima et al.
Learning Spatially Collaged Fourier Bases for Implicit Neural Representation
Jason Chun Lok Li, Chang Liu, Binxiao Huang et al.
Variational Inference for SDEs Driven by Fractional Noise
Rembert Daems, Manfred Opper, Guillaume Crevecoeur et al.
BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models
Ye-Bin Moon, Nam Hyeon-Woo, Wonseok Choi et al.
Multiscale Sliced Wasserstein Distances as Perceptual Color Difference Measures
Jiaqi He, Zhihua Wang, Leon Wang et al.
Understanding Inter-Concept Relationships in Concept-Based Models
Naveen Raman, Mateo Espinosa Zarlenga, Mateja Jamnik
Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition
Lilang Lin, Lehong Wu, Jiahang Zhang et al.
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
Pengfei Wang, Yuxi Wang, Shuai Li et al.
Towards Automated Movie Trailer Generation
Dawit Argaw Argaw, Mattia Soldan, Alejandro Pardo et al.
NOVUM: Neural Object Volumes for Robust Object Classification
Artur Jesslen, Guofeng Zhang, Angtian Wang et al.
AdaDistill: Adaptive Knowledge Distillation for Deep Face Recognition
Fadi Boutros, Vitomir Struc, Naser Damer
Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data
Tuo Feng, Wenguan Wang, Ruijie Quan et al.
ByteEdit: Boost, Comply and Accelerate Generative Image Editing
Yuxi Ren, Jie Wu, Yanzuo Lu et al.
Emerging Property of Masked Token for Effective Pre-training
Hyesong Choi, Hunsang Lee, Seyoung Joung et al.
DiscoMatch: Fast Discrete Optimisation for Geometrically Consistent 3D Shape Matching
Paul Roetzer, Ahmed Abbas, Dongliang Cao et al.
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang, Teng Wang, Haigang Zhang et al.
Text-guided Explorable Image Super-resolution
Kanchana Vaishnavi Gandikota, Paramanand Chandramouli
GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image
Chong Bao, Yinda Zhang, Yuan Li et al.
Learning from One Continuous Video Stream
Joao Carreira, Michael King, Viorica Patraucean et al.
Mind The Edge: Refining Depth Edges in Sparsely-Supervised Monocular Depth Estimation
Lior Talker, Aviad Cohen, Erez Yosef et al.
Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities
Kaiwen Cai, ZheKai Duan, Gaowen Liu et al.
Backdoor Contrastive Learning via Bi-level Trigger Optimization
Weiyu Sun, Xinyu Zhang, Hao Lu et al.
Volumetric Rendering with Baked Quadrature Fields
Gopal Sharma, Daniel Rebain, Kwang Moo Yi et al.
Zero-Shot Structure-Preserving Diffusion Model for High Dynamic Range Tone Mapping
Ruoxi Zhu, Shusong Xu, Peiye Liu et al.
YolOOD: Utilizing Object Detection Concepts for Multi-Label Out-of-Distribution Detection
Alon Zolfi, Guy Amit, Amit Baras et al.
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
Samuele Poppi, Tobia Poppi, Federico Cocchi et al.
Motion and Structure from Event-based Normal Flow
Zhongyang Ren, Bangyan Liao, Delei Kong et al.
Characteristics Matching Based Hash Codes Generation for Efficient Fine-grained Image Retrieval
Zhen-Duo Chen, Li-Jun Zhao, Zi-Chao Zhang et al.
Distilling ODE Solvers of Diffusion Models into Smaller Steps
Sanghwan Kim, Hao Tang, Fisher Yu
Few-shot NeRF by Adaptive Rendering Loss Regularization
Qingshan Xu, Xuanyu Yi, Jianyao Xu et al.
Multi-Person Pose Forecasting with Individual Interaction Perceptron and Prior Learning
Peng Xiao, Yi Xie, Xuemiao Xu et al.
Cumulative Regret Analysis of the Piyavskii–Shubert Algorithm and Its Variants for Global Optimization
Kaan Gokcesu, Hakan Gökcesu
Improved Bandits in Many-to-One Matching Markets with Incentive Compatibility
Fang Kong, Shuai Li
Free Lunch for Gait Recognition: A Novel Relation Descriptor
Jilong Wang, Saihui Hou, Yan Huang et al.
SIRA: Scalable Inter-frame Relation and Association for Radar Perception
Ryoma Yataka, Pu Wang, Petros Boufounos et al.
Understanding and Improving Optimization in Predictive Coding Networks
Nicholas Alonso, Jeffrey Krichmar, Emre Neftci
PARSAC: Accelerating Robust Multi-Model Fitting with Parallel Sample Consensus
Florian Kluger, Bodo Rosenhahn
VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space
Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda et al.
Neural Point Cloud Diffusion for Disentangled 3D Shape and Appearance Generation
Philipp Schröppel, Christopher Wewer, Jan Lenssen et al.
Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos
Keqiang Sun, Dori Litvak, Yunzhi Zhang et al.
Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems
Ziyuan Luo, Boxin Shi, Haoliang Li et al.
Real-time 3D-aware Portrait Editing from a Single Image
Qingyan Bai, Zifan Shi, Yinghao Xu et al.
Exploring the Feature Extraction and Relation Modeling For Light-Weight Transformer Tracking
Jikai Zheng, Mingjiang Liang, Shaoli Huang et al.
Adversarially Robust Distillation by Reducing the Student-Teacher Variance Gap
Junhao Dong, Piotr Koniusz, Junxi Chen et al.
Smart Help: Strategic Opponent Modeling for Proactive and Adaptive Robot Assistance in Households
Zhihao Cao, ZiDong Wang, Siwen Xie et al.
Massively Scalable Inverse Reinforcement Learning in Google Maps
Matt Barnes, Matthew Abueg, Oliver Lange et al.
Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning
Jacob Springer, Vaishnavh Nagarajan, Aditi Raghunathan
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective
Fangzhou Song, Bin Zhu, Yanbin Hao et al.
Dual-Scale Transformer for Large-Scale Single-Pixel Imaging
Gang Qu, Ping Wang, Xin Yuan
KDProR: A Knowledge-Decoupling Probabilistic Framework for Video-Text Retrieval
Xianwei Zhuang, Hongxiang Li, Xuxin Cheng et al.
Low-Light Face Super-resolution via Illumination, Structure, and Texture Associated Representation
Chenyang Wang, Junjun Jiang, Kui Jiang et al.
FSC: Few-point Shape Completion
Xianzu Wu, Xianfeng Wu, Tianyu Luan et al.
Step Differences in Instructional Video
Tushar Nagarajan, Lorenzo Torresani
VVS: Video-to-Video Retrieval with Irrelevant Frame Suppression
Won Jo, Geuntaek Lim, Gwangjin Lee et al.
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
Mingxuan Liu, Tyler Hayes, Elisa Ricci et al.
Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs
Aayam Shrestha, Pan Liu, German Ros et al.
OctOcc: High-Resolution 3D Occupancy Prediction with Octree
Wenzhe Ouyang, Xiaolin Song, Bailan Feng et al.
Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM
Tongyan Hua, Addison Lin Wang
DNI: Dilutional Noise Initialization for Diffusion Video Editing
Sunjae Yoon, Gwanhyeong Koo, Ji Woo Hong et al.
Improving Open Set Recognition via Visual Prompts Distilled from Common-Sense Knowledge
Seong-Tae Kim, Hyungil Kim, Y. Ro
CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images
Jisu Shin, Junmyeong Lee, Seongmin Lee et al.
Rasterized Edge Gradients: Handling Discontinuities Differentially
Stanislav Pidhorskyi, Tomas Simon, Gabriel Schwartz et al.
Colour Passing Revisited: Lifted Model Construction with Commutative Factors
Malte Luttermann, Tanya Braun, Ralf Möller et al.
Towards More Practical Group Activity Detection: A New Benchmark and Model
Dongkeun Kim, Youngkil Song, Minsu Cho et al.
MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment
Kanglei Zhou, Liyuan Wang, Xingxing Zhang et al.
Towards Modeling Uncertainties of Self-explaining Neural Networks via Conformal Prediction
Wei Qian, Chenxu Zhao, Yangyi Li et al.
Learning to Make Keypoints Sub-Pixel Accurate
Shinjeong Kim, Marc Pollefeys, Daniel Barath
Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping
Hyeongjun Kwon, Jinhyun Jang, Jin Kim et al.
Uncertainty-aware sign language video retrieval with probability distribution modeling
Xuan Wu, Hongxiang Li, Yuanjiang Luo et al.
Grab What You Need: Rethinking Complex Table Structure Recognition with Flexible Components Deliberation
Hao Liu, Xin Li, Mingming Gong et al.
SpikeNeRF: Learning Neural Radiance Fields from Continuous Spike Stream
Lin Zhu, Kangmin Jia, Yifan Zhao et al.
Towards Accurate and Robust Architectures via Neural Architecture Search
Yuwei Ou, Yuqi Feng, Yanan Sun
RG-GAN: Dynamic Regenerative Pruning for Data-Efficient Generative Adversarial Networks
Divya Saxena, Jiannong Cao, Jiahao Xu et al.
Estimating Conditional Mutual Information for Dynamic Feature Selection
Soham Gadgil, Ian Covert, Su-In Lee
ViPer: Visual Personalization of Generative Models via Individual Preference Learning
Sogand Salehi, Mahdi Shafiei, Roman Bachmann et al.
Recursive Visual Programming
Jiaxin Ge, Sanjay Subramanian, Baifeng Shi et al.
Leveraging Thermal Modality to Enhance Reconstruction in Low-Light Conditions
Jiacong Xu, Mingqian Liao, Ram Prabhakar Kathirvel et al.
Leveraging temporal contextualization for video action recognition
Minji Kim, Dongyoon Han, Taekyung Kim et al.
Single Mesh Diffusion Models with Field Latents for Texture Generation
Thomas W. Mitchel, Carlos Esteves, Ameesh Makadia
Domain Shifting: A Generalized Solution for Heterogeneous Cross-Modality Person Re-Identification
Yan Jiang, Xu Cheng, Hao Yu et al.
Unsigned Orthogonal Distance Fields: An Accurate Neural Implicit Representation for Diverse 3D Shapes
YuJie Lu, Long Wan, Nayu Ding et al.
Omnipotent Distillation with LLMs for Weakly-Supervised Natural Language Video Localization:
Peijun Bao, Zihao Shao, Wenhan Yang et al.
Symmetric Self-Paced Learning for Domain Generalization
Di Zhao, Yun Sing Koh, Gillian Dobbie et al.
Learning from Sparse Offline Datasets via Conservative Density Estimation
Zhepeng Cen, Zuxin Liu, Zitong Wang et al.
Multi-Domain Recommendation to Attract Users via Domain Preference Modeling
Hyunjun Ju, SeongKu Kang, Dongha Lee et al.
Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search
Thomy Phan, Taoan Huang, Bistra Dilkina et al.
SaCo Loss: Sample-wise Affinity Consistency for Vision-Language Pre-training
Wu Sitong, Haoru Tan, Zhuotao Tian et al.
Towards Optimal Subsidy Bounds for Envy-Freeable Allocations
Yasushi Kawase, Kazuhisa Makino, Hanna Sumita et al.
Coupling Graph Neural Networks with Fractional Order Continuous Dynamics: A Robustness Study
Qiyu Kang, Kai Zhao, Yang Song et al.
How to Use the Metropolis Algorithm for Multi-Objective Optimization?
Weijie Zheng, Mingfeng Li, Renzhong Deng et al.
Geometry-Guided Domain Generalization for Monocular 3D Object Detection
Fan Yang, Hui Chen, Yuwei He et al.
Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks
Fuzhi Wu, Jiasong Wu, Youyong Kong et al.
Poincaré Differential Privacy for Hierarchy-Aware Graph Embedding
Yuecen Wei, Haonan Yuan, Xingcheng Fu et al.
LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement
Renyuan Peng, Xinyue Cai, Hang Xu et al.
PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation
Ning Gao, Sanping Zhou, Le Wang et al.
Unraveling Batch Normalization for Realistic Test-Time Adaptation
Zixian Su, Jingwei Guo, Kai Yao et al.
Training-free Composite Scene Generation for Layout-to-Image Synthesis
Jiaqi Liu, Tao Huang, Chang Xu
FedLPS: Heterogeneous Federated Learning for Multiple Tasks with Local Parameter Sharing
Yongzhe Jia, Xuyun Zhang, Amin Beheshti et al.
VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving
Yibo Liu, Zheyuan Yang, Guile Wu et al.
Understanding prompt engineering may not require rethinking generalization
Victor Akinwande, Yiding Jiang, Dylan Sam et al.
Learning Deformable Hypothesis Sampling for Accurate PatchMatch Multi-View Stereo
Hongjie Li, Yao Guo, Xianwei Zheng et al.
Continuous Optical Zooming: A Benchmark for Arbitrary-Scale Image Super-Resolution in Real World
Huiyuan Fu, Fei Peng, Xianwei Li et al.
Reverse Multi-Choice Dialogue Commonsense Inference with Graph-of-Thought
Li Zheng, Hao Fei, Fei Li et al.
Joint Learning Neuronal Skeleton and Brain Circuit Topology with Permutation Invariant Encoders for Neuron Classification
Minghui Liao, Guojia Wan, Bo Du
Contributing Dimension Structure of Deep Feature for Coreset Selection
Zhijing Wan, Zhixiang Wang, Yuran Wang et al.
Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models
Zhengming Yu, Zhiyang Dou, Xiaoxiao Long et al.
Completing Priceable Committees: Utilitarian and Representation Guarantees for Proportional Multiwinner Voting
Markus Brill, Jannik Peters
Spectrum AUC Difference (SAUCD): Human-aligned 3D Shape Evaluation
Tianyu Luan, Zhong Li, Lele Chen et al.
Relaxing the Additivity Constraints in Decentralized No-Regret High-Dimensional Bayesian Optimization
Anthony Bardou, Patrick Thiran, Thomas Begin
Cross-Class Feature Augmentation for Class Incremental Learning
Taehoon Kim, JaeYoo Park, Bohyung Han
MobileInst: Video Instance Segmentation on the Mobile
Renhong Zhang, Tianheng Cheng, Shusheng Yang et al.
Delivering Inflated Explanations
Yacine Izza, Alexey Ignatiev, Peter Stuckey et al.
Improved MLP Point Cloud Processing with High-Dimensional Positional Encoding
Yanmei Zou, Hongshan Yu, Zhengeng Yang et al.
Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing
Jinmin He, Kai Li, Yifan Zang et al.
Domain Generalization with Vital Phase Augmentation
Ingyun Lee, WooJu Lee, Hyun Myung
Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks
Khurram Javed, Haseeb Shah, Richard Sutton et al.
Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence
Mengyao Lyu, Tianxiang Hao, Xinhao Xu et al.
Exact ASP Counting with Compact Encodings
Mohimenul Kabir, Supratik Chakraborty, Kuldeep S Meel
Arbitrary-Scale Point Cloud Upsampling by Voxel-Based Network with Latent Geometric-Consistent Learning
Hang Du, Xuejun Yan, Jingjing Wang et al.
HiHPQ: Hierarchical Hyperbolic Product Quantization for Unsupervised Image Retrieval
Zexuan Qiu, Jiahong Liu, Yankai Chen et al.
Parsing All Adverse Scenes: Severity-Aware Semantic Segmentation with Mask-Enhanced Cross-Domain Consistency
Fuhao Li, Ziyang Gong, Yupeng Deng et al.
PTMQ: Post-training Multi-Bit Quantization of Neural Networks
Ke Xu, Zhongcheng Li, Shanshan Wang et al.
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis
Marianna Ohanyan, Hayk Manukyan, Zhangyang Wang et al.
EgoPet: Egomotion and Interaction Data from an Animal's Perspective
Amir Bar, Arya Bakhtiar, Danny L Tran et al.
OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning
Geng Xinyu, Jiaming Wang, Jiawei Gong et al.
Graph Neural Network Causal Explanation via Neural Causal Models
Arman Behnam, Binghui Wang
Learning Decentralized Partially Observable Mean Field Control for Artificial Collective Behavior
Kai Cui, Sascha Hauck, Christian Fabian et al.
LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images
Jing Zhang, Irving Fang, Hao Wu et al.
ContextSeg: Sketch Semantic Segmentation by Querying the Context with Attention
Jiawei Wang, Changjian Li
Let All Be Whitened: Multi-Teacher Distillation for Efficient Visual Retrieval
Zhe Ma, Jianfeng Dong, Shouling Ji et al.
BOK-VQA: Bilingual Outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining
Minjun Kim, SeungWoo Song, Youhan Lee et al.
Multi-Dimensional Fair Federated Learning
Cong Su, Guoxian Yu, Jun Wang et al.
Semi-supervised Class-Agnostic Motion Prediction with Pseudo Label Regeneration and BEVMix
Kewei Wang, Yizheng Wu, Zhiyu Pan et al.
Cross-Modal Match for Language Conditioned 3D Object Grounding
Yachao Zhang, Runze Hu, Ronghui Li et al.
Synergistic Global-space Camera and Human Reconstruction from Videos
Yizhou Zhao, Tuanfeng Y. Wang, Bhiksha Raj et al.
VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation
Jialu Li, Aishwarya Padmakumar, Gaurav Sukhatme et al.
MemoNav: Working Memory Model for Visual Navigation
Hongxin Li, Zeyu Wang, Xu Yang et al.
A Theory of Joint Light and Heat Transport for Lambertian Scenes
Mani Ramanagopal, Sriram Narayanan, Aswin C. Sankaranarayanan et al.
Integrating Efficient Optimal Transport and Functional Maps For Unsupervised Shape Correspondence Learning
Tung Le, Khai Nguyen, Shanlin Sun et al.
Multi-View Dynamic Reflection Prior for Video Glass Surface Detection
Fang Liu, Yuhao Liu, Jiaying Lin et al.
Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation
Yingshan Chang, Yasi Zhang, Zhiyuan Fang et al.
RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception
Shen Jianbing, Chunliang Li, Wencheng Han et al.
CloudFixer: Test-Time Adaptation for 3D Point Clouds via Diffusion-Guided Geometric Transformation
Hajin Shim, Changhun Kim, Eunho Yang
Open-Set Recognition in the Age of Vision-Language Models
Dimity Miller, Niko Suenderhauf, Alex Kenna et al.
Part2Object: Hierarchical Unsupervised 3D Instance Segmentation
Cheng Shi, Yulin Zhang, Bin Yang et al.
REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models
Agneet Chatterjee, Yiran Luo, Tejas Gokhale et al.
Mixture of Weak and Strong Experts on Graphs
Hanqing Zeng, Hanjia Lyu, Diyi Hu et al.
X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs
Swetha Sirnam, Jinyu Yang, Tal Neiman et al.
Post-hoc bias scoring is optimal for fair classification
Wenlong Chen, Yegor Klochkov, Yang Liu
GarmentAligner: Text-to-Garment Generation via Retrieval-augmented Multi-level Corrections
Shiyue Zhang, Zheng Chong, Xujie Zhang et al.
Identification of Necessary Semantic Undertakers in the Causal View for Image-Text Matching
Huatian Zhang, Lei Zhang, Kun Zhang et al.
CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing
Faegheh Sardari, Armin Mustafa, Philip JB Jackson et al.
Anchor-based Robust Finetuning of Vision-Language Models
Jinwei Han, Zhiwen Lin, Zhongyi Sun et al.
CatFormer: Category-Level 6D Object Pose Estimation with Transformer
Sheng Yu, Dihua Zhai, Yuanqing Xia
Temporal-Mapping Photography for Event Cameras
Yuhan Bao, Lei Sun, Yuqin Ma et al.
Multilinear Operator Networks
Yixin Cheng, Grigorios Chrysos, Markos Georgopoulos et al.
Data-Efficient Multimodal Fusion on a Single GPU
Noël Vouitsis, Zhaoyan Liu, Satya Krishna Gorti et al.
Self-Training Based Few-Shot Node Classification by Knowledge Distillation
Zongqian Wu, Yujie Mo, Peng Zhou et al.
Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models
Minchan Kim, Minyeong Kim, Junik Bae et al.
SC-NeuS: Consistent Neural Surface Reconstruction from Sparse and Noisy Views
Shi-Sheng Huang, Zixin Zou, Yichi Zhang et al.
Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks
Yanqiao Zhu, Jeehyun Hwang, Keir Adams et al.
Learning to Compose: Improving Object Centric Learning by Injecting Compositionality
Whie Jung, Jaehoon Yoo, Sungjin Ahn et al.
Distributionally Robust Loss for Long-Tailed Multi-Label Image Classification
Dekun Lin, Zhe Cui, Rui Chen et al.
BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation
Jiahao Lu, Jiacheng Deng, Tianzhu Zhang
CoSIGN: Few-Step Guidance of ConSIstency Model to Solve General INverse Problems
Jiankun Zhao, Bowen Song, Liyue Shen
LAMPAT: Low-Rank Adaption for Multilingual Paraphrasing Using Adversarial Training
Khoi M. Le, Trinh Pham, Tho Quan et al.
Efficient Privacy-Preserving Visual Localization Using 3D Ray Clouds
Heejoon Moon, Chunghwan Lee, Je Hyeong Hong
LTA-PCS: Learnable Task-Agnostic Point Cloud Sampling
Jiaheng Liu, Jianhao Li, Kaisiyuan Wang et al.
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos
Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang et al.
B-spine: Learning B-spline Curve Representation for Robust and Interpretable Spinal Curvature Estimation
Hao Wang, Qiang Song, Ruofeng Yin et al.
Hiding Imperceptible Noise in Curvature-Aware Patches for 3D Point Cloud Attack
Mingyu Yang, Daizong Liu, Keke Tang et al.
Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement
Lingyu Zhu, Wenhan Yang, Baoliang Chen et al.
Patched Line Segment Learning for Vector Road Mapping
Jiakun Xu, Bowen Xu, Gui-Song Xia et al.
Demystifying Poisoning Backdoor Attacks from a Statistical Perspective
Ganghua Wang, Xun Xian, Ashish Kundu et al.
Federated Causal Discovery from Heterogeneous Data
Loka Li, Ignavier Ng, Gongxu Luo et al.
Hierarchical Correlation Clustering and Tree Preserving Embedding
Morteza Haghir Chehreghani, Mostafa Haghir Chehreghani
On Pretraining Data Diversity for Self-Supervised Learning
Hasan Abed El Kader Hammoud, Tuhin Das, Fabio Pizzati et al.
Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement
Haodong Li, Hao Lu, Yingcong Chen
Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners
Chun Feng, Joy Hsu, Weiyu Liu et al.
Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction
Guowei Xu, Jiale Tao, Wen Li et al.