Most Cited AAAI Oral "logit distillation" Papers
5,317 papers found • Page 1 of 27
Conference
T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion
Chong Mou, Xintao Wang, Liangbin Xie et al.
Benchmarking Large Language Models in Retrieval-Augmented Generation
Jiawei Chen, Hongyu Lin, Xianpei Han et al.
Preference Ranking Optimization for Human Alignment
Feifan Song, Bowen Yu, Minghao Li et al.
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
Yichen Gong, Delong Ran, Jinyuan Liu et al.
Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos
Yue Ma, Yingqing HE, Xiaodong Cun et al.
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
Gengze Zhou, Yicong Hong, Qi Wu
NuScenes-QA: A Multi-Modal Visual Question Answering Benchmark for Autonomous Driving
Tianwen Qian, Jingjing Chen, Linhai Zhuo et al.
MedSegDiff-V2: Diffusion-based Medical Image Segmentation with Transformer
Junde Wu, Wei Ji, Huazhu Fu et al.
Detecting and Preventing Hallucinations in Large Vision Language Models
Anisha Gunjal, Jihan Yin, Erhan Bas
AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models
Zhaopeng Gu, Bingke Zhu, Guibo Zhu et al.
Omni-Kernel Network for Image Restoration
Yuning Cui, Wenqi Ren, Alois Knoll
Knowledge Graph Prompting for Multi-Document Question Answering
Yu Wang, Nedim Lipka, Ryan A. Rossi et al.
Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-World Multi-Turn Dialogue
Songhua Yang, Hanjie Zhao, Senbin Zhu et al.
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
Wenbo Hu, Yifan Xu, Yi Li et al.
MSGNet: Learning Multi-Scale Inter-series Correlations for Multivariate Time Series Forecasting
Wanlin Cai, Yuxuan Liang, Xianggen Liu et al.
ODTrack: Online Dense Temporal Token Learning for Visual Tracking
Yaozong Zheng, Bineng Zhong, Qihua Liang et al.
Fast Machine Unlearning without Retraining through Selective Synaptic Dampening
Jack Foster, Stefan Schoepf, Alexandra Brintrup
VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection
Peng Wu, Xuerong Zhou, Guansong Pang et al.
ResDiff: Combining CNN and Diffusion Model for Image Super-resolution
Shuyao Shang, Zhengyang Shan, Guangxing Liu et al.
SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery
Konstantin Klemmer, Esther Rolf, Caleb Robinson et al.
Task Contamination: Language Models May Not Be Few-Shot Anymore
Changmao Li, Jeffrey Flanigan
SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research
Liangtai Sun, Yang Han, Zihan Zhao et al.
SCTNet: Single Branch CNN with Transformer Semantic Information for Real-Time Segmentation
Authors: Zhengze Xu, Dongyue Wu, Changqian Yu et al.
FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering
Zhenyu Li, Sunqi Fan, Yu Gu et al.
C3oT: Generating Shorter Chain-of-Thought Without Compromising Effectiveness
Yu Kang, Xianghui Sun, Liangyu Chen et al.
Pinwheel-shaped Convolution and Scale-based Dynamic Loss for Infrared Small Target Detection
Jiangnan Yang, Shuangli Liu, Jingjun Wu et al.
SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation
Wenxi Yue, Jing Zhang, Kun Hu et al.
Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-Based Retrofitting
Xinyan Guan, Yanjiang Liu, Hongyu Lin et al.
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
Han Zhao, Min Zhang, Wei Zhao et al.
Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations
Likang Wu, Zhaopeng Qiu, Zhi Zheng et al.
TimesURL: Self-Supervised Contrastive Learning for Universal Time Series Representation Learning
jiexi Liu, Songcan Chen
Fully-Connected Spatial-Temporal Graph for Multivariate Time-Series Data
Yucheng Wang, Yuecong Xu, Jianfei Yang et al.
OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
Changhun Lee, Jungyu Jin, Taesu Kim et al.
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Xianjie Wu, Jian Yang, Linzheng Chai et al.
An Attentive Inductive Bias for Sequential Recommendation beyond the Self-Attention
Yehjin Shin, Jeongwhan Choi, Hyowon Wi et al.
Rolling-Unet: Revitalizing MLP’s Ability to Efficiently Extract Long-Distance Dependencies for Medical Image Segmentation
Yutong Liu, Haijiang Zhu, Mengting Liu et al.
UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation
Kefu Yi, Kai Luo, Xiaolei Luo et al.
Fluctuation-Based Adaptive Structured Pruning for Large Language Models
Yongqi An, Xu Zhao, Tao Yu et al.
An Empirical Study of CLIP for Text-Based Person Search
Cao Min, Yang Bai, ziyin Zeng et al.
8976 PointAttN: You Only Need Attention for Point Cloud Completion
Jun Wang, Ying Cui, Dongyan Guo et al.
Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models
Ruichen Wang, Zekang Chen, Chen Chen et al.
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
Taylor Sorensen, Liwei Jiang, Jena Hwang et al.
Decoupled Contrastive Multi-View Clustering with High-Order Random Walks
Yiding Lu, Yijie Lin, Mouxing Yang et al.
Reliable Conflictive Multi-View Learning
Cai Xu, Jiajun Si, Ziyu Guan et al.
FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning
Haokun Chen, Yao Zhang, Denis Krompass et al.
VIGC: Visual Instruction Generation and Correction
Théo Delemazure, Jérôme Lang, Grzegorz Pierczyński
VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding
Yi Xin, Junlong Du, Qiang Wang et al.
Prompt-Based Distribution Alignment for Unsupervised Domain Adaptation
Shuanghao Bai, Min Zhang, Wanqi Zhou et al.
DepthFM: Fast Generative Monocular Depth Estimation with Flow Matching
Ming Gui, Johannes Schusterbauer, Ulrich Prestel et al.
Point Cloud Mamba: Point Cloud Learning via State Space Model
Tao Zhang, Haobo Yuan, Lu Qi et al.
DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection
Yunfan Ye, Yuhang Huang, Renjiao Yi et al.
AnalogCoder: Analog Circuit Design via Training-Free Code Generation
Yao Lai, Sungyoung Lee, Guojin Chen et al.
KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning
Debjyoti Mondal, Suraj Modi, Subhadarshi Panda et al.
MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning
Baoquan Zhang, Chuyao Luo, Demin Yu et al.
GLOP: Learning Global Partition and Local Construction for Solving Large-Scale Routing Problems in Real-Time
Haoran Ye, Jiarui Wang, Helan Liang et al.
RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting
Lei Shu, Liangchen Luo, Jayakumar Hoskere et al.
FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning
Zhenhua Yang, Dezhi Peng, Yuxin Kong et al.
WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration
Yao Zhang, Zijian Ma, Yunpu Ma et al.
Graph Neural Prompting with Large Language Models
Yijun Tian, Huan Song, Zichen Wang et al.
V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models
Heng Wang, Jianbo Ma, Santiago Pascual et al.
Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models
Wenbin Wang, Liang Ding, Minyan Zeng et al.
VerilogCoder: Autonomous Verilog Coding Agents with Graph-based Planning and Abstract Syntax Tree (AST)-based Waveform Tracing Tool
Chia-Tung Ho, Haoxing Ren, Brucek Khailany
ChatTime: A Unified Multimodal Time Series Foundation Model Bridging Numerical and Textual Data
Chengsen Wang, Qi Qi, Jingyu Wang et al.
Enhancing Job Recommendation through LLM-Based Generative Adversarial Networks
Yingpeng Du, Di Luo, Rui Yan et al.
Temporal Adaptive RGBT Tracking with Modality Prompt
Hongyu Wang, Xiaotao Liu, Yifan Li et al.
FedASMU: Efficient Asynchronous Federated Learning with Dynamic Staleness-Aware Model Update
Ji Liu, Juncheng Jia, Tianshi Che et al.
Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
Zhewei Yao, Xiaoxia Wu, Cheng Li et al.
SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation
Malyaban Bal, Abhronil Sengupta
Plug-In Diffusion Model for Sequential Recommendation
Haokai Ma, Ruobing Xie, Lei Meng et al.
DiT4Edit: Diffusion Transformer for Image Editing
Kunyu Feng, Yue Ma, Bingyuan Wang et al.
Learning to Rank in Generative Retrieval
Yongqi Li, Nan Yang, Liang Wang et al.
NeRF-LiDAR: Generating Realistic LiDAR Point Clouds with Neural Radiance Fields
Junge Zhang, Feihu Zhang, Shaochen Kuang et al.
Make RepVGG Greater Again: A Quantization-Aware Approach
Xuesong Nie, Yunfeng Yan, Siyuan Li et al.
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
Zhen Ye, Peiwen Sun, Jiahe Lei et al.
DiffusionTrack: Diffusion Model for Multi-Object Tracking
Run Luo, Zikai Song, Lintao Ma et al.
ELLA-V: Stable Neural Codec Language Modeling with Alignment-Guided Sequence Reordering
Yakun Song, Zhuo Chen, Xiaofei Wang et al.
HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors
Xiao Wang, Zongzhen Wu, Bo Jiang et al.
Key-Point-Driven Data Synthesis with Its Enhancement on Mathematical Reasoning
Yiming Huang, Xiao Liu, Yeyun Gong et al.
Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models
Fei Shen, Hu Ye, Sibo Liu et al.
Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders
Yaohua Zha, Huizhen Ji, Jinmin Li et al.
HGPrompt: Bridging Homogeneous and Heterogeneous Graphs for Few-Shot Prompt Learning
Xingtong Yu, Yuan Fang, Zemin Liu et al.
Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons
Yuheng Chen, Pengfei Cao, Yubo Chen et al.
DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju, Peng Tang, Qi Dong et al.
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
Shilin Yan, Renrui Zhang, Ziyu Guo et al.
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning
Wenwen Zhuang, Xin Huang, Xiantao Zhang et al.
Ternary Spike: Learning Ternary Spikes for Spiking Neural Networks
Yufei Guo, Yuanpei Chen, Xiaode Liu et al.
MASTER: Market-Guided Stock Transformer for Stock Price Forecasting
Tong Li, Zhaoyang Liu, Yanyan Shen et al.
PC-Conv: Unifying Homophily and Heterophily with Two-Fold Filtering
Bingheng Li, Erlin Pan, Zhao Kang
Correlation Matching Transformation Transformers for UHD Image Restoration
Cong Wang, Jinshan Pan, Wei Wang et al.
Large Language Models Are Clinical Reasoners: Reasoning-Aware Diagnosis Framework with Prompt-Generated Rationales
Taeyoon Kwon, Kai Ong, Dongjin Kang et al.
Context-I2W: Mapping Images to Context-Dependent Words for Accurate Zero-Shot Composed Image Retrieval
Yuanmin Tang, Jing Yu, Keke Gai et al.
Editing Language Model
Based Knowledge Graph Embeddings
SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency
8137 Feiyu Zhu, Reid Simmons
SECap: Speech Emotion Captioning with Large Language Model
Yaoxun Xu, Hangting Chen, Jianwei Yu et al.
FBRT-YOLO: Faster and Better for Real-Time Aerial Image Detection
Yao Xiao, Tingfa Xu, Yu Xin et al.
DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency
Wenfang Yao, Kejing Yin, William Cheung et al.
Delving into Multimodal Prompting for Fine-Grained Visual Classification
Xin Jiang, Hao Tang, Junyao Gao et al.
TEILP: Time Prediction over Knowledge Graphs via Logical Reasoning
Siheng Xiong, Yuan Yang, Ali Payani et al.
VLCounter: Text-Aware Visual Representation for Zero-Shot Object Counting
Seunggu Kang, WonJun Moon, Euiyeon Kim et al.
Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy
Yu Fu, Deyi Xiong, Yue Dong
LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time
Sensitive Test Construction - Yucheng Li, Frank Guerin, Chenghua Lin
GAMC: An Unsupervised Method for Fake News Detection Using Graph Autoencoder with Masking
Shu Yin, Peican Zhu, Lianwei Wu et al.
SQLdepth: Generalizable Self-Supervised Fine-Structured Monocular Depth Estimation
Dong Wu, Mingmin Chi, Xuan Zang et al.
Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models
Weihao Ye, Qiong Wu, Wenhao Lin et al.
Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark
Fangjun Li, David C. Hogg, Anthony G. Cohn
MENTOR: Multi-level Self-supervised Learning for Multimodal Recommendation
Jinfeng Xu, Zheyu Chen, Shuo Yang et al.
Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient
Yongliang Wu, Shiji Zhou, Mingzhuo Yang et al.
TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers
Chuanrui Zhang, Yingshuang Zou, Zhuoling Li et al.
Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations
Yufeng Huang, Jiji Tang, Zhuo Chen et al.
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators
Chen Zhang, L. F. D’Haro, Yiming Chen et al.
Feature Fusion from Head to Tail for Long-Tailed Visual Recognition
Mengke Li, Zhikai HU, Yang Lu et al.
Calibrating Large Language Models with Sample Consistency
Qing Lyu, Kumar Shridhar, Chaitanya Malaviya et al.
LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection
hongcheng Guo, Jian Yang, Jiaheng Liu et al.
Language Model Can Listen While Speaking
Ziyang Ma, Yakun Song, Chenpeng Du et al.
EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering
Junjue Wang, Zhuo Zheng, Zihang Chen et al.
V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning
Hang Hua, Yunlong Tang, Chenliang Xu et al.
S2WAT: Image Style Transfer via Hierarchical Vision Transformer Using Strips Window Attention
Chiyu Zhang, Xiaogang Xu, Lei Wang et al.
MultiBooth: Towards Generating All Your Concepts in an Image from Text
Chenyang Zhu, Kai Li, Yue Ma et al.
Image Conductor: Precision Control for Interactive Video Synthesis
Yaowei Li, Xintao Wang, Zhaoyang Zhang et al.
Cross-Layer and Cross-Sample Feature Optimization Network for Few-Shot Fine-Grained Image Classification
Zhen-Xiang Ma, Zhen-Duo Chen, Li-Jun Zhao et al.
TOP-ReID: Multi-Spectral Object Re-identification with Token Permutation
Yuhao Wang, Xuehu Liu, Pingping Zhang et al.
Fine-Grained Prototypes Distillation for Few-Shot Object Detection
Zichen Wang, Bo Yang, Haonan Yue et al.
End-to-End Autonomous Driving Through V2X Cooperation
Haibao Yu, Wenxian Yang, Jiaru Zhong et al.
DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models
Namhyuk Ahn, Junsoo Lee, Chunggi Lee et al.
Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style
Shuai Tan, Bin Ji, Ye Pan
Debiasing Multimodal Sarcasm Detection with Contrastive Learning
Mengzhao Jia, Can Xie, Liqiang Jing
Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries
Xinyi He, Mengyu Zhou, Xinrun Xu et al.
DLF: Disentangled-Language-Focused Multimodal Sentiment Analysis
Pan Wang, Qiang Zhou, Yawen Wu et al.
Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition
Kun Li, Dan Guo, Guoliang Chen et al.
PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine
Chenrui Zhang, Lin Liu, Chuyuan Wang et al.
TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling
Shimin Zhang, Qu Yang, Chenxiang Ma et al.
Learning to Prompt with Text Only Supervision for Vision-Language Models
Muhammad Uzair Khattak, Muhammad Ferjad Naeem, Muzammal Naseer et al.
Attribute-Missing Graph Clustering Network
Wenxuan Tu, Renxiang Guan, Sihang Zhou et al.
A Diffusion-Based Framework for Multi-Class Anomaly Detection
Haoyang He, Jiangning Zhang, Hongxu Chen et al.
Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval
Zhihang Liu, Jun Li, Hongtao Xie et al.
Devignet: High-Resolution Vignetting Removal via a Dual Aggregated Fusion Transformer with Adaptive Channel Expansion
Shenghong Luo, Xuhang Chen, Weiwen Chen et al.
HSEvo: Elevating Automatic Heuristic Design with Diversity-Driven Harmony Search and Genetic Algorithm Using LLMs
Pham Vu Tuan Dat, Long Doan, Huynh Thi Thanh Binh
RATT: A Thought Structure for Coherent and Correct LLM Reasoning
Jinghan Zhang, Xiting Wang, Weijieying Ren et al.
Multi-Architecture Multi-Expert Diffusion Models
Yunsung Lee, Jin-Young Kim, Hyojun Go et al.
No Prejudice! Fair Federated Graph Neural Networks for Personalized Recommendation
Nimesh Agrawal, Anuj Sirohi, Sandeep Kumar et al.
Towards Continual Knowledge Graph Embedding via Incremental Distillation
Jiajun Liu, Ke Wenjun, Peng Wang et al.
Controllable Mind Visual Diffusion Model
Bohan Zeng, Shanglin Li, Xuhui Liu et al.
Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation
Clément Chadebec, Onur Tasar, Eyal Benaroche et al.
CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility
Bojia Zi, Shihao Zhao, Xianbiao Qi et al.
XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning
Pritam Sarkar, Ali Etemad
Frequency Spectrum Is More Effective for Multimodal Representation and Fusion: A Multimodal Spectrum Rumor Detector
An Lao, Qi Zhang, Chongyang Shi et al.
Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking
Xiantao Hu, Ying Tai, Xu Zhao et al.
Text-Guided Molecule Generation with Diffusion Language Model
Haisong Gong, Qiang Liu, Shu Wu et al.
Few Shot Part Segmentation Reveals Compositional Logic for Industrial Anomaly Detection
Soopil Kim, Sion An, Philip Chikontwe et al.
Latent Space Editing in Transformer-Based Flow Matching
Vincent Tao Hu, Wei Zhang, Meng Tang et al.
Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer
Yaoting Wang, Liu Weisong, Guangyao Li et al.
MathAttack: Attacking Large Language Models towards Math Solving Ability
Zihao Zhou, Qiufeng Wang, Mingyu Jin et al.
SlowTrack: Increasing the Latency of Camera-Based Perception in Autonomous Driving Using Adversarial Examples
Chen Ma, Ningfei Wang, Qi Alfred Chen et al.
SUTrack: Towards Simple and Unified Single Object Tracking
Xin Chen, Ben Kang, Wanting Geng et al.
Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning
Shangchao Su, Mingzhao Yang, Bin Li et al.
TinySAM: Pushing the Envelope for Efficient Segment Anything Model
Han Shu, Wenshuo Li, Yehui Tang et al.
STEM: Unleashing the Power of Embeddings for Multi-Task Recommendation
Liangcai Su, Junwei Pan, Ximei Wang et al.
Evaluating the Evaluator: Measuring LLMs’ Adherence to Task Evaluation Instructions
Bhuvanashree Murugadoss, Christian Poelitz, Ian Drosos et al.
DiffBEV: Conditional Diffusion Model for Bird’s Eye View Perception
Jiayu Zou, Kun Tian, Zheng Zhu et al.
Causal Prompting: Debiasing Large Language Model Prompting Based on Front-Door Adjustment
Congzhi Zhang, Linhai Zhang, Jialong Wu et al.
U-mixer: An Unet-Mixer Architecture with Stationarity Correction for Time Series Forecasting
Xiang Ma, Xuemei Li, Lexin Fang et al.
Exploiting Label Skews in Federated Learning with Model Concatenation
Yiqun Diao, Qinbin Li, Bingsheng He
LION: Implicit Vision Prompt Tuning
Haixin Wang, Jianlong Chang, Yihang Zhai et al.
Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle
Zhenyu Tang, Junwu Zhang, Xinhua Cheng et al.
NeuSurf: On-Surface Priors for Neural Surface Reconstruction from Sparse Input Views
Han Huang, Yulun Wu, Junsheng Zhou et al.
SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM Optimization
Zhenlong Yuan, Jiakai Cao, Zhaoxin Li et al.
Mono3DVG: 3D Visual Grounding in Monocular Images
Yangfan Zhan, Yuan Yuan, Zhitong Xiong
Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models
Shuang Li, Jiangjie Chen, Siyu Yuan et al.
Multi-Objective Evolution of Heuristic Using Large Language Model
Shunyu Yao, Fei Liu, Xi Lin et al.
DrivingForward: Feed-forward 3D Gaussian Splatting for Driving Scene Reconstruction from Flexible Surround-view Input
Qijian Tian, Xin Tan, Yuan Xie et al.
Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model
Decheng Liu, Xijun Wang, Chunlei Peng et al.
Multi-Modal Latent Space Learning for Chain-of-Thought Reasoning in Language Models
Liqi He, Zuchao Li, Xiantao Cai et al.
Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning
Jinsong Shi, Pan Gao, Jie Qin
FairSIN: Achieving Fairness in Graph Neural Networks through Sensitive Information Neutralization
Cheng Yang, Jixi Liu, Yunhe Yan et al.
Stable-Hair: Real-World Hair Transfer via Diffusion Model
Yuxuan Zhang, Qing Zhang, Yiren Song et al.
Provably Powerful Graph Neural Networks for Directed Multigraphs
Beni Egressy, Luc von Niederhäusern, Jovan Blanuša et al.
MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL
Arian Askari, Christian Poelitz, Xinye Tang
Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition
Qianrui Zhou, Hua Xu, Hao Li et al.
Synergistic Multiscale Detail Refinement via Intrinsic Supervision for Underwater Image Enhancement
Dehuan Zhang, Jingchun Zhou, Chunle Guo et al.
SCALM: Detecting Bad Practices in Smart Contracts Through LLMs
Zongwei Li, Xiaoqi Li, Wenkai Li et al.
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
Yiwen Tang, Ray Zhang, Zoey Guo et al.
Explaining Generalization Power of a DNN Using Interactive Concepts
Huilin Zhou, Hao Zhang, Huiqi Deng et al.
Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models
Lingzhi Wang, Xingshan Zeng, Jinsong Guo et al.
Multi-Prompts Learning with Cross-Modal Alignment for Attribute-Based Person Re-identification
Yajing Zhai, Yawen Zeng, Zhiyong Huang et al.
Concept-Guided Prompt Learning for Generalization in Vision-Language Models
Yi Zhang, Ce Zhang, Ke Yu et al.
Frequency-Adaptive Pan-Sharpening with Mixture of Experts
Xuanhua He, Keyu Yan, Rui Li et al.
Graph-Aware Contrasting for Multivariate Time-Series Classification
Yucheng Wang, Yuecong Xu, Jianfei Yang et al.
Rethinking Graph Masked Autoencoders through Alignment and Uniformity
Liang Wang, Xiang Tao, Qiang Liu et al.
Graph Invariant Learning with Subgraph Co-mixup for Out-of-Distribution Generalization
Tianrui Jia, Haoyang Li, Cheng Yang et al.
CFR-ICL: Cascade-Forward Refinement with Iterative Click Loss for Interactive Image Segmentation
Shoukun Sun, Min Xian, Fei Xu et al.
Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction
Senqiao Yang, Jiarui Wu, Jiaming Liu et al.
Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirection Guidance
Wenhao Sun, Xue-Mei Dong, Benlei Cui et al.
Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference
Barys Liskavets, Maxim Ushakov, Shuvendu Roy et al.
Audio Generation with Multiple Conditional Diffusion Model
Zhifang Guo, Jianguo Mao, Tao Rui et al.
Distilling Autoregressive Models to Obtain High-Performance Non-autoregressive Solvers for Vehicle Routing Problems with Faster Inference Speed
Yubin Xiao, Di Wang, Boyang Li et al.
G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model
Pan Xie, Qipeng Zhang, Peng Taiying et al.
Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention
Saebom Leem, Hyunseok Seo
Guided Real Image Dehazing Using YCbCr Color Space
Wenxuan Fang, Junkai Fan, Yu Zheng et al.
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation
Haiming Zhang, Xu Yan, Dongfeng Bai et al.
TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection
Tianxiang Chen, Zhentao Tan, Qi Chu et al.