Ser-Nam Lim

43
Papers
253
Total Citations

Papers (43)

On the Robustness of Large Multimodal Models Against Image Adversarial Attacks

CVPR 2024 | arXiv
80
citations

Jack of All Tasks Master of Many: Designing General-Purpose Coarse-to-Fine Vision-Language Model

CVPR 2024 | arXiv
50
citations

Few-Shot Object Detection with Foundation Models

CVPR 2024
50
citations

Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval

CVPR 2024 | arXiv
21
citations

Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly

CVPR 2025 | arXiv
19
citations

Composing Object Relations and Attributes for Image-Text Matching

CVPR 2024 | arXiv
18
citations

Fast Encoding and Decoding for Implicit Video Representation

ECCV 2024 | arXiv
7
citations

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

ICCV 2025
6
citations

Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models

ICCV 2025 | arXiv
2
citations

Generative Zero-Shot Composed Image Retrieval

CVPR 2025
0
citations

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses

ICCV 2025 | arXiv
0
citations

UniMODE: Unified Monocular 3D Object Detection

CVPR 2024
0
citations

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

CVPR 2024
0
citations

Object Recognition as Next Token Prediction

CVPR 2024
0
citations

One-Shot Domain Adaptation for Face Generation

CVPR 2020 | arXiv
0
citations

Intentonomy: A Dataset and Study Towards Human Intent Understanding

CVPR 2021 | arXiv
0
citations

Efficient Object Embedding for Spliced Image Retrieval

CVPR 2021 | arXiv
0
citations

On Feature Normalization and Data Augmentation

CVPR 2021 | arXiv
0
citations

ObjectFormer for Image Manipulation Detection and Localization

CVPR 2022 | arXiv
0
citations

AdaViT: Adaptive Vision Transformers for Efficient Image Recognition

CVPR 2022 | arXiv
0
citations

Towards Scalable Neural Representation for Diverse Videos

CVPR 2023 | arXiv
0
citations

Open Vocabulary Semantic Segmentation With Patch Aligned Contrastive Learning

CVPR 2023 | arXiv
0
citations

TIPI: Test Time Adaptation With Transformation Invariance

CVPR 2023
0
citations

Detecting Everything in the Open World: Towards Universal Object Detection

CVPR 2023 | arXiv
0
citations

Computationally Budgeted Continual Learning: What Does Matter?

CVPR 2023 | arXiv
0
citations

HNeRV: A Hybrid Neural Representation for Videos

CVPR 2023 | arXiv
0
citations

Exploring Visual Engagement Signals for Representation Learning

ICCV 2021 | arXiv
0
citations

Joint Audio-Visual Deepfake Detection

ICCV 2021
0
citations

Deep Co-Training With Task Decomposition for Semi-Supervised Domain Adaptation

ICCV 2021 | arXiv
0
citations

Robustness and Generalization via Generative Adversarial Training

ICCV 2021 | arXiv
0
citations

Open-vocabulary Panoptic Segmentation with Embedding Modulation

ICCV 2023 | arXiv
0
citations

BT^2: Backward-compatible Training with Basis Transformation

ICCV 2023
0
citations

Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right?

ICCV 2023 | arXiv
0
citations

Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors

ECCV 2020
0
citations

Quantization Guided JPEG Artifact Correction

ECCV 2020
0
citations

Curriculum Manager for Source Selection in Multi-Source Domain Adaptation

ECCV 2020
0
citations

A Metric Learning Reality Check

ECCV 2020
0
citations

What makes fake images detectable? Understanding properties that generalize

ECCV 2020
0
citations

Teaching with Soft Label Smoothing for Mitigating Noisy Labels in Facial Expressions

ECCV 2022
0
citations

Totems: Physical Objects for Verifying Visual Integrity

ECCV 2022
0
citations

MTFormer: Multi-task Learning via Transformer and Cross-Task Reasoning

ECCV 2022
0
citations

Visual Prompt Tuning

ECCV 2022
0
citations

Object-Centric Unsupervised Image Captioning

ECCV 2022
0
citations