Fan Ma

Papers: 15

Total Citations: 200

Papers (15)

MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval

LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels

Autonomous LLM-Enhanced Adversarial Attack for Text-to-Motion

Clustering for Protein Representation Learning

Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy

Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models

Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion

VISTA-LLAMA: Reducing Hallucination in Video Language Models via Equal Distance to Visual Tokens

From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment

InfiniDreamer: Arbitrarily Long Human Motion Generation via Segment Score Distillation

BrainGuard: Privacy-Preserving Multisubject Image Reconstructions from Brain Activities

Stitching Segments and Sentences towards Generalization in Video-Text Pre-training

CapHuman: Capture Your Moments in Parallel Universes

Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity