Yi Jiang
26 papers, 564 total citations
Papers (26)
Infinity∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis | CVPR 2025 | arXiv | 189 citations
TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation | CVPR 2025 | arXiv | 120 citations
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models | ECCV 2024 | arXiv | 107 citations
General Object Foundation Model for Images and Videos at Scale | CVPR 2024 | arXiv | 79 citations
Goku: Flow Based Video Generative Foundation Models | CVPR 2025 | arXiv | 53 citations
Enhancing Adversarial Transferability with Adversarial Weight Tuning | AAAI 2025 | arXiv | 8 citations
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation | NeurIPS 2025 | 5 citations
SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World | ICCV 2025 | arXiv | 3 citations
A Unified Environmental Network for Pedestrian Trajectory Prediction | AAAI 2024 | 0 citations
Generative Region-Language Pretraining for Open-Ended Object Detection | CVPR 2024 | arXiv | 0 citations
Learning to Segment the Tail | CVPR 2020 | arXiv | 0 citations
Sparse R-CNN: End-to-End Object Detection With Learnable Proposals | CVPR 2021 | 0 citations
Language As Queries for Referring Video Object Segmentation | CVPR 2022 | arXiv | 0 citations
DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion | CVPR 2022 | arXiv | 0 citations
InstMove: Instance Motion for Object-Centric Video Segmentation | CVPR 2023 | arXiv | 0 citations
Universal Instance Perception As Object Discovery and Retrieval | CVPR 2023 | arXiv | 0 citations
EGC: Image Generation and Classification via a Diffusion Energy-Based Model | ICCV 2023 | arXiv | 0 citations
Segment Every Reference Object in Spatial and Temporal Spaces | ICCV 2023 | 0 citations
Exploring Transformers for Open-world Instance Segmentation | ICCV 2023 | arXiv | 0 citations
Towards Grand Unification of Object Tracking | ECCV 2022 | 0 citations
ByteTrack: Multi-Object Tracking by Associating Every Detection Box | ECCV 2022 | 0 citations
SeqFormer: Sequential Transformer for Video Instance Segmentation | ECCV 2022 | 0 citations
In Defense of Online Models for Video Instance Segmentation | ECCV 2022 | 0 citations
Multimodal Transformer with Variable-Length Memory for Vision-and-Language Navigation | ECCV 2022 | 0 citations
Rethinking Resolution in the Context of Efficient Video Recognition | NeurIPS 2022 | arXiv | 0 citations
CoDet: Co-occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection | NeurIPS 2023 | arXiv | 0 citations