Poster Papers by Oguzhan Fatih Kar
2 papers found
FlexTok: Resampling Images into 1D Token Sequences of Flexible Length
Roman Bachmann, Jesse Allardice, David Mizrahi et al.
ICML 2025, poster, arXiv:2502.13967
43 citations
BRAVE: Broadening the visual encoding of vision-language models
Oguzhan Fatih Kar, Alessio Tonioni, Petra Poklukar et al.
ECCV 2024, poster, arXiv:2404.07204