Koustuv Sinha

Papers: 4

Total Citations: 46

Papers (4)

Scaling Language-Free Visual Representation Learning

VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning

Controlling Multimodal LLMs via Reward-guided Decoding

MetaMorph: Multimodal Understanding and Generation via Instruction Tuning