Om AI Lab

company

https://github.com/om-ai-lab

AI & ML interests

Multimodal AI, VLM, VLA, VAM, etc.

Recent Activity

Heting updated a model about 14 hours ago: omlab/OmTrackVLA-0.6B

P3ngLiu updated a collection about 14 hours ago: OmDet-Turbo-Models

P3ngLiu updated a model about 16 hours ago: omlab/VLM-FO1-3B-v01

Papers

Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models

VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs

Articles

Trials, Errors, and Breakthroughs: Our Rocky Road to OVD SOTA with Reinforcement Learning

Improving Object Detection through Reinforcement Learning with VLM-R1

omlab's papers (2)

Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models (submitted by Tony Zhao)

VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs (submitted by Tony Zhao)