Om AI Lab

company
https://github.com/om-ai-lab
OmAI_lab
om-ai-lab

AI & ML interests

Multimodal AI, VLM, VLA, VAM, etc.

Recent Activity

Heting updated a model about 14 hours ago
omlab/OmTrackVLA-0.6B
P3ngLiu updated a collection about 14 hours ago
OmDet-Turbo-Models
P3ngLiu updated a model about 16 hours ago
omlab/VLM-FO1-3B-v01

Papers

Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models

VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs


Articles

Trials, Errors, and Breakthroughs: Our Rocky Road to OVD SOTA with Reinforcement Learning

Mar 25, 2025

Improving Object Detection through Reinforcement Learning with VLM-R1

Mar 25, 2025

Members: Tony Zhao, Peng Liu, Zilun, Kyusong Lee, Qianqian, Ying
omlab's papers (2)

Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models
Submitted by Tony Zhao

VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs
Submitted by Tony Zhao