V-JEPA 2 ViT-L Cell Cycle Classification

This repo is the planned home for a V-JEPA 2 ViT-L model fine-tuned for cell cycle state prediction on the Cell Tracking Challenge (CTC) dataset Fluo-N2DH-GOWT1. Fine-tuning has not yet run; for now the repo contains only this model card.

For the trained baseline (U-Net + BiLSTM), see DnaRnaProteins/unet-bilstm-cell-cycle-baseline.

Plan when training lands

Single-channel microscopy input (3-channel patch embedding weights summed over the channel axis into a 1-channel patch embedding)
Encoder frozen for 10 warmup epochs, then the last 4 of the 24 ViT-L blocks unfrozen, lr 1e-5
Classification head: temporal mean-pool + MLP over patch tokens, 3-way output
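
The three plan items above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the training code for this repo (none exists yet). The function names, the toy tensor sizes, the `(embed_dim, channels, t, h, w)` kernel layout, and the exact head layout (mean over time, mean over patches, then a two-layer MLP with ReLU) are all hypothetical choices made for the example; only the numbers 10, 4, 24, and 3 come from the plan.

```python
import numpy as np


def collapse_to_single_channel(weight_rgb):
    """Sum a (D, 3, t, h, w) patch-embedding kernel over its channel axis.

    Returns a (D, 1, t, h, w) kernel that accepts single-channel input.
    """
    return weight_rgb.sum(axis=1, keepdims=True)


def embed_tubelet(weight, tubelet):
    """Project one (C, t, h, w) tubelet to a D-dim token (a conv at one location)."""
    return np.einsum("dcthw,cthw->d", weight, tubelet)


def trainable_blocks(epoch, warmup_epochs=10, num_blocks=24, num_unfrozen=4):
    """Indices of encoder blocks that are trainable at a 0-based epoch."""
    if epoch < warmup_epochs:
        return []  # warmup: encoder frozen, only the head is trained
    return list(range(num_blocks - num_unfrozen, num_blocks))


def head_logits(tokens, w1, b1, w2, b2):
    """Assumed head: mean over time, mean over patches, then a 2-layer MLP.

    tokens: (T, N, D) array of encoder outputs (T time steps, N patches each).
    """
    pooled = tokens.mean(axis=0).mean(axis=0)   # (D,)
    hidden = np.maximum(pooled @ w1 + b1, 0.0)  # ReLU, (H,)
    return hidden @ w2 + b2                     # (3,)


rng = np.random.default_rng(0)
D, T, P = 8, 2, 16  # toy sizes (assumption); embed dim kept tiny for readability

w_rgb = rng.normal(size=(D, 3, T, P, P))
w_gray = collapse_to_single_channel(w_rgb)

gray = rng.normal(size=(1, T, P, P))      # one single-channel tubelet
gray_as_rgb = np.repeat(gray, 3, axis=0)  # same tubelet copied to 3 channels

# The summed kernel on 1-channel input matches the original kernel applied
# to the same image replicated across 3 channels.
same = np.allclose(embed_tubelet(w_gray, gray), embed_tubelet(w_rgb, gray_as_rgb))

tokens = rng.normal(size=(4, 10, D))      # (T, N, D) dummy encoder output
logits = head_logits(
    tokens,
    rng.normal(size=(D, 16)), np.zeros(16),
    rng.normal(size=(16, 3)), np.zeros(3),
)
```

Summing the kernel over channels is equivalent to feeding the grayscale frame replicated across three channels, so the pretrained response is preserved while the input stays single-channel. With the defaults shown, `trainable_blocks` returns an empty list for epochs 0 through 9 and blocks 20 through 23 from epoch 10 onward.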

Base model reference

Bardes et al., "Revisiting Feature Prediction for Learning Visual Representations from Video," arXiv:2404.08471 (the paper that introduced the original V-JEPA). V-JEPA 2 released at ICLR 2026.

Paper for DnaRnaProteins/vjepa2-cell-cycle-vit-l

Revisiting Feature Prediction for Learning Visual Representations from Video (arXiv:2404.08471, published Feb 15, 2024)