Paper: Multimodal Autoregressive Pre-training of Large Vision Encoders (2411.14402)
timm-compatible AIM-v2 image encoder weights (paper: https://huggingface.co/papers/2411.14402), sourced from https://huggingface.co/apple/aimv2-huge-patch14-336.
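
As a rough illustration of what the "patch14-336" part of the source name implies, the sketch below computes the patch grid, assuming the usual reading of that name (14-pixel square patches on a 336x336 input). It also shows one way such weights might be loaded through `timm.create_model`. The `load_encoder` helper and its `model_name` argument are hypothetical; the exact timm identifier is not stated here and should be taken from the model page.

```python
def patch_grid(image_size: int = 336, patch_size: int = 14) -> tuple[int, int]:
    """Return (patches per side, total patch tokens) for a square input.

    Defaults are an assumption read off the name "aimv2-huge-patch14-336".
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


def load_encoder(model_name: str):
    """Load a timm image encoder without a classification head.

    model_name is a placeholder: use the identifier given on the model page.
    """
    import timm  # imported lazily so patch_grid works without timm installed

    # num_classes=0 asks timm to build the model without a classifier head.
    model = timm.create_model(model_name, pretrained=True, num_classes=0)
    return model.eval()


if __name__ == "__main__":
    side, tokens = patch_grid()
    print(side, tokens)  # 24 576
```

Under these assumptions the encoder sees a 24x24 grid, that is 576 patch tokens per image, not counting any extra tokens the architecture may add.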