ONE-PEACE Multimodal Retrieval
Retrieve images using audio, text, or both
Pretraining, Multimodality, NLP, CV, etc.
Welcome to OFA-Sys! We hope you enjoy our multimodal models and the related Spaces!
We aim to build a unified multimodal, multitask AI system. Toward this goal, we have recently developed a series of multimodal pretrained models, such as OFA, Chinese-CLIP, and M6. Notably, OFA is a step toward "One For All": it is a unified multimodal pretrained model that transfers effectively to a number of downstream tasks (SOTA performance in 2022). For more information, feel free to visit our GitHub repo: https://github.com/OFA-Sys
Under this organization, we provide demos of OFA through HF Spaces, as well as versions of our models adapted for HF Transformers.
Answer questions about images