---
license: apache-2.0
language:
- en
- zh
- ru
- es
- fr
- de
- ar
- nl
- vi
- hi
- ko
- ja
- it
- id
- pt
- pl
- tr
- da
- th
- sv
- fa
- uk
- cs
- 'no'
- el
- ca
- ro
- fi
- bg
- tl
- gl
- my
- hy
- km
- ne
- hu
- eu
- he
- lo
- sw
- az
- lv
- si
- sk
- tg
- et
- lt
- ms
- hr
- is
- sl
- sr
- ur
- bn
- af
- ta
- ka
- te
- ml
- mn
- nn
- kk
- cy
- mr
- sq
- nb
- mk
- jv
- kn
- eo
- la
- gu
- uz
- am
- oc
- be
- mg
- vo
- pa
- lb
- ht
- br
- ga
- xh
- tt
- bs
- yo
base_model:
- codefuse-ai/F2LLM-v2-0.6B-Preview-Pruned-80M
pipeline_tag: feature-extraction
library_name: transformers
tags:
- sentence-transformers
datasets:
- codefuse-ai/F2LLM-v2
---

# F2LLM-v2-80M

F2LLM-v2 is a family of general-purpose, multilingual embedding models in 8 sizes ranging from 80M to 14B parameters. Trained on a curated mix of 60 million high-quality, publicly available training examples, F2LLM-v2 supports more than 200 languages, with a particular emphasis on previously underserved mid- and low-resource languages.

F2LLM-v2 is fully open: we release base models in 5 sizes, instruct models in 8 sizes, the training data, the training code, and intermediate checkpoints. The three smallest instruct models are pruned from the 0.6B base model and further trained.

| Model | Base                                                                                 | Instruct                                                            |
| ----- | ------------------------------------------------------------------------------------ | ------------------------------------------------------------------- |
| 80M   |                                                                                      | [🤗F2LLM-v2-80M](https://huggingface.co/codefuse-ai/F2LLM-v2-80M)   |
| 160M  |                                                                                      | [🤗F2LLM-v2-160M](https://huggingface.co/codefuse-ai/F2LLM-v2-160M) |
| 330M  |                                                                                      | [🤗F2LLM-v2-330M](https://huggingface.co/codefuse-ai/F2LLM-v2-330M) |
| 0.6B  | [🤗F2LLM-v2-0.6B-Preview](https://huggingface.co/codefuse-ai/F2LLM-v2-0.6B-Preview)  | [🤗F2LLM-v2-0.6B](https://huggingface.co/codefuse-ai/F2LLM-v2-0.6B) |
| 1.7B  | [🤗F2LLM-v2-1.7B-Preview](https://huggingface.co/codefuse-ai/F2LLM-v2-1.7B-Preview)  | [🤗F2LLM-v2-1.7B](https://huggingface.co/codefuse-ai/F2LLM-v2-1.7B) |
| 4B    | [🤗F2LLM-v2-4B-Preview](https://huggingface.co/codefuse-ai/F2LLM-v2-4B-Preview)      | [🤗F2LLM-v2-4B](https://huggingface.co/codefuse-ai/F2LLM-v2-4B)     |
| 8B    | [🤗F2LLM-v2-8B-Preview](https://huggingface.co/codefuse-ai/F2LLM-v2-8B-Preview)      | [🤗F2LLM-v2-8B](https://huggingface.co/codefuse-ai/F2LLM-v2-8B)     |
| 14B   | [🤗F2LLM-v2-14B-Preview](https://huggingface.co/codefuse-ai/F2LLM-v2-14B-Preview)    | [🤗F2LLM-v2-14B](https://huggingface.co/codefuse-ai/F2LLM-v2-14B)   |

## Usage

### With Sentence Transformers

To encode text with the [Sentence Transformers](https://www.sbert.net/) library:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("codefuse-ai/F2LLM-v2-80M", device="cuda:0", model_kwargs={"torch_dtype": "bfloat16"})

# Some sample query and documents
query = "What is F2LLM used for?"
documents = [
    'We present F2LLM, a family of fully open embedding LLMs that achieve a strong balance between model size, training data, and embedding performance.',
    'F2LLM is a model for computing text embeddings that can be used for various NLP tasks such as information retrieval, semantic search, and text classification.',
    'F2LLM 是 CodeFuse 开源的系列嵌入模型。',
    'F2LLM — это модель вычисления встраивания текста, которую можно использовать для различных задач НЛП, таких как поиск информации, семантический поиск и классификация текста.'
]

# Encode the query and documents separately; encode_query applies the model's query prompt
query_embedding = model.encode_query(query)
document_embeddings = model.encode_document(documents)
print(query_embedding.shape, document_embeddings.shape)
# (320,) (4, 320)

# Compute cosine similarity between the query and documents
similarity = model.similarity(query_embedding, document_embeddings)
print(similarity)
# tensor([[0.6968, 0.7818, 0.7165, 0.8374]])
```

### With Transformers

Or directly with the [Transformers](https://huggingface.co/docs/transformers/index) library:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_path = "codefuse-ai/F2LLM-v2-80M"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModel.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map={'': 0})

query = "What is F2LLM used for?"
query_prompt = "Instruct: Given a question, retrieve passages that can help answer the question.\nQuery: "
documents = [
    'We present F2LLM, a family of fully open embedding LLMs that achieve a strong balance between model size, training data, and embedding performance.',
    'F2LLM is a model for computing text embeddings that can be used for various NLP tasks such as information retrieval, semantic search, and text classification.',
    'F2LLM 是 CodeFuse 开源的系列嵌入模型。',
    'F2LLM — это модель вычисления встраивания текста, которую можно использовать для различных задач НЛП, таких как поиск информации, семантический поиск и классификация текста.'
]

def encode(sentences):
    batch_size = len(sentences)
    # The tokenizer automatically appends an EOS token to each sentence
    tokenized_inputs = tokenizer(sentences, padding=True, return_tensors='pt').to(model.device)
    last_hidden_state = model(**tokenized_inputs).last_hidden_state
    # Position of the last non-padding token (the EOS token) in each sequence
    eos_positions = tokenized_inputs.attention_mask.sum(dim=1) - 1
    embeddings = last_hidden_state[torch.arange(batch_size, device=model.device), eos_positions]
    embeddings = F.normalize(embeddings, p=2, dim=1)
    return embeddings

# Encode the query and documents
query_embedding = encode([query_prompt + query])
document_embeddings = encode(documents)
print(query_embedding.shape, document_embeddings.shape)
# torch.Size([1, 320]) torch.Size([4, 320])

# Compute cosine similarity between the query and documents
similarity = query_embedding @ document_embeddings.T
print(similarity)
# tensor([[0.6914, 0.7812, 0.7148, 0.8359]], device='cuda:0',
#        dtype=torch.bfloat16, grad_fn=<MmBackward0>)
```
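The EOS pooling step inside `encode` above can be sanity-checked in isolation. A minimal sketch with made-up hidden states and a right-padded attention mask (no model required; the tensor values are arbitrary):

```python
import torch
import torch.nn.functional as F

# Dummy "hidden states" for a batch of 2 sequences, max length 4, hidden dim 3
last_hidden_state = torch.arange(24, dtype=torch.float32).reshape(2, 4, 3)
# Sequence 0 has 2 real tokens; sequence 1 has 4 (right padding)
attention_mask = torch.tensor([[1, 1, 0, 0], [1, 1, 1, 1]])

# Index of the last real (EOS) token in each sequence
eos_positions = attention_mask.sum(dim=1) - 1
# Gather the hidden state at the EOS position of each sequence, then L2-normalize
embeddings = last_hidden_state[torch.arange(2), eos_positions]
embeddings = F.normalize(embeddings, p=2, dim=1)

print(eos_positions.tolist())  # [1, 3]
print(embeddings.shape)        # torch.Size([2, 3])
```

Note that with right padding the EOS position differs per sequence, which is why the indexing uses the attention mask rather than a fixed `-1` index.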

## Intermediate Checkpoints

To facilitate future research, we release intermediate checkpoints in the `intermediate_checkpoints` branch.
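Assuming standard Hugging Face Hub branch handling, an intermediate checkpoint should be loadable by passing the branch name as the `revision` argument (a sketch; the branch layout determines which checkpoint you get):

```python
from transformers import AutoModel

# Load weights from the intermediate_checkpoints branch instead of main;
# the branch name goes into the standard `revision` argument of from_pretrained.
model = AutoModel.from_pretrained(
    "codefuse-ai/F2LLM-v2-80M",
    revision="intermediate_checkpoints",
)
```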

## Citation

```
@misc{f2llm-v2,
      title={F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World},
      author={Ziyin Zhang and Zihan Liao and Hang Yu and Peng Di and Rui Wang},
      year={2026},
      eprint={2603.19223},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.19223},
}
```