This model is at an intermediate training stage, but you're welcome to try it out.

Training Details

  • Initialization: Initialized from Qwen/Qwen3-Embedding-0.6B.
  • Architecture Adjustments: The number of decoder layers was reduced to 10. The reduction approach will be detailed in a blog post once we have a stable model.
  • Teacher Model: Qwen3-Embedding-0.6B
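
Since the model is initialized from a decoder-only embedding backbone (Qwen3-Embedding-0.6B), embeddings are typically produced by last-token pooling: take the hidden state at the last non-padding position of each sequence, then L2-normalize so cosine similarity reduces to a dot product. The sketch below illustrates that pooling step on dummy tensors; it is an assumption about the pooling scheme based on the Qwen3-Embedding family, not something stated in this card.

```python
import numpy as np

def last_token_pool(last_hidden, attention_mask):
    """Pool the hidden state of the last non-padding token and L2-normalize.

    last_hidden: float array of shape (batch, seq_len, hidden_dim)
    attention_mask: 0/1 array of shape (batch, seq_len), right-padded
    """
    # index of the last non-padding token in each sequence
    idx = attention_mask.sum(axis=1) - 1
    pooled = last_hidden[np.arange(last_hidden.shape[0]), idx]
    # normalize so cosine similarity is just a dot product
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

# toy example: batch of 2, sequence length 4, hidden dim 3
hidden = np.arange(24, dtype=np.float64).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 0],   # first sequence has one padding token
                 [1, 1, 1, 1]])
emb = last_token_pool(hidden, mask)
print(emb.shape)  # (2, 3)
```

With a real checkpoint, `last_hidden` would come from the model's final hidden states and `attention_mask` from the tokenizer; the pooling logic is unchanged.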

Evaluation

MTEB (Eng v2)

| Model | Param. | Mean (Task) | Mean (Type) | Class. | Clust. | Pair Class. | Rerank. | Retri. | STS | Summ. |
|---|---|---|---|---|---|---|---|---|---|---|
| multilingual-e5-large-instruct | 0.6B | 65.53 | 61.21 | 75.54 | 49.89 | 86.24 | 48.74 | 53.47 | 84.72 | 29.89 |
| NV-Embed-v2 | 7.8B | 69.81 | 65.00 | 87.19 | 47.66 | 88.69 | 49.61 | 62.84 | 83.82 | 35.21 |
| GritLM-7B | 7.2B | 67.07 | 63.22 | 81.25 | 50.82 | 87.29 | 49.59 | 54.95 | 83.03 | 35.65 |
| gte-Qwen2-1.5B-instruct | 1.5B | 67.20 | 63.26 | 85.84 | 53.54 | 87.52 | 49.25 | 50.25 | 82.51 | 33.94 |
| stella_en_1.5B_v5 | 1.5B | 69.43 | 65.32 | 89.38 | 57.06 | 88.02 | 50.19 | 52.42 | 83.27 | 36.91 |
| gte-Qwen2-7B-instruct | 7.6B | 70.72 | 65.77 | 88.52 | 58.97 | 85.90 | 50.47 | 58.09 | 82.69 | 35.74 |
| gemini-embedding-exp-03-07 | - | 73.30 | 67.67 | 90.05 | 59.39 | 87.70 | 48.59 | 64.35 | 85.29 | 38.28 |
| Qwen3-Embedding-0.6B (Teacher) | 0.6B | 70.70 | 64.88 | 85.76 | 54.05 | 84.37 | 48.18 | 61.83 | 86.57 | 33.43 |
| Tarka-AIR/Tarka-Embedding-300M-V1-Preview | 0.3B | 66.27 | 61.42 | 83.43 | 52.23 | 82.06 | 45.27 | 51.80 | 82.75 | 32.41 |