This model is at an intermediate training stage, but you're welcome to try it out.

Training Details

  • Initialization: Initialized from Qwen/Qwen3-Embedding-0.6B.
  • Architecture Adjustments: The number of decoder layers was reduced to 10. The reduction approach will be detailed in a blog post once we have a stable model.
  • Teacher Model: Qwen3-Embedding-0.6B
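
Since the model is initialized from a decoder-only embedding backbone (Qwen3-Embedding-0.6B), embeddings are typically produced by last-token pooling: take the hidden state at the last non-padding position of each sequence, then L2-normalize so cosine similarity reduces to a dot product. The sketch below illustrates that pooling step on dummy tensors; it is an assumption about the pooling scheme based on the Qwen3-Embedding family, not something stated in this card.

```python
import numpy as np

def last_token_pool(last_hidden, attention_mask):
    """Pool the hidden state of the last non-padding token and L2-normalize.

    last_hidden: float array of shape (batch, seq_len, hidden_dim)
    attention_mask: 0/1 array of shape (batch, seq_len), right-padded
    """
    # index of the last non-padding token in each sequence
    idx = attention_mask.sum(axis=1) - 1
    pooled = last_hidden[np.arange(last_hidden.shape[0]), idx]
    # normalize so cosine similarity is just a dot product
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

# toy example: batch of 2, sequence length 4, hidden dim 3
hidden = np.arange(24, dtype=np.float64).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 0],   # first sequence has one padding token
                 [1, 1, 1, 1]])
emb = last_token_pool(hidden, mask)
print(emb.shape)  # (2, 3)
```

With a real checkpoint, `last_hidden` would come from the model's final hidden states and `attention_mask` from the tokenizer; the pooling logic is unchanged.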

Evaluation

MTEB (Eng v2)

| Model | Param. | Mean (Task) | Mean (Type) | Class. | Clust. | Pair Class. | Rerank. | Retri. | STS | Summ. |
|---|---|---|---|---|---|---|---|---|---|---|
| multilingual-e5-large-instruct | 0.6B | 65.53 | 61.21 | 75.54 | 49.89 | 86.24 | 48.74 | 53.47 | 84.72 | 29.89 |
| NV-Embed-v2 | 7.8B | 69.81 | 65.00 | 87.19 | 47.66 | 88.69 | 49.61 | 62.84 | 83.82 | 35.21 |
| GritLM-7B | 7.2B | 67.07 | 63.22 | 81.25 | 50.82 | 87.29 | 49.59 | 54.95 | 83.03 | 35.65 |
| gte-Qwen2-1.5B-instruct | 1.5B | 67.20 | 63.26 | 85.84 | 53.54 | 87.52 | 49.25 | 50.25 | 82.51 | 33.94 |
| stella_en_1.5B_v5 | 1.5B | 69.43 | 65.32 | 89.38 | 57.06 | 88.02 | 50.19 | 52.42 | 83.27 | 36.91 |
| gte-Qwen2-7B-instruct | 7.6B | 70.72 | 65.77 | 88.52 | 58.97 | 85.90 | 50.47 | 58.09 | 82.69 | 35.74 |
| gemini-embedding-exp-03-07 | - | 73.30 | 67.67 | 90.05 | 59.39 | 87.70 | 48.59 | 64.35 | 85.29 | 38.28 |
| Qwen3-Embedding-0.6B (Teacher) | 0.6B | 70.70 | 64.88 | 85.76 | 54.05 | 84.37 | 48.18 | 61.83 | 86.57 | 33.43 |
| Tarka-AIR/Tarka-Embedding-300M-V1-Preview | 0.3B | 66.27 | 61.42 | 83.43 | 52.23 | 82.06 | 45.27 | 51.80 | 82.75 | 32.41 |