impresso-project/wiki_comparable_corpus_en_de_hi_it_ko_zh Viewer • Updated Feb 6 • 69.2k • 43 • 1
impresso-project/ner-stacked-bert-multilingual-v1.1.0 Token Classification • 42.1M • Updated Feb 5 • 14 • 2
Multilingual Named Entity Recognition Space • Running • Multilingual Named Entity Recognition in Historical Data
impresso-project/histlux_ocr_error_denoising_lrec Sentence Similarity • 0.3B • Updated Feb 3 • 18 • 1
impresso-project/OCR-robust-gte-multilingual-base Sentence Similarity • 0.3B • Updated Oct 23, 2025 • 72
impresso-project/histlux-paraphrase-multilingual-mpnet-base-v2 Sentence Similarity • 0.3B • Updated Jul 20, 2025 • 4
impresso-project/histlux-gte-multilingual-base Sentence Similarity • 0.3B • Updated Jul 20, 2025 • 26
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19, 2025 • 47
PARAPHRASUS : A Comprehensive Benchmark for Evaluating Paraphrase Detection Models Paper • 2409.12060 • Published Sep 18, 2024