mena-open-data's Collections
Arabic NLP datasets
lightonai/nanobeir-multilingual
Viewer • 522k • 422 • 11
Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class
Viewer • 981k • 365 • 2
malaysia-ai/Multilingual-TTS
Viewer • 34.2M • 1.11k • 13
opendatalab/WanJuanSiLu-Multimodal-5Languages
Preview • 81 • 3
LLaMAX/BenchMAX_Function_Completion
Viewer • 2.79k • 1.05k • 1
MLCommons/ml_spoken_words
1.93k • 34
Twitter/HashtagPrediction
Viewer • 1.07M • 138 • 2
vg055/SemEval2025_Task11_TrackA
Viewer • 2k • 6
sarulab-speech/commonvoice22_sidon
Viewer • 15.1M • 1.45k • 15
ToxicityPrompts/PolyGuardMix
Viewer • 1.91M • 207 • 4
linagora/linto-dataset-audio-ar-tn
Viewer • 37.3k • 1.36k • 13
fr3on/election-questions-arabic
Viewer • 1.49k • 44
papluca/language-identification
Viewer • 90k • 3.42k • 61
vincentkoc/tiny_qa_benchmark_pp
Viewer • 662 • 505 • 2
s-nlp/EverGreen-Multilingual
Viewer • 4.76k • 53 • 1
camel-ai/ai_society_translated
Preview • 104 • 16
LLaMAX/BenchMAX_Problem_Solving
Viewer • 12.1k • 271 • 1
alexandrainst/multi-wiki-qa
Viewer • 1.22M • 1.09k • 21
SaiedAlshahrani/Moroccan_Arabic_Wikipedia_20230101_nobots
Viewer • 4.68k • 32 • 3
Melaraby/EvArEST-dataset-for-Arabic-scene-text-recognition
Viewer • 296k • 84
mozilla-foundation/common_voice_17_0
2.43k • 2
suchirsalhan/Phonemized-UD
Viewer • 1.19M • 1.17k
LLMXperts/Arabic-NLi-Triplet
Viewer • 571k • 24
adithya7/xlel_wd_dictionary
Viewer • 230k • 860 • 3
SaiedAlshahrani/Detect-Egyptian-Wikipedia-Articles
Viewer • 756k • 743 • 1
Omartificial-Intelligence-Space/Arabic-NLi-Pair
Viewer • 328k • 47 • 4
aida-ugent/llm-ideology-analysis
Viewer • 315k • 469 • 4
tellarin-ai/ntx_llm_instructions
Viewer • 5.98k • 116
UBC-NLP/nilechat-arabizi-mor
Viewer • 1.45M • 19 • 2
CohereLabs/include-lite-44
Viewer • 10.8k • 620 • 14
JQL-AI/JQL-Human-Edu-Annotations
Viewer • 20.4k • 466 • 5
CohereLabs/fusion-pairwise-evals-finetuned
Viewer • 5.25k • 9
faisaltareque/XL-HeadTags
Viewer • 415k • 34 • 3
CohereLabs/fusion-synth-data-ufb
Viewer • 94.7k • 33 • 1
QCRI/AraDICE-ArabicMMLU-egy
Viewer • 14.5k • 1.45k • 1
ClusterlabAi/101_billion_arabic_words_dataset
Viewer • 33.1M • 909 • 70
omar-emad/financesecondtrial
Viewer • 30 • 8
CohereLabs/deja-vu-pairwise-evals
24 • 3
kaust-generative-ai/fineweb-edu-ar
Viewer • 363M • 87 • 13
UBC-NLP/nilechat-arabizi-egy
Viewer • 572k • 22
KFUPM-JRCAI/arabic-generated-abstracts
Viewer • 8.39k • 473
badrex/ALDi-predictions-MADIS5
Viewer • 263 • 3
CohereLabs/include-base-44
Viewer • 23k • 6.06k • 43
CohereLabs/m-ArenaHard-v2.0
Viewer • 11.5k • 237 • 5
ToxicityPrompts/PolyGuardPrompts
Viewer • 29.3k • 109 • 2
SaiedAlshahrani/Egyptian_Arabic_Wikipedia_20230101
Viewer • 728k • 56 • 4
QCRI/AraDICE-ArabicMMLU-lev
Viewer • 14.5k • 1.41k
CohereLabsCommunity/afri-aya
Viewer • 2.47k • 174 • 11
Omar-youssef/Egyptian-text-summarization
Viewer • 3.69k • 22
jonathanmutal/Medical-Questionnaire-Multilingual-Translation
Preview • 10
CohereLabs/Global-MMLU-Lite
Viewer • 10.9k • 5.68k • 28
MBZUAI/speecht5_tts_clartts_ar
Text-to-Speech • 1.75k • 25
LLaMAX/BenchMAX_General_Translation
Viewer • 228k • 421
abdullah-alamodi/aqeedah-rag-dataset
Viewer • 5.42k • 27 • 1
sboughorbel/arabic-web-edu-seed
Viewer • 236k • 72 • 3
amphora/Open-R1-Mulitlingual-SFT
Viewer • 128k • 59 • 3
SaiedAlshahrani/Moroccan_Arabic_Wikipedia_20230101_bots
Viewer • 5.4k • 29
brighter-dataset/BRIGHTER-emotion-intensities
Viewer • 41.2k • 297 • 4
LLaMAX/BenchMAX_Domain_Translation
Viewer • 47.3k • 261
LLaMAX/BenchMAX_Rule-based
Viewer • 7.29k • 465 • 2
ELYADATA & LIA at NADI 2025: ASR and ADI Subtasks
Paper • 2511.10090
Omar-youssef/islamic-qa-egyptian-arabic
Viewer • 7.47k • 23
alconost/alconost-multilingual-speech-en-ja-ar-pl-v1
Viewer • 280 • 36
LLaMAX/BenchMAX_Question_Answering
Viewer • 17 • 79
2A2I/Arabic-OpenHermes-2.5
Viewer • 982k • 277 • 20
FreedomIntelligence/ApolloMoEDataset
Viewer • 293k • 146 • 5
SaiedAlshahrani/Arabic_Wikipedia_20230101_bots
Viewer • 1.09M • 73 • 1
UBC-NLP/palmx_2025_subtask1_culture
Viewer • 4.5k • 84 • 1
UBC-NLP/nilechat-fw-edu-egy
Viewer • 5.52M • 25 • 2
LLaMAX/BenchMAX_Model-based
Viewer • 8.5k • 150
Raniahossam33/Arabic_cultural_dataset
Viewer • 12.1k • 5 • 2
visheratin/laion-coco-nllb
Viewer • 894k • 1.13k • 44
obadx/recitation-segmentation-augmented
Viewer • 64.6k • 125
rabah2026/Quran-Ayah-Corpus
Viewer • 263k • 909 • 1
omar-emad/FinanceTripletSecond
Viewer • 30 • 13
UBC-NLP/palmx_2025_subtask2_islamic
Viewer • 1.9k • 28
rubricreward/m-reward-bench
Viewer • 66k • 21
Fujitsu-FRE/MAPS_Verified
Viewer • 3.05k • 3.78k • 2
LLaMAX/BenchMAX_Multiple_Functions
Viewer • 5.41k • 116
Fumika/Wikinews-multilingual
Viewer • 15.2k • 64 • 7
Omartificial-Intelligence-Space/awesome_chatgpt_prompts_ar
Viewer • 201 • 30 • 1
mrlbenchmarks/global-piqa-nonparallel
Viewer • 11.6k • 2.55k • 31
NAMAA-Space/QariOCR-v0.3-markdown-mixed-dataset
Viewer • 37k • 162 • 10
m0pper/Small-Multilingual-Corpora
Viewer • 7.61M • 84
haoranxu/X-ALMA-Preference
Viewer • 772k • 163 • 6
SaiedAlshahrani/Arabic_Wikipedia_20230101_nobots
Viewer • 847k • 152 • 2
vgaraujov/semeval-2025-task11-track-c
Viewer • 57.3k • 105
brighter-dataset/BRIGHTER-emotion-categories
Viewer • 140k • 896 • 14
lukasellinger/homonym-mcl-wic
Viewer • 1.61k • 15
HeshamHaroon/Arabic_Function_Calling
Viewer • 50.8k • 229 • 56