01Yassine committed on
Commit
9f15275
verified
1 Parent(s): 8ad54b2

Update index.html

Files changed (1)
  1. index.html +60 -1
index.html CHANGED
@@ -285,7 +285,7 @@
285
  <strong>Note:</strong> no extra spaces, single CSV, no archives.
286
  </p>
287
 
288
- <h2>Evaluation Criteria</h2>
289
  <p>
290
  The Leaderboard is based on phoneme-level <strong>F1-score</strong>.
291
  We use a hierarchical evaluation (detection + diagnostic) per <a href="https://arxiv.org/pdf/2310.13974" target="_blank">MDD Overview</a>.
@@ -315,8 +315,67 @@
315
  <li>Recall = TR/(TR+FA)</li>
316
  <li>F1 = 2·P·R/(P+R)</li>
317
  </ul>
318
  </p>
 
319
 
320
  <h2>Suggested Research Directions</h2>
321
  <ol>
322
  <li>
 
285
  <strong>Note:</strong> no extra spaces, single CSV, no archives.
286
  </p>
287
 
288
+ <!-- <h2>Evaluation Criteria</h2>
289
  <p>
290
  The Leaderboard is based on phoneme-level <strong>F1-score</strong>.
291
  We use a hierarchical evaluation (detection + diagnostic) per <a href="https://arxiv.org/pdf/2310.13974" target="_blank">MDD Overview</a>.
 
315
  <li>Recall = TR/(TR+FA)</li>
316
  <li>F1 = 2·P·R/(P+R)</li>
317
  </ul>
318
+ </p> -->
319
+
320
+ <h2>Evaluation Criteria</h2>
321
+
322
+ <div style="background-color: #f0f8ff; border-left: 5px solid #007bff; padding: 15px; margin-bottom: 20px;">
323
+ <h3 style="margin-top: 0; color: #007bff;">🏆 Primary Metric</h3>
324
+ <p style="margin-bottom: 0;">
325
+ The Leaderboard is ranked primarily by the <strong>Phoneme-level F1-score</strong>.
326
+ While other metrics (FRR, FAR, DER) are computed for analysis, <strong>F1</strong> determines the final standing.
327
  </p>
328
+ </div>
329
 
330
+ <p>
331
+ We use a hierarchical evaluation strategy (detection + diagnostic) based on the
332
+ <a href="https://arxiv.org/pdf/2310.13974" target="_blank">MDD Overview</a> framework.
333
+ </p>
334
+
335
+ <h3>1. Input Definitions</h3>
336
+ <ul>
337
+ <li><strong>What is said:</strong> The annotated phoneme sequence.</li>
338
+ <li><strong>What is predicted:</strong> The output from your model.</li>
339
+ <li><strong>What should have been said:</strong> The reference (target) sequence.</li>
340
+ </ul>
341
+
342
+ <h3>2. Confusion Matrix Components</h3>
343
+ <p>From the inputs above, we compute the following counts:</p>
344
+ <table style="width: 100%; border-collapse: collapse; margin-bottom: 20px;">
345
+ <tr style="background-color: #f9f9f9; border-bottom: 1px solid #ddd;">
346
+ <td style="padding: 8px;"><strong>TA (True Accept)</strong></td>
347
+ <td style="padding: 8px;">Correct phonemes properly accepted.</td>
348
+ </tr>
349
+ <tr style="border-bottom: 1px solid #ddd;">
350
+ <td style="padding: 8px;"><strong>TR (True Reject)</strong></td>
351
+ <td style="padding: 8px;">Mispronunciations correctly detected.</td>
352
+ </tr>
353
+ <tr style="background-color: #f9f9f9; border-bottom: 1px solid #ddd;">
354
+ <td style="padding: 8px;"><strong>FR (False Reject)</strong></td>
355
+ <td style="padding: 8px;">Correct phonemes incorrectly flagged as errors.</td>
356
+ </tr>
357
+ <tr>
358
+ <td style="padding: 8px;"><strong>FA (False Accept)</strong></td>
359
+ <td style="padding: 8px;">Mispronunciations missed (labeled as correct).</td>
360
+ </tr>
361
+ </table>
362
+
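The four counts in the table above can be illustrated with a minimal Python sketch. It assumes the three sequences from "Input Definitions" are already aligned position by position (insertions and deletions are not handled), and the names `classify` and `count_outcomes` are hypothetical helpers, not part of the leaderboard's scoring code.

```python
def classify(canonical, annotated, predicted):
    """Label one aligned phoneme position as TA, TR, FR, or FA.

    canonical: what should have been said (reference/target)
    annotated: what was actually said (human annotation)
    predicted: what the model output
    """
    if annotated == canonical:
        # The speaker was correct: accepting is right, flagging is wrong.
        return "TA" if predicted == canonical else "FR"
    # The speaker mispronounced: predicting the canonical phoneme misses it.
    return "FA" if predicted == canonical else "TR"


def count_outcomes(triples):
    """Tally outcomes over (canonical, annotated, predicted) triples."""
    counts = {"TA": 0, "TR": 0, "FR": 0, "FA": 0}
    for canonical, annotated, predicted in triples:
        counts[classify(canonical, annotated, predicted)] += 1
    return counts


# Toy, made-up example: one position per outcome.
example = [
    ("DH", "DH", "DH"),  # correct, accepted         -> TA
    ("DH", "Z", "Z"),    # mispronounced, detected   -> TR
    ("AH", "AH", "AA"),  # correct, wrongly flagged  -> FR
    ("R", "L", "R"),     # mispronounced, missed     -> FA
]
counts = count_outcomes(example)
```

A real scorer would first align the sequences (for example with edit-distance alignment) before applying a rule like this one.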
363
+ <h3>3. Calculated Metrics</h3>
364
+
365
+ <h4>Detection Metrics (Leaderboard Ranking)</h4>
366
+ <ul>
367
+ <li><strong>Precision:</strong> TR / (TR + FR)</li>
368
+ <li><strong>Recall:</strong> TR / (TR + FA)</li>
369
+ <li><strong>F1-Score:</strong> 2 · (Precision · Recall) / (Precision + Recall)</li>
370
+ </ul>
371
+
372
+ <h4>Diagnostic Rates (Auxiliary)</h4>
373
+ <ul>
374
+ <li><strong>FRR (False Reject Rate):</strong> FR / (TA + FR)</li>
375
+ <li><strong>FAR (False Accept Rate):</strong> FA / (FA + TR)</li>
376
+ <li><strong>DER (Diagnostic Error Rate):</strong> DE / (CD + DE)</li>
377
+ </ul>
378
+
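As a rough sketch, the formulas added in this hunk can be computed from the raw counts as follows. The function names are hypothetical, returning 0.0 on an empty denominator is an assumption, and so is the reading of CD and DE (not defined in the page) as correct diagnoses and diagnostic errors among the true rejects.

```python
def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def mdd_metrics(ta, tr, fr, fa, cd=0, de=0):
    """Compute detection and diagnostic metrics from raw counts.

    cd / de are assumed to split the true rejects into correctly and
    incorrectly diagnosed mispronunciations.
    """
    precision = safe_div(tr, tr + fr)
    recall = safe_div(tr, tr + fa)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "frr": safe_div(fr, ta + fr),
        "far": safe_div(fa, fa + tr),
        "der": safe_div(de, cd + de),
    }


# Made-up counts for illustration only.
metrics = mdd_metrics(ta=80, tr=12, fr=4, fa=4, cd=9, de=3)
```

With these toy counts, precision, recall, and F1 all come out to 0.75, and FAR equals 1 minus recall, which follows directly from the two formulas sharing the denominator TR + FA.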
379
  <h2>Suggested Research Directions</h2>
380
  <ol>
381
  <li>