Spaces: miyuiu/microbe-model
Miyu Horiuchi committed about 1 month ago

Commit 6b52ab8 (1 parent: 2ae21d7)

Final cleanup: sync OVERNIGHT_SUMMARY.md + fix size display for small files

The size display showed "0.0 MB" for sub-100KB files (eval_report.md,
baseline_results.json, etc). Now switches to KB for files under 100KB.

This is the final commit of the overnight run. Headline result lives in
artifacts/eval_report.md (regression MAE 3.17°C on optimal_temperature_c,
+43% over the always-mean baseline; classification F1 0.28 on
oxygen_requirement, +294% over majority).
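
For readers who want to reproduce percentages like these, here is a minimal sketch of one common way to compute relative improvement over a baseline. It assumes the gain is measured as a fraction of the baseline score; the commit does not state which formula was used. The helper names (`mae`, `relative_improvement`) and the toy temperatures are hypothetical and are not taken from the repository or its eval report.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def relative_improvement(baseline, model, lower_is_better):
    """Improvement of `model` over `baseline`, as a fraction of the baseline.

    Assumption: the gain is normalised by the baseline score. For error
    metrics (MAE) lower is better; for scores such as F1 higher is better.
    """
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    delta = (baseline - model) if lower_is_better else (model - baseline)
    return delta / baseline


# Toy data (hypothetical, not from the eval report).
y_true = [20.0, 30.0, 40.0, 50.0]
y_model = [22.0, 29.0, 43.0, 48.0]
# Always-mean baseline; a real evaluation would take the mean from the
# training fold rather than from the evaluation targets.
y_mean = [mean(y_true)] * len(y_true)

baseline_mae = mae(y_true, y_mean)
model_mae = mae(y_true, y_model)
improvement = relative_improvement(baseline_mae, model_mae, lower_is_better=True)
print(f"baseline MAE {baseline_mae:.2f}, model MAE {model_mae:.2f}, "
      f"improvement {improvement:+.0%}")
```

The same helper covers the classification case by passing `lower_is_better=False` with the majority-class F1 as the baseline.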

What to do when you wake up:
1. Read OVERNIGHT_SUMMARY.md for the high-level status
2. Read artifacts/eval_report.md for the full metrics + per-family breakdown
3. Skim git log --oneline for the full commit timeline
4. Rotate the NCBI API key (it appeared in chat earlier)
5. Stop caffeinate and revert any battery/display settings

Files changed (2)

OVERNIGHT_SUMMARY.md +4 -3
scripts/05_overnight_summary.py +6 -2

OVERNIGHT_SUMMARY.md CHANGED

@@ -1,6 +1,6 @@
 # Overnight run — summary
-_Written 2026-04-27T02:16+00:00_
 ## Pipeline status
@@ -22,14 +22,15 @@ _Written 2026-04-27T02:16+00:00_
 ## Files of interest
-- ✅ `artifacts/eval_report.md` — headline result + metrics 0.0 MB
-- ✅ `artifacts/baseline_results.json` — machine-readable per-fold scores 0.0 MB
 - ✅ `data/bacdive_phenotypes.parquet` — phenotype labels (gitignored) 1.7 MB
 - ✅ `data/features.parquet` — extracted genome features (gitignored) 5.4 MB
 - ✅ `data/training_table.parquet` — merged + group-keyed table used for training (gitignored) 5.8 MB
 ## Commits since yesterday
 - 72e12e7 Make OVERNIGHT_SUMMARY.md write atomic (avoid race with regen loop)
 - 2ea77d1 Add v1 composition features (tetranucleotides + codon usage)
 - 316196d Fix predictions parquet type mix + plumb feature_cols through eval

 # Overnight run — summary
+_Written 2026-04-27T02:37+00:00_
 ## Pipeline status
 ## Files of interest
+- ✅ `artifacts/eval_report.md` — headline result + metrics 9.6 KB
+- ✅ `artifacts/baseline_results.json` — machine-readable per-fold scores 8.8 KB
 - ✅ `data/bacdive_phenotypes.parquet` — phenotype labels (gitignored) 1.7 MB
 - ✅ `data/features.parquet` — extracted genome features (gitignored) 5.4 MB
 - ✅ `data/training_table.parquet` — merged + group-keyed table used for training (gitignored) 5.8 MB
 ## Commits since yesterday
+- 17518a3 Final overnight commit: trained baseline + eval report + summary
 - 72e12e7 Make OVERNIGHT_SUMMARY.md write atomic (avoid race with regen loop)
 - 2ea77d1 Add v1 composition features (tetranucleotides + codon usage)
 - 316196d Fix predictions parquet type mix + plumb feature_cols through eval

scripts/05_overnight_summary.py CHANGED

@@ -155,8 +155,12 @@ def main() -> None:
     for rel, desc in files_of_interest:
         path = ROOT / rel
         marker = "✅" if path.exists() else "—"
-        size_mb = f"{path.stat().st_size / 1e6:.1f} MB" if path.exists() else ""
-        lines.append(f"- {marker} `{rel}` — {desc} {size_mb}")
     lines.append("")
     # Commits overnight

     for rel, desc in files_of_interest:
         path = ROOT / rel
         marker = "✅" if path.exists() else "—"
+        if path.exists():
+            size = path.stat().st_size
+            size_label = f"{size / 1e6:.1f} MB" if size >= 100_000 else f"{size / 1e3:.1f} KB"
+        else:
+            size_label = ""
+        lines.append(f"- {marker} `{rel}` — {desc} {size_label}")
     lines.append("")
     # Commits overnight
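
The unit-switching logic in the hunk above can be isolated into a small helper, which makes the threshold behaviour easier to check. This is a sketch only: `format_size` and its `kb_threshold` parameter are hypothetical names that do not appear in the commit, and the byte counts in the usage lines are illustrative values chosen to match the labels shown in the summary, not measured file sizes.

```python
def format_size(num_bytes: int, kb_threshold: int = 100_000) -> str:
    """Return a human-readable size label.

    Mirrors the rule in the diff: sizes at or above `kb_threshold` bytes
    are shown in MB, smaller sizes in KB (decimal units, one decimal place).
    """
    if num_bytes >= kb_threshold:
        return f"{num_bytes / 1e6:.1f} MB"
    return f"{num_bytes / 1e3:.1f} KB"


# Illustrative byte counts (hypothetical).
print(format_size(9_600))      # small file, shown in KB
print(format_size(1_700_000))  # large file, shown in MB
print(format_size(100_000))    # exactly at the threshold, shown in MB
```

One edge worth knowing about: a file of 99,999 bytes is still below the threshold, so it is labelled "100.0 KB" instead of "0.1 MB".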