Miyu Horiuchi committed on
Commit 6b52ab8 · 1 Parent(s): 2ae21d7

Final cleanup: sync OVERNIGHT_SUMMARY.md + fix size display for small files

The size display showed "0.0 MB" for sub-100KB files (eval_report.md,
baseline_results.json, etc.). It now switches to KB for files under 100KB.

This is the final commit of the overnight run. Headline result lives in
artifacts/eval_report.md (regression MAE 3.17°C on optimal_temperature_c,
+43% over the always-mean baseline; classification F1 0.28 on
oxygen_requirement, +294% over majority).
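
The commit message does not say how the "+43%" and "+294%" figures were computed. As a minimal sketch, assuming the usual relative-change convention, the helper below (`relative_improvement` is a hypothetical name, not from the repo) shows why the sign has to flip between an error metric such as MAE, where lower is better, and a score such as F1, where higher is better. All numbers in the usage lines are made up for illustration and are not results from this run.

```python
def relative_improvement(model: float, baseline: float, higher_is_better: bool) -> float:
    """Relative gain of `model` over `baseline`, positive when the model is better.

    Assumed convention (not confirmed by the source):
      - scores such as F1:  (model - baseline) / |baseline|
      - errors such as MAE: (baseline - model) / |baseline|
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero for a relative comparison")
    delta = (model - baseline) if higher_is_better else (baseline - model)
    return delta / abs(baseline)


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the arithmetic.
    mae_gain = relative_improvement(model=6.0, baseline=10.0, higher_is_better=False)
    f1_gain = relative_improvement(model=0.25, baseline=0.10, higher_is_better=True)
    print(f"MAE: {mae_gain:+.0%}")
    print(f"F1:  {f1_gain:+.0%}")
```

Under this convention a gain above 100% is only possible for higher-is-better scores, since an error metric cannot drop by more than its baseline value.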

What to do when you wake up:
1. Read OVERNIGHT_SUMMARY.md for the high-level status
2. Read artifacts/eval_report.md for the full metrics + per-family breakdown
3. Skim git log --oneline for the full commit timeline
4. Rotate the NCBI API key (it appeared in chat earlier)
5. Stop caffeinate and revert any battery/display settings

OVERNIGHT_SUMMARY.md CHANGED
@@ -1,6 +1,6 @@
 # Overnight run — summary
 
-_Written 2026-04-27T02:16+00:00_
+_Written 2026-04-27T02:37+00:00_
 
 ## Pipeline status
 
@@ -22,14 +22,15 @@ _Written 2026-04-27T02:16+00:00_
 
 ## Files of interest
 
-- ✅ `artifacts/eval_report.md` — headline result + metrics 0.0 MB
-- ✅ `artifacts/baseline_results.json` — machine-readable per-fold scores 0.0 MB
+- ✅ `artifacts/eval_report.md` — headline result + metrics 9.6 KB
+- ✅ `artifacts/baseline_results.json` — machine-readable per-fold scores 8.8 KB
 - ✅ `data/bacdive_phenotypes.parquet` — phenotype labels (gitignored) 1.7 MB
 - ✅ `data/features.parquet` — extracted genome features (gitignored) 5.4 MB
 - ✅ `data/training_table.parquet` — merged + group-keyed table used for training (gitignored) 5.8 MB
 
 ## Commits since yesterday
 
+- 17518a3 Final overnight commit: trained baseline + eval report + summary
 - 72e12e7 Make OVERNIGHT_SUMMARY.md write atomic (avoid race with regen loop)
 - 2ea77d1 Add v1 composition features (tetranucleotides + codon usage)
 - 316196d Fix predictions parquet type mix + plumb feature_cols through eval
scripts/05_overnight_summary.py CHANGED
@@ -155,8 +155,12 @@ def main() -> None:
     for rel, desc in files_of_interest:
         path = ROOT / rel
         marker = "✅" if path.exists() else "—"
-        size_mb = f"{path.stat().st_size / 1e6:.1f} MB" if path.exists() else ""
-        lines.append(f"- {marker} `{rel}` — {desc} {size_mb}")
+        if path.exists():
+            size = path.stat().st_size
+            size_label = f"{size / 1e6:.1f} MB" if size >= 100_000 else f"{size / 1e3:.1f} KB"
+        else:
+            size_label = ""
+        lines.append(f"- {marker} `{rel}` — {desc} {size_label}")
     lines.append("")
 
     # Commits overnight
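
The size-label branch in this hunk can be exercised on its own. The sketch below pulls the same threshold and format strings into a standalone function; `format_size` is a hypothetical name (the script inlines this logic), and the byte counts in the usage loop are illustrative, not the real file sizes. Units are decimal, matching the `1e3` and `1e6` divisors in the diff.

```python
def format_size(size_bytes: int) -> str:
    """Return a one-decimal size label, mirroring the branch in the diff.

    Sizes of 100,000 bytes or more are shown in MB; anything smaller is
    shown in KB so that small files no longer render as "0.0 MB".
    """
    if size_bytes >= 100_000:
        return f"{size_bytes / 1e6:.1f} MB"
    return f"{size_bytes / 1e3:.1f} KB"


if __name__ == "__main__":
    # Illustrative byte counts only.
    for n in (8_800, 9_600, 100_000, 1_700_000):
        print(f"{n:>9} bytes -> {format_size(n)}")
```

One consequence of the 100,000-byte cutoff is that the smallest MB label is "0.1 MB", so "0.0 MB" can no longer appear for a file that exists.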