Spaces: miyuiu/microbe-model
main
microbe-model/scripts
98.7 kB
2 contributors
History: 22 commits
Miyu Horiuchi
Add unified strain catalog (100K rows w/ provenance) + selective weak supervision for pH
4c18dfd
21 days ago
01_fetch_bacdive.py
2.59 kB
Rewrite BacDive client for v2 public API (no auth required)
about 1 month ago
02_fetch_and_featurize.py
3.1 kB
Streaming fetch+featurize pipeline + 6× pyrodigal speedup + GCA version resolution
30 days ago
03_train_baseline.py
4.53 kB
Add MediaDive-derived features (medium pH, NaCl, n_media): all 4 targets improve
21 days ago
04_eval.py
947 Bytes
Eval report enhancements: TL;DR + per-strain predictions + per-family error
30 days ago
05_overnight_summary.py
7.45 kB
Final cleanup: sync OVERNIGHT_SUMMARY.md + fix size display for small files
29 days ago
06_fetch_gtdb_candidates.py
5.19 kB
Fix GTDB column names + accession resolution for v226 metadata schema
29 days ago
07_predict_uncultured.py
5.99 kB
Phase C scaffolding: GTDB candidate selection + uncultured prediction
29 days ago
08_extract_strain_media.py
1.31 kB
Phase E scaffolding: MediaDive integration + strain→medium links
29 days ago
09_fetch_media_recipes.py
2.89 kB
Phase E scaffolding: MediaDive integration + strain→medium links
29 days ago
10_train_media_recommender.py
2.72 kB
Phase E #2: scripts/recommend.py, single-genome → ranked media + phenotype CLI
29 days ago
11_extract_embeddings.py
5.58 kB
v2 scaffolding: ESM-2 embedding extraction + GPU runner doc
29 days ago
12_train_with_embeddings.py
2.16 kB
v2 scaffolding: ESM-2 embedding extraction + GPU runner doc
29 days ago
13_compare_v1_v2.py
3.28 kB
v2 results: ESM-2 t6 (8M, 20-protein sample) loses to v1 hand-crafted features on all 4 phenotype targets
27 days ago
14_train_combined.py
3.72 kB
Expand training corpus to 46K strains: species-name → NCBI genome + isolation features
21 days ago
15_train_phenotype_heads.py
4.63 kB
Add unified strain catalog (100K rows w/ provenance) + selective weak supervision for pH
21 days ago
16_score_uncultured_media.py
3.44 kB
UI prep: pre-train phenotype heads + pre-score uncultured media
27 days ago
17_reextract_phenotypes.py
2.25 kB
Fix _derive_salt to pick optimum entries: salt MAE 2.52 → 2.17 (-13.7%)
21 days ago
18_resolve_species_to_genome.py
4.22 kB
Expand training corpus to 46K strains: species-name → NCBI genome + isolation features
21 days ago
19_featurize_resolved.py
4.18 kB
Expand training corpus to 46K strains: species-name → NCBI genome + isolation features
21 days ago
20_build_mediadive_features.py
3.51 kB
Add MediaDive-derived features (medium pH, NaCl, n_media): all 4 targets improve
21 days ago
21_build_strain_catalog.py
4.32 kB
Add unified strain catalog (100K rows w/ provenance) + selective weak supervision for pH
21 days ago
22_train_with_weak_labels.py
3.34 kB
Add unified strain catalog (100K rows w/ provenance) + selective weak supervision for pH
21 days ago
23_weak_label_apples_to_apples.py
5.19 kB
Add unified strain catalog (100K rows w/ provenance) + selective weak supervision for pH
21 days ago
recommend.py
11.2 kB
UI prep: pre-train phenotype heads + pre-score uncultured media
27 days ago
run_train_and_eval.sh
1.06 kB
Harden post-featurize chain: each phase runs even if previous fails
30 days ago