# HuggingFace Space Setup Guide - FBMC Chronos 2

**IMPORTANT**: This is Day 3, Hours 1-4 of the implementation plan. Complete all steps before proceeding to inference pipeline development.
## Prerequisites

- HuggingFace account: https://huggingface.co/join
- HuggingFace write token: https://huggingface.co/settings/tokens
- Git installed locally
- Project files ready at: `C:\Users\evgue\projects\fbmc_chronos2`
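Optionally, confirm the write token is valid before you start. This is a minimal sketch using the `huggingface_hub` client; the token value is a placeholder:

```python
# Optional sanity check: confirm the HF write token authenticates.
from huggingface_hub import whoami

info = whoami(token="hf_...")  # replace with your actual token
print(f"Authenticated as: {info['name']}")
```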
## STEP 1: Create HuggingFace Dataset Repository (10 min)

### 1.1 Create Dataset on HuggingFace Web UI

- Go to: https://huggingface.co/new-dataset
- Fill in:
  - Owner: YOUR_USERNAME
  - Dataset name: `fbmc-features-24month`
  - License: MIT
  - Visibility: Private (contains project data)
- Click "Create dataset"
### 1.2 Upload Data to Dataset

**Option A: Using the upload script (recommended)**

```bash
# 1. Add your HF token to .env file
echo "HF_TOKEN=hf_..." >> .env

# 2. Edit the script to replace YOUR_USERNAME with your actual HF username
#    Edit: scripts/upload_to_hf_datasets.py
#    Replace all instances of "YOUR_USERNAME" with your HuggingFace username

# 3. Install required packages
.venv\Scripts\uv.exe pip install datasets huggingface-hub

# 4. Run the upload script
.venv\Scripts\python.exe scripts\upload_to_hf_datasets.py
```
The script will upload:

- `features_unified_24month.parquet` (~25 MB)
- `metadata.csv` (2,553 features)
- `target_borders.txt` (38 target borders)
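If you need to recreate the upload script, the sketch below shows one way to do it with `huggingface_hub.HfApi.upload_file`. This is an assumed structure; the actual `scripts/upload_to_hf_datasets.py` may differ:

```python
# Sketch of an upload script (assumed structure; the real
# scripts/upload_to_hf_datasets.py may differ).
import os
from huggingface_hub import HfApi

api = HfApi(token=os.environ["HF_TOKEN"])
repo_id = "YOUR_USERNAME/fbmc-features-24month"  # replace YOUR_USERNAME

# (local path, filename in the dataset repo)
files = [
    ("data/processed/features_unified_24month.parquet", "features_unified_24month.parquet"),
    ("data/processed/features_unified_metadata.csv", "metadata.csv"),
    ("data/processed/target_borders_list.txt", "target_borders.txt"),
]

for local_path, repo_path in files:
    api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo=repo_path,
        repo_id=repo_id,
        repo_type="dataset",
    )
    print(f"Uploaded {repo_path}")
```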
**Option B: Manual upload via web UI**

- Go to: https://huggingface.co/datasets/YOUR_USERNAME/fbmc-features-24month
- Click "Files" tab → "Add file" → "Upload files"
- Upload:
  - `data/processed/features_unified_24month.parquet`
  - `data/processed/features_unified_metadata.csv` (rename to `metadata.csv`)
  - `data/processed/target_borders_list.txt` (rename to `target_borders.txt`)
### 1.3 Verify Dataset Uploaded

Visit: https://huggingface.co/datasets/YOUR_USERNAME/fbmc-features-24month

You should see:

- `features_unified_24month.parquet` (~25 MB)
- `metadata.csv` (~200 KB)
- `target_borders.txt` (~1 KB)
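You can also verify programmatically. A small sketch using `huggingface_hub.list_repo_files` (the repo ID assumes the name from step 1.1):

```python
# Programmatic check that all three files landed in the dataset repo.
import os
from huggingface_hub import list_repo_files

files = list_repo_files(
    "YOUR_USERNAME/fbmc-features-24month",  # replace YOUR_USERNAME
    repo_type="dataset",
    token=os.environ.get("HF_TOKEN"),  # needed for private repos
)
expected = {"features_unified_24month.parquet", "metadata.csv", "target_borders.txt"}
missing = expected - set(files)
print("All files present" if not missing else f"Missing: {missing}")
```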
## STEP 2: Create HuggingFace Space (15 min)

### 2.1 Create Space on HuggingFace Web UI

- Go to: https://huggingface.co/new-space
- Fill in:
  - Owner: YOUR_USERNAME
  - Space name: `fbmc-chronos2-forecast`
  - License: MIT
  - SDK: JupyterLab
  - Space hardware: Click "Advanced" → select A10G GPU (24 GB) ($30/month)
  - Visibility: Private (contains API keys)
- Click "Create Space"

**IMPORTANT**: The Space starts building immediately. The first build takes ~10-15 minutes.
### 2.2 Configure Space Secrets

While the Space is building, go to Space → Settings → Variables and Secrets and add these secrets (click "New secret"):

| Name | Value | Description |
|------|-------|-------------|
| `HF_TOKEN` | `hf_...` | Your HuggingFace write token |
| `ENTSOE_API_KEY` | `your_key` | ENTSO-E Transparency API key |

Click "Save".
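Inside the running Space, these secrets are exposed as environment variables, so code can read them directly:

```python
# Space secrets are injected as environment variables at runtime.
import os

hf_token = os.environ.get("HF_TOKEN")
entsoe_key = os.environ.get("ENTSOE_API_KEY")
assert hf_token and entsoe_key, "Secrets missing; check Settings → Variables and Secrets"
```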
### 2.3 Wait for Initial Build

- Monitor build logs: Space → Logs tab
- Wait for the message: "Your Space is up and running"
- The first build can take 10-15 minutes
## STEP 3: Clone Space Locally (5 min)

### 3.1 Clone the Space Repository

```bash
# Navigate to projects directory
cd C:\Users\evgue\projects

# Clone the Space (replace YOUR_USERNAME)
git clone https://huggingface.co/spaces/YOUR_USERNAME/fbmc-chronos2-forecast

# Navigate into Space directory
cd fbmc-chronos2-forecast
```
### 3.2 Copy Project Files to Space

```bash
# Copy source code
cp -r ../fbmc_chronos2/src ./

# Copy requirements (rename to requirements.txt)
cp ../fbmc_chronos2/hf_space_requirements.txt ./requirements.txt

# Copy .env.example (for documentation)
cp ../fbmc_chronos2/.env.example ./

# Create directories
mkdir -p data/evaluation
mkdir -p notebooks
mkdir -p tests
```
### 3.3 Create Space README.md

Create `README.md` in the Space directory with:

```markdown
---
title: FBMC Chronos 2 Forecast
emoji: ⚡
colorFrom: blue
colorTo: green
sdk: jupyterlab
sdk_version: "4.0.0"
app_file: app.py
pinned: false
license: mit
hardware: a10g-small
---

# FBMC Flow Forecasting - Zero-Shot Inference

Amazon Chronos 2 for cross-border capacity forecasting.

## Features

- 2,553 features (615 future covariates)
- 38 bidirectional border targets (19 physical borders)
- 8,192-hour context window
- Dynamic date-driven inference
- A10G GPU acceleration

## Quick Start

### Launch JupyterLab

1. Open this Space
2. Wait for build to complete (~10-15 min first time)
3. Click "Open in JupyterLab"

### Run Inference

See `notebooks/01_test_inference.ipynb` for examples.

## Data Source

- **Dataset**: [YOUR_USERNAME/fbmc-features-24month](https://huggingface.co/datasets/YOUR_USERNAME/fbmc-features-24month)
- **Size**: 25 MB (17,544 hours × 2,553 features)
- **Period**: Oct 2023 - Sept 2025

## Model

- **Chronos 2 Large** (710M parameters)
- **Pretrained**: amazon/chronos-t5-large
- **Zero-shot**: No fine-tuning in MVP

## Cost

- A10G GPU: $30/month
- Storage: <1 GB (free tier)
```
### 3.4 Push Initial Files to Space

```bash
# Stage files
git add README.md requirements.txt .env.example src/

# Commit
git commit -m "feat: initial Space setup with A10G GPU and source code"

# Push to HuggingFace
git push
```

**IMPORTANT**: After pushing, the Space will rebuild (~10-15 min). Monitor the build in the Logs tab.
## STEP 4: Test Space Environment (10 min)

### 4.1 Wait for Build to Complete

- Go to Space → Logs tab
- Wait for: "Your Space is up and running"
- If the build fails, check requirements.txt for dependency conflicts

### 4.2 Open JupyterLab

- Go to your Space: https://huggingface.co/spaces/YOUR_USERNAME/fbmc-chronos2-forecast
- Click "Open in JupyterLab" (top right)
- JupyterLab opens in a new tab
### 4.3 Create Test Notebook

In JupyterLab, create `notebooks/00_test_setup.ipynb`:

**Cell 1: Test GPU**

```python
import torch

print(f"GPU available: {torch.cuda.is_available()}")
print(f"GPU device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'None'}")
print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
```

Expected output:

```
GPU available: True
GPU device: NVIDIA A10G
GPU memory: 22.73 GB
```
**Cell 2: Load Dataset**

```python
from datasets import load_dataset
import polars as pl

# Load unified features from the HF Dataset
dataset = load_dataset("YOUR_USERNAME/fbmc-features-24month", split="train")
df = pl.from_pandas(dataset.to_pandas())

print(f"Shape: {df.shape[0]:,} rows × {df.shape[1]:,} columns")
print(f"Columns: {df.columns[:10]}")
print(f"Date range: {df['timestamp'].min()} to {df['timestamp'].max()}")
```

Expected output:

```
Shape: 17,544 rows × 2,553 columns
Columns: ['timestamp', 'cnec_t1_binding_10T-DE-FR-000068', ...]
Date range: 2023-10-01 00:00:00 to 2025-09-30 23:00:00
```
**Cell 3: Load Metadata**

```python
import pandas as pd

# Load metadata directly from the dataset repo
metadata = pd.read_csv(
    "hf://datasets/YOUR_USERNAME/fbmc-features-24month/metadata.csv"
)

# Check future covariates
future_covs = metadata[metadata['is_future_covariate'] == 'true']['feature_name'].tolist()
print(f"Future covariates: {len(future_covs)}")
print(f"Historical features: {len(metadata) - len(future_covs)}")
print(f"\nCategories: {metadata['category'].unique()}")
```

Expected output:

```
Future covariates: 615
Historical features: 1,938
Categories: ['CNEC_Tier1', 'CNEC_Tier2', 'Weather', 'LTA', 'Temporal', ...]
```
**Cell 4: Test Chronos 2 Loading**

```python
import torch
from chronos import ChronosPipeline

# Load Chronos 2 Large (downloads ~3 GB on first run)
print("Loading Chronos 2 Large...")
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-large",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)
print("[OK] Chronos 2 loaded successfully")
print(f"Model device: {pipeline.model.device}")
```

Expected output:

```
Loading Chronos 2 Large...
[OK] Chronos 2 loaded successfully
Model device: cuda:0
```

**IMPORTANT**: The first time you load Chronos 2, it will download ~3 GB. This takes 5-10 minutes. Subsequent runs use the cached model.
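Optionally, add a fifth cell that runs a tiny forecast to confirm inference works end to end. This is a sketch on a synthetic series; the horizon is arbitrary, and `pipeline` comes from Cell 4:

```python
# Optional Cell 5: smoke-test inference on a synthetic hourly signal.
import torch

context = torch.sin(torch.arange(512) / 24.0)  # fake hourly series
forecast = pipeline.predict(context, prediction_length=24)  # pipeline from Cell 4
print(f"Forecast shape: {tuple(forecast.shape)}")  # (1, num_samples, 24)
```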
### 4.4 Run All Cells

- Execute all cells in order
- Verify all outputs match the expected results
- If any cell fails, check the error messages and troubleshoot
## STEP 5: Commit Test Notebook to Space

```bash
# In the JupyterLab terminal or locally
git add notebooks/00_test_setup.ipynb
git commit -m "test: verify GPU, data loading, and Chronos 2 model"
git push
```
## Troubleshooting

### Build Fails

**Error**: `Collecting chronos-forecasting>=2.0.0: Could not find a version...`

- Fix: Check that the chronos-forecasting version exists on PyPI
- Try: `chronos-forecasting==2.0.0` (pin exact version)

**Error**: `torch 2.0.0 conflicts with transformers...`

- Fix: Pin compatible versions in requirements.txt
- Try: `torch==2.1.0` and `transformers==4.36.0`
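For reference, a pinned requirements.txt could look like the sketch below. These versions are assumptions; align them with whatever `hf_space_requirements.txt` pins in your project:

```
# Sketch of a pinned requirements.txt (versions are assumptions)
torch==2.1.0
transformers==4.36.0
chronos-forecasting==2.0.0
datasets
huggingface-hub
polars
pandas
```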
### GPU Not Detected

**Issue**: `GPU available: False`

- Check: Space Settings → Hardware → should show "A10G"
- Fix: Restart the Space (Settings → Restart Space)

### Dataset Not Found

**Error**: `Repository Not Found for url: https://huggingface.co/datasets/...`

- Check: The dataset name in code matches the repository
- Fix: Replace `YOUR_USERNAME` with your actual HuggingFace username
- Verify: The dataset is public, or HF_TOKEN is set in Space secrets
### Out of Memory

**Error**: `CUDA out of memory`

- Cause: The A10G has 24 GB of VRAM, which may not be enough for an 8,192-hour context plus a large batch
- Fix: Temporarily reduce the context window to 512 hours
- Fix: Process borders in smaller batches (10 at a time); see the sketch below
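A minimal sketch of the batching idea; `forecast_border` is a hypothetical callable wrapping your Chronos call for one border:

```python
# Sketch: forecast the 38 target borders in batches to limit VRAM use.
# `forecast_border` is a hypothetical wrapper around the Chronos call.
import torch

def forecast_in_batches(borders, forecast_border, batch_size=10):
    results = {}
    for i in range(0, len(borders), batch_size):
        for border in borders[i:i + batch_size]:
            results[border] = forecast_border(border)
        torch.cuda.empty_cache()  # release cached GPU memory between batches
    return results
```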
## Next Steps (Day 3, Hours 5-8)

Once the test notebook runs successfully:

- Hours 5-6: Create `src/inference/data_fetcher.py` (AsOfDateFetcher class)
- Hours 7-8: Create `src/inference/chronos_pipeline.py` (ChronosForecaster class)
- Smoke test: Run inference on 1 border × 7 days

See the main implementation plan for details.
## Success Criteria

At the end of STEP 5, you should have:

- HF Dataset repository created and populated (3 files)
- HF Space created with an A10G GPU ($30/month)
- Space secrets configured (HF_TOKEN, ENTSOE_API_KEY)
- Source code pushed to the Space
- Space builds successfully (~10-15 min)
- JupyterLab accessible
- GPU detected (NVIDIA A10G, 22.73 GB)
- Dataset loads (17,544 × 2,553)
- Metadata loads (2,553 features, 615 future covariates)
- Chronos 2 loads successfully (~3 GB download first time)
- Test notebook committed to the Space

Estimated time: ~40 minutes of active work + ~25 minutes waiting for builds

Questions? Check the HuggingFace Spaces documentation: https://huggingface.co/docs/hub/spaces