
HuggingFace Space Setup Guide - FBMC Chronos 2

IMPORTANT: This guide covers Day 3, Hours 1-4 of the implementation plan. Complete all steps before proceeding to inference pipeline development.


Prerequisites

  • A HuggingFace account with a write token (hf_...)
  • Processed feature files in data/processed/ (features_unified_24month.parquet, features_unified_metadata.csv, target_borders_list.txt)
  • A local Python environment (.venv) with uv installed
  • git

STEP 1: Create HuggingFace Dataset Repository (10 min)

1.1 Create Dataset on HuggingFace Web UI

  1. Go to: https://huggingface.co/new-dataset
  2. Fill in:
    • Owner: YOUR_USERNAME
    • Dataset name: fbmc-features-24month
    • License: MIT
    • Visibility: Private (contains project data)
  3. Click "Create dataset"
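
Alternatively, the repository can be created from the command line. A minimal sketch using huggingface_hub's create_repo (assumes HF_TOKEN is already in your environment; this helper is not part of the original instructions):

import os
from huggingface_hub import create_repo

# Create the private dataset repo (same settings as the web UI steps above)
create_repo(
    repo_id="YOUR_USERNAME/fbmc-features-24month",
    repo_type="dataset",
    private=True,
    token=os.environ["HF_TOKEN"],
)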

1.2 Upload Data to Dataset

Option A: Using the upload script (Recommended)

# 1. Add your HF token to .env file
echo "HF_TOKEN=hf_..." >> .env

# 2. Edit the script to replace YOUR_USERNAME with your actual HF username
# Edit: scripts/upload_to_hf_datasets.py
# Replace all instances of "YOUR_USERNAME" with your HuggingFace username

# 3. Install required packages
.venv\Scripts\uv.exe pip install datasets huggingface-hub

# 4. Run the upload script
.venv\Scripts\python.exe scripts\upload_to_hf_datasets.py

The script will upload:

  • features_unified_24month.parquet (~25 MB)
  • metadata.csv (2,553 features)
  • target_borders.txt (38 target borders)
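
The script itself isn't reproduced here; conceptually it boils down to three upload_file calls. A minimal sketch (an assumption about the script's logic, not the actual contents of scripts/upload_to_hf_datasets.py):

import os
from huggingface_hub import HfApi

api = HfApi(token=os.environ["HF_TOKEN"])

# Upload each processed file, renaming to the target filenames listed above
for local_path, repo_path in [
    ("data/processed/features_unified_24month.parquet", "features_unified_24month.parquet"),
    ("data/processed/features_unified_metadata.csv", "metadata.csv"),
    ("data/processed/target_borders_list.txt", "target_borders.txt"),
]:
    api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo=repo_path,
        repo_id="YOUR_USERNAME/fbmc-features-24month",
        repo_type="dataset",
    )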

Option B: Manual upload via web UI

  1. Go to: https://huggingface.co/datasets/YOUR_USERNAME/fbmc-features-24month
  2. Click "Files" tab → "Add file" → "Upload files"
  3. Upload:
    • data/processed/features_unified_24month.parquet
    • data/processed/features_unified_metadata.csv (rename to metadata.csv)
    • data/processed/target_borders_list.txt (rename to target_borders.txt)

1.3 Verify Dataset Uploaded

Visit: https://huggingface.co/datasets/YOUR_USERNAME/fbmc-features-24month

You should see:

  • features_unified_24month.parquet (~25 MB)
  • metadata.csv (~200 KB)
  • target_borders.txt (~1 KB)
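
You can also verify programmatically; a quick sketch with huggingface_hub (HF_TOKEN is needed because the dataset is private):

import os
from huggingface_hub import HfApi

files = HfApi(token=os.environ["HF_TOKEN"]).list_repo_files(
    "YOUR_USERNAME/fbmc-features-24month", repo_type="dataset"
)
print(files)  # expect the three files above, plus .gitattributes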

STEP 2: Create HuggingFace Space (15 min)

2.1 Create Space on HuggingFace Web UI

  1. Go to: https://huggingface.co/new-space
  2. Fill in:
    • Owner: YOUR_USERNAME
    • Space name: fbmc-chronos2-forecast
    • License: MIT
    • Select SDK: JupyterLab
    • Space hardware: Click "Advanced" → Select A10G GPU (24GB) ($30/month)
    • Visibility: Private (contains API keys)
  3. Click "Create Space"

IMPORTANT: The Space will start building immediately. The first build takes ~10-15 minutes.

2.2 Configure Space Secrets

While the Space is building:

  1. Go to Space → Settings → Variables and Secrets

  2. Add these secrets (click "New secret"):

    | Name | Value | Description |
    |------|-------|-------------|
    | HF_TOKEN | hf_... | Your HuggingFace write token |
    | ENTSOE_API_KEY | your_key | ENTSO-E Transparency API key |

  3. Click "Save"
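
Inside a running Space, these secrets are exposed to your code as environment variables, so they can be read with os.environ. A minimal check you can run later in JupyterLab:

import os

# Space secrets appear as environment variables at runtime
print("HF_TOKEN set:", os.environ.get("HF_TOKEN") is not None)
print("ENTSOE_API_KEY set:", os.environ.get("ENTSOE_API_KEY") is not None)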

2.3 Wait for Initial Build

  • Monitor build logs: Space → Logs tab
  • Wait for message: "Your Space is up and running"
  • This can take 10-15 minutes for the first build

STEP 3: Clone Space Locally (5 min)

3.1 Clone the Space Repository

# Navigate to projects directory
cd C:\Users\evgue\projects

# Clone the Space (replace YOUR_USERNAME)
git clone https://huggingface.co/spaces/YOUR_USERNAME/fbmc-chronos2-forecast

# Navigate into Space directory
cd fbmc-chronos2-forecast

3.2 Copy Project Files to Space

# Copy source code
cp -r ../fbmc_chronos2/src ./

# Copy requirements (rename to requirements.txt)
cp ../fbmc_chronos2/hf_space_requirements.txt ./requirements.txt

# Copy .env.example (for documentation)
cp ../fbmc_chronos2/.env.example ./

# Create directories
mkdir -p data/evaluation
mkdir -p notebooks
mkdir -p tests

3.3 Create Space README.md

Create README.md in the Space directory with:

---
title: FBMC Chronos 2 Forecast
emoji: 
colorFrom: blue
colorTo: green
sdk: jupyterlab
sdk_version: "4.0.0"
app_file: app.py
pinned: false
license: mit
hardware: a10g-small
---

# FBMC Flow Forecasting - Zero-Shot Inference

Amazon Chronos 2 for cross-border capacity forecasting.

## Features
- 2,553 features (615 future covariates)
- 38 bidirectional border targets (19 physical borders)
- 8,192-hour context window
- Dynamic date-driven inference
- A10G GPU acceleration

## Quick Start

### Launch JupyterLab
1. Open this Space
2. Wait for build to complete (~10-15 min first time)
3. Click "Open in JupyterLab"

### Run Inference
See `notebooks/01_test_inference.ipynb` for examples.

## Data Source
- **Dataset**: [YOUR_USERNAME/fbmc-features-24month](https://huggingface.co/datasets/YOUR_USERNAME/fbmc-features-24month)
- **Size**: 25 MB (17,544 hours × 2,553 features)
- **Period**: Oct 2023 - Sept 2025

## Model
- **Chronos 2 Large** (710M parameters)
- **Pretrained**: amazon/chronos-t5-large
- **Zero-shot**: No fine-tuning in MVP

## Cost
- A10G GPU: $30/month
- Storage: <1 GB (free tier)

3.4 Push Initial Files to Space

# Stage files
git add README.md requirements.txt .env.example src/

# Commit
git commit -m "feat: initial Space setup with A10G GPU and source code"

# Push to HuggingFace
git push

IMPORTANT: After pushing, the Space will rebuild (~10-15 min). Monitor the build in the Logs tab.


STEP 4: Test Space Environment (10 min)

4.1 Wait for Build to Complete

  • Go to Space → Logs tab
  • Wait for: "Your Space is up and running"
  • If build fails, check requirements.txt for dependency conflicts

4.2 Open JupyterLab

  1. Go to your Space: https://huggingface.co/spaces/YOUR_USERNAME/fbmc-chronos2-forecast
  2. Click "Open in JupyterLab" (top right)
  3. JupyterLab will open in new tab

4.3 Create Test Notebook

In JupyterLab, create notebooks/00_test_setup.ipynb:

Cell 1: Test GPU

import torch

print(f"GPU available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU device: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("GPU device: None")

Expected output:

GPU available: True
GPU device: NVIDIA A10G
GPU memory: 22.73 GB

Cell 2: Load Dataset

from datasets import load_dataset
import polars as pl

# Load unified features from HF Dataset
dataset = load_dataset("YOUR_USERNAME/fbmc-features-24month", split="train")
df = pl.from_pandas(dataset.to_pandas())

print(f"Shape: {df.shape[0]:,} rows × {df.shape[1]:,} columns")
print(f"Columns: {df.columns[:10]}")
print(f"Date range: {df['timestamp'].min()} to {df['timestamp'].max()}")

Expected output:

Shape: 17,544 rows × 2,553 columns
Columns: ['timestamp', 'cnec_t1_binding_10T-DE-FR-000068', ...]
Date range: 2023-10-01 00:00:00 to 2025-09-30 23:00:00

Cell 3: Load Metadata

import pandas as pd

# Load metadata
metadata = pd.read_csv(
    "hf://datasets/YOUR_USERNAME/fbmc-features-24month/metadata.csv"
)

# Check future covariates
future_covs = metadata[metadata['is_future_covariate'] == 'true']['feature_name'].tolist()
print(f"Future covariates: {len(future_covs)}")
print(f"Historical features: {len(metadata) - len(future_covs)}")
print(f"\nCategories: {metadata['category'].unique()}")

Expected output:

Future covariates: 615
Historical features: 1,938

Categories: ['CNEC_Tier1', 'CNEC_Tier2', 'Weather', 'LTA', 'Temporal', ...]

Cell 4: Test Chronos 2 Loading

from chronos import ChronosPipeline

# Load Chronos 2 Large (this will download ~3 GB on first run)
print("Loading Chronos 2 Large...")
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-large",
    device_map="cuda",
    torch_dtype=torch.bfloat16
)
print("[OK] Chronos 2 loaded successfully")
print(f"Model device: {pipeline.model.device}")

Expected output:

Loading Chronos 2 Large...
[OK] Chronos 2 loaded successfully
Model device: cuda:0

IMPORTANT: The first time you load Chronos 2, it will download ~3 GB. This takes 5-10 minutes. Subsequent runs will use cached model.
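
As an optional extra cell (not part of the original plan), you can confirm the pipeline produces forecasts end to end on a synthetic series before wiring up real data:

# Optional Cell 5: zero-shot forecast on a dummy random-walk series
import torch

context = torch.randn(512).cumsum(0)         # stand-in for a real border time series
forecast = pipeline.predict(context, prediction_length=24)
print(forecast.shape)                        # (1, num_samples, 24)
print(forecast[0].median(dim=0).values[:5])  # median forecast, first 5 hours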

4.4 Run All Cells

  • Execute all cells in order
  • Verify all outputs match expected results
  • If any cell fails, check error messages and troubleshoot

STEP 5: Commit Test Notebook to Space

# In JupyterLab terminal or locally
git add notebooks/00_test_setup.ipynb
git commit -m "test: verify GPU, data loading, and Chronos 2 model"
git push

Troubleshooting

Build Fails

Error: Collecting chronos-forecasting>=2.0.0: Could not find a version...

  • Fix: Check that the specified chronos-forecasting version exists on PyPI
  • Try: chronos-forecasting==2.0.0 (pin an exact version)

Error: torch 2.0.0 conflicts with transformers...

  • Fix: Pin compatible versions in requirements.txt
  • Try: torch==2.1.0 and transformers==4.36.0
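
For reference, a pinned requirements.txt along these lines is one plausible starting point (illustrative versions, not tested pins for this project):

# requirements.txt (illustrative)
torch==2.1.0
transformers==4.36.0
chronos-forecasting==2.0.0
datasets
huggingface-hub
polars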

GPU Not Detected

Issue: GPU available: False

  • Check: Space Settings → Hardware → Should show "A10G"
  • Fix: Restart Space (Settings → Restart Space)
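
You can also confirm what the container actually sees from a JupyterLab terminal (or notebook cells prefixed with !):

nvidia-smi   # should list an NVIDIA A10G
python -c "import torch; print(torch.__version__, torch.version.cuda)"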

Dataset Not Found

Error: Repository Not Found for url: https://huggingface.co/datasets/...

  • Check: Dataset name matches in code
  • Fix: Replace YOUR_USERNAME with actual HuggingFace username
  • Verify: Dataset is public or HF_TOKEN is set in Space secrets

Out of Memory

Error: CUDA out of memory

  • Cause: A10G has 24 GB VRAM, which may not be enough for an 8,192-hour context plus a large batch
  • Fix: Reduce the context window to 512 hours temporarily
  • Fix: Process borders in smaller batches (10 at a time; see the sketch below)
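
A minimal sketch of the batched approach (hypothetical; forecast_border stands in for the per-border inference call you build next):

# Process the 38 borders in chunks of 10 to bound GPU memory use
import torch

target_borders = [line.strip() for line in open("target_borders.txt")]  # 38 entries

def forecast_border(border: str):
    ...  # placeholder for your per-border inference call

BATCH = 10
for i in range(0, len(target_borders), BATCH):
    for border in target_borders[i : i + BATCH]:
        forecast_border(border)
    torch.cuda.empty_cache()  # release cached GPU memory between chunks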

Next Steps (Day 3, Hours 5-8)

Once the test notebook runs successfully:

  1. Hour 5-6: Create src/inference/data_fetcher.py (AsOfDateFetcher class)
  2. Hour 7-8: Create src/inference/chronos_pipeline.py (ChronosForecaster class)
  3. Smoke test: Run inference on 1 border × 7 days

See main implementation plan for details.
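
To make those deliverables concrete, here is a hypothetical interface for the two classes named above (signatures are illustrative only, not the final design):

# src/inference/ -- illustrative skeletons, not the implementation
import polars as pl

class AsOfDateFetcher:
    """Fetch features as they would have looked at a given as-of date."""
    def __init__(self, dataset_id: str = "YOUR_USERNAME/fbmc-features-24month"):
        self.dataset_id = dataset_id

    def fetch(self, as_of: str, context_hours: int = 8192) -> pl.DataFrame:
        raise NotImplementedError  # Hour 5-6 deliverable

class ChronosForecaster:
    """Wrap Chronos 2 for per-border zero-shot forecasts."""
    def __init__(self, model_id: str = "amazon/chronos-t5-large"):
        self.model_id = model_id

    def predict_border(self, features: pl.DataFrame, border: str, horizon_hours: int = 168):
        raise NotImplementedError  # Hour 7-8 deliverable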


Success Criteria

At end of STEP 5, you should have:

  • HF Dataset repository created and populated (3 files)
  • HF Space created with A10G GPU ($30/month)
  • Space secrets configured (HF_TOKEN, ENTSOE_API_KEY)
  • Source code pushed to Space
  • Space builds successfully (~10-15 min)
  • JupyterLab accessible
  • GPU detected (NVIDIA A10G, 22.73 GB)
  • Dataset loads (17,544 × 2,553)
  • Metadata loads (2,553 features, 615 future covariates)
  • Chronos 2 loads successfully (~3 GB download first time)
  • Test notebook committed to Space

Estimated time: ~40 minutes active work + ~25 minutes waiting for builds


Questions? Check HuggingFace Spaces documentation: https://huggingface.co/docs/hub/spaces