Evgueni Poloukarov committed
Commit 3254242 · 1 Parent(s): 5ff6f25
debug: add GPU memory profiling diagnostics to identify 36GB baseline usage
Investigation: baseline memory consumption is 36GB where ~2GB is expected
- Model: 710M params * 2 bytes (bfloat16) = 1.4GB expected
- Dataset: ~50MB expected
- Actual: 36.09GB allocated by PyTorch (34GB unaccounted!)
Added memory profiling:
- After model loading (chronos_inference.py:86-90)
- After dataset loading (chronos_inference.py:118-122)
- Logs GPU memory allocated and reserved at each step
This will help identify which loading step consumes the unexpected 34GB.
Related: OOM errors on 48GB GPU when trying 1,440-hour context window
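The expected-memory arithmetic above (710M params × 2 bytes for bfloat16 ≈ 1.4GB) can be sketched as a small helper. This is an illustrative assumption, not code from the repository: it counts weight bytes only and ignores activations, KV caches, and CUDA allocator overhead, any of which could explain part of the gap.

```python
# Bytes per element for common weight dtypes (assumption: weights only,
# no optimizer state, activations, or allocator overhead counted).
DTYPE_BYTES = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

def expected_model_memory_gb(num_params: int, dtype: str = "bfloat16") -> float:
    """Lower bound on GPU memory for model weights: params * bytes-per-element."""
    return num_params * DTYPE_BYTES[dtype] / 1e9

# 710M parameters in bfloat16, as in the commit message:
print(f"{expected_model_memory_gb(710_000_000):.2f} GB")  # ~1.42 GB
```

Comparing this lower bound against `torch.cuda.memory_allocated()` after each loading step is exactly what the added diagnostics below do.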
src/forecasting/chronos_inference.py
CHANGED
@@ -83,6 +83,12 @@ class ChronosInferencePipeline:
         print(f"Model loaded in {time.time() - start_time:.1f}s")
         print(f"  Device: {next(self._pipeline.model.parameters()).device}")

+        # Memory profiling diagnostics
+        if torch.cuda.is_available():
+            print(f"  [MEMORY] After model load:")
+            print(f"    GPU memory allocated: {torch.cuda.memory_allocated()/1e9:.2f} GB")
+            print(f"    GPU memory reserved: {torch.cuda.memory_reserved()/1e9:.2f} GB")
+
         return self._pipeline

     def _load_dataset(self):
@@ -109,6 +115,12 @@ class ChronosInferencePipeline:
         print(f"  Shape: {self._dataset.shape}")
         print(f"  Borders: {len(self._borders)}")

+        # Memory profiling diagnostics
+        if torch.cuda.is_available():
+            print(f"  [MEMORY] After dataset load:")
+            print(f"    GPU memory allocated: {torch.cuda.memory_allocated()/1e9:.2f} GB")
+            print(f"    GPU memory reserved: {torch.cuda.memory_reserved()/1e9:.2f} GB")
+
         return self._dataset, self._borders

     def run_forecast(