Evgueni Poloukarov committed
Commit 3254242 · 1 Parent(s): 5ff6f25
debug: add GPU memory profiling diagnostics to identify 36GB baseline usage
Investigation: baseline memory consumption is 36GB where ~2GB is expected
- Model: 710M params * 2 bytes (bfloat16) = 1.4GB expected
- Dataset: ~50MB expected
- Actual: 36.09GB allocated by PyTorch (34GB unaccounted!)
Added memory profiling:
- After model loading (chronos_inference.py:86-90)
- After dataset loading (chronos_inference.py:118-122)
- Logs GPU memory allocated and reserved at each step
This will help identify which loading step consumes the unexpected 34GB.
Related: OOM errors on 48GB GPU when trying 1,440-hour context window
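The expected-memory arithmetic above (710M params × 2 bytes for bfloat16 ≈ 1.4GB) can be sketched as a small helper. This is an illustrative assumption, not code from the repository: it counts weight bytes only and ignores activations, KV caches, and CUDA allocator overhead, any of which could explain part of the gap.

```python
# Bytes per element for common weight dtypes (assumption: weights only,
# no optimizer state, activations, or allocator overhead counted).
DTYPE_BYTES = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

def expected_model_memory_gb(num_params: int, dtype: str = "bfloat16") -> float:
    """Lower bound on GPU memory for model weights: params * bytes-per-element."""
    return num_params * DTYPE_BYTES[dtype] / 1e9

# 710M parameters in bfloat16, as in the commit message:
print(f"{expected_model_memory_gb(710_000_000):.2f} GB")  # ~1.42 GB
```

Comparing this lower bound against `torch.cuda.memory_allocated()` after each loading step is exactly what the added diagnostics below do.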
src/forecasting/chronos_inference.py
CHANGED
@@ -83,6 +83,12 @@ class ChronosInferencePipeline:
         print(f"Model loaded in {time.time() - start_time:.1f}s")
         print(f"  Device: {next(self._pipeline.model.parameters()).device}")

+        # Memory profiling diagnostics
+        if torch.cuda.is_available():
+            print(f"  [MEMORY] After model load:")
+            print(f"    GPU memory allocated: {torch.cuda.memory_allocated()/1e9:.2f} GB")
+            print(f"    GPU memory reserved: {torch.cuda.memory_reserved()/1e9:.2f} GB")
+
         return self._pipeline

     def _load_dataset(self):
@@ -109,6 +115,12 @@ class ChronosInferencePipeline:
         print(f"  Shape: {self._dataset.shape}")
         print(f"  Borders: {len(self._borders)}")

+        # Memory profiling diagnostics
+        if torch.cuda.is_available():
+            print(f"  [MEMORY] After dataset load:")
+            print(f"    GPU memory allocated: {torch.cuda.memory_allocated()/1e9:.2f} GB")
+            print(f"    GPU memory reserved: {torch.cuda.memory_reserved()/1e9:.2f} GB")
+
         return self._dataset, self._borders

     def run_forecast(