Evgueni Poloukarov committed on
Commit 3254242 · 1 Parent(s): 5ff6f25

debug: add GPU memory profiling diagnostics to identify 36GB baseline usage


Investigation: baseline memory consumption shows 36GB where ~2GB is expected
- Model: 710M params * 2 bytes (bfloat16) = ~1.4GB expected
- Dataset: ~50MB expected
- Actual: 36.09GB allocated by PyTorch (~34GB unaccounted for!)
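The expected-vs-actual arithmetic above can be checked directly. A minimal sketch — the 710M/bfloat16/50MB/36.09GB figures come from the message; the helper name `expected_model_gb` is mine:

```python
def expected_model_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Expected parameter memory in GB (bfloat16 = 2 bytes per param)."""
    return n_params * bytes_per_param / 1e9

model_gb = expected_model_gb(710e6)  # ~1.42 GB for the 710M-param model
dataset_gb = 0.05                    # ~50MB dataset, per the message
actual_gb = 36.09                    # reported torch.cuda.memory_allocated()

# Gap between what PyTorch holds and what the model + data should need
unaccounted_gb = actual_gb - model_gb - dataset_gb
print(f"expected: {model_gb + dataset_gb:.2f} GB, "
      f"unaccounted: {unaccounted_gb:.2f} GB")
```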

Added memory profiling:
- After model loading (chronos_inference.py:86-90)
- After dataset loading (chronos_inference.py:118-122)
- Logs GPU memory allocated and reserved at each step

This will help identify which loading step consumes the unexpected 34GB.
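The diagnostics reduce to reading two CUDA allocator counters after each loading step. A hedged standalone sketch of the same idea — the helper names `gb` and `report_gpu_memory` are mine, not from the patch; the helper is a no-op on CPU-only machines:

```python
def gb(n_bytes: float) -> str:
    """Format a byte count as gigabytes with two decimals."""
    return f"{n_bytes / 1e9:.2f} GB"

def report_gpu_memory(label: str) -> None:
    """Print allocated vs. reserved CUDA memory; silently skip without a GPU."""
    try:
        import torch
    except ImportError:
        return
    if not torch.cuda.is_available():
        return
    print(f" [MEMORY] {label}:")
    print(f"   GPU memory allocated: {gb(torch.cuda.memory_allocated())}")
    print(f"   GPU memory reserved:  {gb(torch.cuda.memory_reserved())}")
```

Note that `memory_reserved()` counts the whole pool held by PyTorch's caching allocator, so it is always >= `memory_allocated()`; comparing the two at each step shows whether the 34GB is live tensors or cached blocks.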

Related: OOM errors on a 48GB GPU when attempting a 1,440-hour context window

src/forecasting/chronos_inference.py CHANGED
@@ -83,6 +83,12 @@ class ChronosInferencePipeline:
         print(f"Model loaded in {time.time() - start_time:.1f}s")
         print(f" Device: {next(self._pipeline.model.parameters()).device}")
 
+        # Memory profiling diagnostics
+        if torch.cuda.is_available():
+            print(f" [MEMORY] After model load:")
+            print(f" GPU memory allocated: {torch.cuda.memory_allocated()/1e9:.2f} GB")
+            print(f" GPU memory reserved: {torch.cuda.memory_reserved()/1e9:.2f} GB")
+
         return self._pipeline
 
     def _load_dataset(self):
@@ -109,6 +115,12 @@ class ChronosInferencePipeline:
         print(f" Shape: {self._dataset.shape}")
         print(f" Borders: {len(self._borders)}")
 
+        # Memory profiling diagnostics
+        if torch.cuda.is_available():
+            print(f" [MEMORY] After dataset load:")
+            print(f" GPU memory allocated: {torch.cuda.memory_allocated()/1e9:.2f} GB")
+            print(f" GPU memory reserved: {torch.cuda.memory_reserved()/1e9:.2f} GB")
+
         return self._dataset, self._borders
 
     def run_forecast(
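As a possible follow-up (not part of this commit), PyTorch's built-in allocator report can break the reserved pool down by active vs. cached blocks, which may localize the unaccounted ~34GB faster than per-step counters. A sketch, guarded so it is safe where torch is not installed:

```python
import importlib.util

# Hypothetical follow-up diagnostic; only runs where torch + CUDA exist.
torch_available = importlib.util.find_spec("torch") is not None

if torch_available:
    import torch
    if torch.cuda.is_available():
        # Per-device breakdown of the caching allocator's state
        print(torch.cuda.memory_summary(abbreviated=True))
```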