LTMeyer committed
Commit 01a3e2f · unverified · 1 parent: aab91c1

doc: edit README

Files changed (1)
  1. README.md +11 -27
README.md CHANGED
@@ -65,19 +65,15 @@ _`t0` forecasting French national electricity demand in Retrocast. Data: [Enedis
 - Issues: https://github.com/theforecastingcompany/tfc-t0/issues
 - Package: `tfc-t0`
 
-`t0-alpha` is an alpha release intended for research, experimentation, and applied forecasting evaluation. Users should validate the model on their own historical data before production use.
+`t0-alpha` is an alpha release intended for research, experimentation, and applied forecasting evaluation.
 
 ## Intended Use
 
-`t0-alpha` is intended for probabilistic time-series forecasting. It can be used for univariate and multivariate forecasting, multi-horizon forecasting, and forecasting with historical or known-future covariates.
+`t0-alpha` is intended for probabilistic time-series forecasting. It can be used for univariate and multivariate forecasting, forecasting with historical or known-future covariates and multi-horizon forecasting.
 
 Known-future covariates can include calendar features, planned events, holidays, promotions, weather forecasts, or other external signals available over the forecast horizon.
 
-## Out-of-Scope Use
-
-`t0-alpha` is not intended for causal inference, anomaly root-cause analysis, automated safety-critical decisions, financial advice, autonomous trading decisions, or deployment without task-specific backtesting.
-
-Forecasts should be treated as probabilistic estimates, not guarantees. Forecast quality may degrade under distribution shift, structural breaks, rare events, missing or inaccurate future covariates, or domains that differ substantially from the model's training distribution.
+Forecasts should be treated as probabilistic estimates, not guarantees.
 
 ## Forecasting With Covariates
 
@@ -125,7 +121,7 @@ out.quantiles # (4, 64, 3)
 out.median # (4, 64)
 ```
 
-`predict` accepts PyTorch tensors and NumPy arrays. One-dimensional contexts are promoted to a single-row batch. NaN values in the context are treated as missing observations.
+`predict` accepts PyTorch tensors and NumPy arrays.
 
 ### Forecasting With Covariates
 
@@ -167,11 +163,11 @@ out.median # (2, 64)
 
 ## Architecture
 
-`t0` is a decoder-style patch transformer. It encodes each patch from values, within-patch time index, and validity mask, then adds learned type embeddings for target, historical, and future variates.
+`t0` is a decoder-style patch transformer.
 
-The transformer alternates causal time-axis self-attention with variate-axis group self-attention. Time attention uses time-aware rotary embeddings with XPos-style scaling; variate attention lets variates in the same sample attend to one another without positional ordering. The stack uses pre-norm RMSNorm blocks, SwiGLU feed-forward layers, and a quantile head that enforces monotonic quantiles with a cumulative softplus parameterization.
+It encodes each patch from values, within-patch time index, and validity mask. The transformer alternates causal time-axis self-attention with variate-axis group self-attention. Time attention uses time-aware rotary embeddings. Variate attention lets variates in the same sample attend to one another. The stack uses pre-norm RMSNorm blocks, SwiGLU feed-forward layers, and a quantile head.
 
-At inference, target and historical variates are normalized with causal running statistics. Future covariates use per-row global statistics. The published configuration applies an `arcsinh` transform after standardization and inverts it when rescaling forecasts.
+At inference, target and historical variates are normalized with causal running statistics. Future covariates use per-row global statistics.
 
 | Field | Value |
 | --- | --- |
@@ -196,25 +192,13 @@ At inference, target and historical variates are normalized with causal running
 | --- | ---: | ---: |
 | GIFT-Eval | 0.4941 | 0.7240 |
 
-Users should also evaluate `t0-alpha` on their own historical backtests before relying on forecasts in production. Useful checks include quantile loss, CRPS, MASE, empirical quantile coverage, calibration, and breakdowns by frequency, horizon, domain, history length, and covariate availability.
-
-## Training Data
-
-Training data details are not currently reported in this model card.
-
-Future revisions should describe the training data domains, time-series frequencies, date ranges, preprocessing and normalization, whether synthetic data was used, whether proprietary or private data was used, whether covariates were included during training, and known exclusions or filtering rules.
-
-## Limitations
-
-Known or expected limitations include reduced reliability under distribution shift, reduced reliability during structural breaks or rare events, possible degradation with very short histories, possible degradation with irregular timestamps, sensitivity to missing or inaccurate future covariates, potential error accumulation for long horizons that require autoregressive rollout, and uncertain performance on domains not represented in the training data.
-
-Forecasts should be monitored over time for calibration, empirical coverage, and task-specific error.
+Users should also evaluate `t0-alpha` on their own historical backtests. Useful checks include quantile loss, CRPS, MASE, empirical quantile coverage, calibration, and breakdowns by frequency, horizon, domain, history length, and covariate availability.
 
 ## Public API
 
-- `T0Forecaster`: PyTorch `nn.Module` with `from_pretrained`, `save_pretrained`, and the user-facing `predict(context, horizon, quantiles, future_covariates)`.
-- `Forecast`: return object containing predicted quantiles, requested quantile levels, and median forecast.
-- `T0Config`: frozen dataclass. `T0Config.medium()` is the published configuration.
+- `T0Forecaster`: The actual PyTorch module class.
+- `Forecast`: return object encapsulating forecasted quantiles.
+- `T0Config`: dataclass to configure the model.
 
 ## Lineage and Attributions
 
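
Both versions of the README in this diff list quantile loss and empirical quantile coverage among the useful backtest checks. As a rough illustration of those two checks, here is a minimal sketch in plain Python. The function names `pinball_loss` and `empirical_coverage` and the toy numbers are assumptions made for this example; they are not part of the `tfc-t0` package or of this commit.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], q: float) -> float:
    """Mean quantile (pinball) loss of forecasts y_pred at quantile level q."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-forecasts (diff >= 0) are weighted by q, over-forecasts by 1 - q.
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def empirical_coverage(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of observations at or below the predicted quantile."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    hits = sum(1 for y, f in zip(y_true, y_pred) if y <= f)
    return hits / len(y_true)


# Toy backtest values, invented purely for illustration.
actuals = [10.0, 12.0, 9.0, 11.0]
q90_forecast = [12.0, 11.0, 10.0, 13.0]

loss = pinball_loss(actuals, q90_forecast, 0.9)
coverage = empirical_coverage(actuals, q90_forecast)
print(f"pinball loss at q=0.9: {loss:.4f}")
print(f"empirical coverage:    {coverage:.2f}")
```

For a well-calibrated forecast, the empirical coverage of the 0.9 quantile should be close to 0.9 over a long enough backtest. Here it is 0.75 on four points, which is far too few to draw a conclusion from.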