Update README.md

#1
by kashif HF Staff - opened
Files changed (1)
  1. README.md +124 -66
README.md CHANGED
@@ -6,13 +6,16 @@ thumbnail: https://www.theforecastingcompany.com/images/og.png
  tags:
  - time-series
  - forecasting
  - foundation-models
  - pretrained-models
  - safetensors
  - model_hub_mixin
  - pytorch_model_hub_mixin
  model-index:
- - name: t0
  results:
  - task:
  type: time-series-forecasting
@@ -36,42 +39,74 @@ model-index:
  <img src="https://www.theforecastingcompany.com/logo/logo_horizontal_pride_universal.png" alt="The Forecasting Company" width="280" />
  </p>

- # `t0`

- Open-weights time-series forecasting foundation model from [The Forecasting Company](https://theforecastingcompany.com/).
- `t0` is a transformer-based model that
- produces probabilistic multi-horizon forecasts and natively operates on
- multiple covariates. `t0-alpha` is our first iteration of the model.

- You can use `t0` on [Retrocast](https://app.retrocast.com/), our platform for forecasting on your own data. You can also compare forecast across different open-weight models.

  ![t0 forecasting French national electricity demand in Retrocast](assets/enedis_with_holidays.png)

- _`t0` forecasting French national electricity demand in Retrocast. Data:
- [Enedis open data](https://data.enedis.fr/)._

- ## 📈 Forecasting with covariates

- `t0` leverages covariate information, in the past and future when
- available, to improve its forecast.

  | Without covariates | With covariates |
  | ----------------------------------------------------------------- | ----------------------------------------------------------- |
  | ![t0 forecast without covariates](assets/medicam_without_cov.png) | ![t0 forecast with covariates](assets/medicam_with_cov.png) |

- _Data: [Medic'AM](https://www.assurance-maladie.ameli.fr/etudes-et-donnees/medicaments-classe-atc-medicam),
- monthly drug reimbursements from the French national health insurance._

- The [Quickstart](#-quickstart) below shows the API for both a plain
- univariate forecast and a multivariate forecast that conditions on
- historical and known-future covariates.

- ## 🚀 Quickstart

  ```bash
  pip install tfc-t0
  ```

  The simplest path is a univariate forecast through `predict`:

  ```python
@@ -86,18 +121,11 @@ out.quantiles # (4, 64, 3)
  out.median # (4, 64)
  ```

- `predict` accepts `numpy` arrays. 1-D contexts are auto-promoted to a
- single-row batch. NaN values in the context are treated as missing
- observations.

- ### Forecasting with covariates

- Anything you know over the **past** goes in `context` - alongside the
- target, extra variates attend to it and are forecast together. Anything
- you know over the **future** (calendar features, planned promotions,
- weather forecasts) goes in `future_covariates`, shaped
- `[B, F, context + horizon]`; the model conditions on it but does not
- forecast it.

  ```python
  import torch
@@ -118,46 +146,72 @@ out.quantiles # (2, 64, 3)
  out.median # (2, 64)
  ```

  ## 🏗️ Architecture

- `t0` is a decoder-style patch transformer that alternates time and
- covariate attention layers. It predicts 5 quantiles (0.1, 0.25, 0.5,
- 0.75, 0.9), decoding multiple horizons in parallel - up to 1024
- timesteps in one forward pass - and falling back on autoregressive
- rollout for longer horizons.
-
- | | |
- | --------------- | ------------------------- |
- | Parameters | ~102M |
- | Layers | 24 |
- | Embedding dim | 512 |
- | Feedforward dim | 2048 |
- | Attention heads | 8 |
- | Patch size | 32 |
- | Quantile levels | 0.1, 0.25, 0.5, 0.75, 0.9 |
-
- ### 🧬 Lineage
-
- `t0` builds on ideas - and in places, code - from open-source forecasting
- models. We gratefully acknowledge:
-
- - **Toto** by Datadog ([repo](https://github.com/DataDog/toto)) &
- **Chronos-2** by Amazon
- ([repo](https://github.com/amazon-science/chronos-forecasting)) -
- factorizing attention in the time and variates dimension.
- - **TiRex** by NXAI
- ([repo](https://github.com/NX-AI/tirex)) - contiguous patch masking.
-
- Code-level attributions are listed in [`NOTICE`](NOTICE), all under
- Apache-2.0.

  ## 🧰 Public API

- - `T0Forecaster` - `nn.Module` with `from_pretrained` /
- `save_pretrained` (via `huggingface_hub.PyTorchModelHubMixin`) and the
- user-facing `predict(context, horizon, quantiles, future_covariates)`.
- - `T0Config` - frozen dataclass; `T0Config.medium()` is the published
- configuration.

  ## 📚 Citation

@@ -172,6 +226,10 @@ Apache-2.0.

  ## ⚖️ License

- Apache-2.0 - see [LICENSE](LICENSE) and [NOTICE](NOTICE).
- </content>
- </invoke>
  tags:
  - time-series
  - forecasting
+ - probabilistic-forecasting
  - foundation-models
  - pretrained-models
+ - covariates
+ - pytorch
  - safetensors
  - model_hub_mixin
  - pytorch_model_hub_mixin
  model-index:
+ - name: t0-alpha
  results:
  - task:
  type: time-series-forecasting
 
  <img src="https://www.theforecastingcompany.com/logo/logo_horizontal_pride_universal.png" alt="The Forecasting Company" width="280" />
  </p>

+ # `t0-alpha`

+ `t0-alpha` is an open-weights time-series forecasting foundation model from [The Forecasting Company](https://theforecastingcompany.com/).

+ `t0` is a transformer-based model that produces probabilistic multi-horizon forecasts and natively operates on multiple covariates. `t0-alpha` is the first public iteration of the model.
+
+ You can use `t0` on [Retrocast](https://app.retrocast.com/), The Forecasting Company's platform for forecasting on your own data and comparing forecasts across open-weight models.

  ![t0 forecasting French national electricity demand in Retrocast](assets/enedis_with_holidays.png)

+ _`t0` forecasting French national electricity demand in Retrocast. Data: [Enedis open data](https://data.enedis.fr/)._
+
+ ## Model Details
+
+ - Model name: `t0-alpha`
+ - Model family: `t0`
+ - Developer: [The Forecasting Company](https://theforecastingcompany.com/)
+ - Task: probabilistic time-series forecasting
+ - Architecture: decoder-style patch transformer
+ - Parameters: approximately 102M
+ - License: Apache-2.0
+ - Weights: https://huggingface.co/theforecastingcompany/t0-alpha
+ - Source code: https://github.com/theforecastingcompany/tfc-t0
+ - Issues: https://github.com/theforecastingcompany/tfc-t0/issues
+ - Package: `tfc-t0`
+
+ `t0-alpha` is an alpha release intended for research, experimentation, and applied forecasting evaluation.
+
+ ## Intended Use

+ `t0-alpha` is intended for probabilistic time-series forecasting. It can be used for univariate and multivariate forecasting, forecasting with historical or known-future covariates, and multi-horizon forecasting.

+ Known-future covariates can include calendar features, planned events, holidays, promotions, weather forecasts, or other external signals available over the forecast horizon.
+
+ Forecasts should be treated as probabilistic estimates, not guarantees.
+
+ ## 📈 Forecasting With Covariates
+
+ `t0` leverages covariate information, in the past and future when available, to improve its forecast.

  | Without covariates | With covariates |
  | ----------------------------------------------------------------- | ----------------------------------------------------------- |
  | ![t0 forecast without covariates](assets/medicam_without_cov.png) | ![t0 forecast with covariates](assets/medicam_with_cov.png) |

+ _Data: [Medic'AM](https://www.assurance-maladie.ameli.fr/etudes-et-donnees/medicaments-classe-atc-medicam), monthly drug reimbursements from the French national health insurance._

+ The [Quickstart](#quickstart) below shows the API for both a plain univariate forecast and a multivariate forecast that conditions on historical and known-future covariates.

+ ## Installation

  ```bash
  pip install tfc-t0
  ```

+ Requirements:
+
+ - Python `>=3.11,<3.14`
+ - PyTorch `>=2.4,<3`
+
+ Optional extras:
+
+ ```bash
+ pip install "tfc-t0[evaluation]"
+ pip install "tfc-t0[plot]"
+ ```
+
+ ## 🚀 Quickstart
+
  The simplest path is a univariate forecast through `predict`:

  ```python
 
  out.median # (4, 64)
  ```

+ `predict` accepts PyTorch tensors and NumPy arrays.

+ ### Forecasting With Covariates

+ Anything known over the past goes in `context`. Alongside the target, extra variates attend to it and are forecast together. Anything known over the future, such as calendar features, planned promotions, or weather forecasts, goes in `future_covariates`, shaped `[B, F, context + horizon]`; the model conditions on it but does not forecast it.

  ```python
  import torch
 
  out.median # (2, 64)
  ```
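
As a minimal sketch of the `[B, F, context + horizon]` layout described above, the snippet below builds a hypothetical batch with NumPy. The sizes, variable names, and the weekly sine/cosine features are illustrative assumptions and are not part of `tfc-t0`; only the shape convention comes from the README text.

```python
import numpy as np

# Illustrative sizes (assumptions): batch, context length, forecast horizon.
B, T, H = 2, 256, 64

# Target history: one row per series, shape (B, T).
rng = np.random.default_rng(0)
context = rng.normal(size=(B, T)).astype(np.float32)

# Known-future covariates must cover the context AND the horizon.
# Here: sine/cosine encodings of a weekly cycle, so F = 2.
steps = np.arange(T + H)
weekly = np.stack([
    np.sin(2 * np.pi * steps / 7),
    np.cos(2 * np.pi * steps / 7),
])  # shape (F, T + H)

# Repeat the same calendar features for every series in the batch.
future_covariates = np.broadcast_to(weekly, (B, *weekly.shape)).astype(np.float32)

print(context.shape)            # (2, 256)
print(future_covariates.shape)  # (2, 2, 320)
```

The last axis of `future_covariates` is `T + H`, not `H`: the covariate values for the observed window are passed together with the values for the forecast window.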

+ ## Input Contract
+
+ - `context` may be shaped `(B, T)` for batched univariate forecasting.
+ - `context` may also be shaped `(T,)`; it is promoted to a single-row batch.
+ - `context` may be shaped `(B, V, T)` for multiple target variates.
+ - `future_covariates`, when provided, should be shaped `(B, F, context + horizon)`.
+ - NaN values are treated as missing observations.
+ - `horizon` must be at least 1.
+ - Requested quantiles must be non-empty, sorted ascending, unique, and in `(0, 1)`.
+ - The model was trained to emit quantiles `0.1`, `0.25`, `0.5`, `0.75`, and `0.9`.
+ - Requested quantiles are produced by inference-time interpolation when needed.
+ - Horizons up to 1024 timesteps are decoded in one forward pass.
+ - Longer horizons use autoregressive rollout.
+ - Returned forecasts are finite `float32` tensors on the model's device.
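
The quantile rules in the list above can be sketched as follows. This is an illustration under stated assumptions, not the `tfc-t0` implementation: the helper names `check_quantiles` and `interpolate_quantiles` are hypothetical, and linear interpolation between the native levels is assumed because the README only says "inference-time interpolation".

```python
import numpy as np

# Native quantile levels listed in the README.
NATIVE_LEVELS = np.array([0.1, 0.25, 0.5, 0.75, 0.9])

def check_quantiles(levels):
    """Hypothetical validator: non-empty, strictly ascending, inside (0, 1)."""
    q = np.asarray(levels, dtype=float)
    if q.size == 0:
        raise ValueError("quantiles must be non-empty")
    if np.any(q <= 0) or np.any(q >= 1):
        raise ValueError("quantiles must lie in (0, 1)")
    if np.any(np.diff(q) <= 0):
        raise ValueError("quantiles must be sorted ascending and unique")
    return q

def interpolate_quantiles(native_values, requested):
    """Map (..., 5) native quantile values to (..., len(requested)).

    Assumption: linear interpolation in the quantile level. np.interp clamps
    to the end values, so levels outside [0.1, 0.9] are not extrapolated.
    """
    q = check_quantiles(requested)
    flat = native_values.reshape(-1, native_values.shape[-1])
    out = np.stack([np.interp(q, NATIVE_LEVELS, row) for row in flat])
    return out.reshape(*native_values.shape[:-1], q.size)

native = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]])
print(interpolate_quantiles(native, [0.5]))  # [[3.]]
```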
+
  ## 🏗️ Architecture

+ `t0` is a decoder-style patch transformer.
+
+ It encodes each patch from values, within-patch time index, and validity mask. The transformer alternates causal time-axis self-attention with variate-axis group self-attention. Time attention uses time-aware rotary embeddings. Variate attention lets variates in the same sample attend to one another. The stack uses pre-norm RMSNorm blocks, SwiGLU feed-forward layers, and a quantile head.
+
+ At inference, target and historical variates are normalized with causal running statistics. Future covariates use per-row global statistics.
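
To make "causal running statistics" concrete, here is a minimal NumPy sketch. It is an assumption-laden illustration, not the model's scaler: the function name `causal_arcsinh_scale`, the `eps` constant, the handling of NaN values, and the order of operations (standardize first, then apply `arcsinh`) are all guesses. The point it shows is only that the statistics at step `t` use values up to and including `t`, so later observations never change earlier scaled values.

```python
import numpy as np

def causal_arcsinh_scale(x, eps=1e-6):
    """Hypothetical causal scaler over the last axis; NaNs count as missing."""
    x = np.asarray(x, dtype=np.float64)
    valid = ~np.isnan(x)
    filled = np.where(valid, x, 0.0)

    # Running count, mean, and second moment over observed values only.
    count = np.maximum(np.cumsum(valid, axis=-1), 1)
    mean = np.cumsum(filled, axis=-1) / count
    mean_sq = np.cumsum(filled ** 2, axis=-1) / count
    std = np.sqrt(np.maximum(mean_sq - mean ** 2, 0.0)) + eps

    # Missing inputs stay NaN in the output.
    return np.arcsinh((x - mean) / std)

a = causal_arcsinh_scale(np.array([1.0, 2.0, 3.0, 4.0]))
b = causal_arcsinh_scale(np.array([1.0, 2.0, 3.0, 100.0]))
print(np.allclose(a[:3], b[:3]))  # True: the last value does not affect earlier steps
```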
+
+ | Field | Value |
+ | --- | --- |
+ | Parameters | approximately 102M |
+ | Layers | 24 |
+ | Layer pattern | 2 time-attention layers, then 1 group-attention layer |
+ | Time attention layers | 16 |
+ | Group attention layers | 8 |
+ | Embedding dim | 512 |
+ | Feedforward dim | 2048 |
+ | Attention heads | 8 |
+ | Patch size | 32 |
+ | Dropout | 0.1 |
+ | Scaler | causal mean/std with `arcsinh` transform |
+ | Native quantile levels | 0.1, 0.25, 0.5, 0.75, 0.9 |
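
The layer counts in the table follow from the layer pattern. The short check below assumes the pattern repeats cyclically starting from the first layer, which the table does not state explicitly; the names `NUM_LAYERS` and `PATTERN` are illustrative.

```python
from collections import Counter

NUM_LAYERS = 24
# Assumed cycle: 2 time-attention layers, then 1 group-attention layer.
PATTERN = ("time", "time", "group")

layers = [PATTERN[i % len(PATTERN)] for i in range(NUM_LAYERS)]
counts = Counter(layers)

print(counts["time"], counts["group"])  # 16 8
```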
+
+ ## Evaluation
+
+ `t0-alpha` is reported on the [GIFT-Eval leaderboard](https://huggingface.co/spaces/Salesforce/GIFT-Eval).
+
+ | Benchmark | CRPS | MASE |
+ | --- | ---: | ---: |
+ | GIFT-Eval | 0.4941 | 0.7240 |
+
+ Users should also evaluate `t0-alpha` on their own historical backtests. Useful checks include quantile loss, CRPS, MASE, empirical quantile coverage, calibration, and breakdowns by frequency, horizon, domain, history length, and covariate availability.
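
Two of the backtest checks named above, quantile (pinball) loss and empirical quantile coverage, can be computed with a few lines of NumPy. The function names are hypothetical helpers for illustration and are not part of `tfc-t0` or its `evaluation` extra.

```python
import numpy as np

def pinball_loss(y_true, y_pred, level):
    """Mean quantile loss of a forecast at a single quantile level."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(level * diff, (level - 1) * diff)))

def empirical_coverage(y_true, y_pred):
    """Share of observations at or below the forecast quantile.

    For a well-calibrated forecast this is close to the nominal level.
    """
    return float(np.mean(np.asarray(y_true) <= np.asarray(y_pred)))

y_true = np.array([1.0, 2.0, 3.0, 4.0])
median_forecast = np.array([2.0, 2.0, 2.0, 2.0])

print(pinball_loss(y_true, median_forecast, 0.5))   # 0.5
print(empirical_coverage(y_true, median_forecast))  # 0.5
```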

  ## 🧰 Public API

+ - `T0Forecaster`: the PyTorch module class.
+ - `Forecast`: return object encapsulating forecasted quantiles.
+ - `T0Config`: dataclass to configure the model.
+
+ ## 🧬 Lineage and Attributions
+
+ `t0` builds on ideas from open-source forecasting models. We gratefully acknowledge:
+
+ - **Toto** by Datadog ([repo](https://github.com/DataDog/toto)) and **Chronos-2** by Amazon ([repo](https://github.com/amazon-science/chronos-forecasting)) for factorizing attention across the time and variate dimensions.
+ - **TiRex** by NXAI ([repo](https://github.com/NX-AI/tirex)) for contiguous patch masking.
+
+ Code-level attributions are listed in [`NOTICE`](NOTICE), all under Apache-2.0.
+
+ ## Environmental Impact
+
+ Training compute and carbon emissions are not currently reported.

  ## 📚 Citation

 

  ## ⚖️ License

+ Apache-2.0. See [`LICENSE`](LICENSE) and [`NOTICE`](NOTICE).
+
+ ## Contact
+
+ For issues, bug reports, or API questions, please use the GitHub issue tracker:
+
+ https://github.com/theforecastingcompany/tfc-t0/issues