Update README.md
#1
by kashif HF Staff - opened
README.md
CHANGED
|
@@ -6,13 +6,16 @@ thumbnail: https://www.theforecastingcompany.com/images/og.png
|
|
| 6 |
tags:
|
| 7 |
- time-series
|
| 8 |
- forecasting
|
|
|
|
| 9 |
- foundation-models
|
| 10 |
- pretrained-models
|
| 11 |
- safetensors
|
| 12 |
- model_hub_mixin
|
| 13 |
- pytorch_model_hub_mixin
|
| 14 |
model-index:
|
| 15 |
-
- name: t0
|
| 16 |
results:
|
| 17 |
- task:
|
| 18 |
type: time-series-forecasting
|
|
@@ -36,42 +39,74 @@ model-index:
|
|
| 36 |
<img src="https://www.theforecastingcompany.com/logo/logo_horizontal_pride_universal.png" alt="The Forecasting Company" width="280" />
|
| 37 |
</p>
|
| 38 |
|
| 39 |
-
# `t0`
|
| 40 |
|
| 41 |
-
|
| 42 |
-
`t0` is a transformer-based model that
|
| 43 |
-
produces probabilistic multi-horizon forecasts and natively operates on
|
| 44 |
-
multiple covariates. `t0-alpha` is our first iteration of the model.
|
| 45 |
|
| 46 |
-
|
| 47 |
|
| 48 |

|
| 49 |
|
| 50 |
-
_`t0` forecasting French national electricity demand in Retrocast. Data:
|
| 51 |
-
|
| 52 |
|
| 53 |
-
|
| 54 |
|
| 55 |
-
|
| 56 |
-
|
| 57 |
|
| 58 |
| Without covariates | With covariates |
|
| 59 |
| ----------------------------------------------------------------- | ----------------------------------------------------------- |
|
| 60 |
|  |  |
|
| 61 |
|
| 62 |
-
_Data: [Medic'AM](https://www.assurance-maladie.ameli.fr/etudes-et-donnees/medicaments-classe-atc-medicam),
|
| 63 |
-
monthly drug reimbursements from the French national health insurance._
|
| 64 |
|
| 65 |
-
The [Quickstart](#
|
| 66 |
-
univariate forecast and a multivariate forecast that conditions on
|
| 67 |
-
historical and known-future covariates.
|
| 68 |
|
| 69 |
-
##
|
| 70 |
|
| 71 |
```bash
|
| 72 |
pip install tfc-t0
|
| 73 |
```
|
| 74 |
|
| 75 |
The simplest path is a univariate forecast through `predict`:
|
| 76 |
|
| 77 |
```python
|
|
@@ -86,18 +121,11 @@ out.quantiles # (4, 64, 3)
|
|
| 86 |
out.median # (4, 64)
|
| 87 |
```
|
| 88 |
|
| 89 |
-
`predict` accepts
|
| 90 |
-
single-row batch. NaN values in the context are treated as missing
|
| 91 |
-
observations.
|
| 92 |
|
| 93 |
-
### Forecasting
|
| 94 |
|
| 95 |
-
Anything
|
| 96 |
-
target, extra variates attend to it and are forecast together. Anything
|
| 97 |
-
you know over the **future** (calendar features, planned promotions,
|
| 98 |
-
weather forecasts) goes in `future_covariates`, shaped
|
| 99 |
-
`[B, F, context + horizon]`; the model conditions on it but does not
|
| 100 |
-
forecast it.
|
| 101 |
|
| 102 |
```python
|
| 103 |
import torch
|
|
@@ -118,46 +146,72 @@ out.quantiles # (2, 64, 3)
|
|
| 118 |
out.median # (2, 64)
|
| 119 |
```
|
| 120 |
|
| 121 |
## Architecture
|
| 122 |
|
| 123 |
-
`t0` is a decoder-style patch transformer
|
| 124 |
-
|
| 125 |
-
|
| 126 |
-
|
| 127 |
-
|
| 128 |
-
|
| 129 |
-
|
|
| 130 |
-
| ---
|
| 131 |
-
| Parameters
|
| 132 |
-
| Layers
|
| 133 |
-
|
|
| 134 |
-
|
|
| 135 |
-
|
|
| 136 |
-
|
|
| 137 |
-
|
|
| 138 |
-
|
| 139 |
-
|
| 140 |
-
|
| 141 |
-
|
| 142 |
-
|
| 143 |
-
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
|
| 147 |
-
|
| 148 |
-
|
| 149 |
-
|
| 150 |
-
|
| 151 |
-
|
| 152 |
-
|
| 153 |
|
| 154 |
## 🧰 Public API
|
| 155 |
|
| 156 |
-
- `T0Forecaster`
|
| 157 |
-
|
| 158 |
-
|
| 159 |
-
|
| 160 |
-
|
|
| 161 |
|
| 162 |
## Citation
|
| 163 |
|
|
@@ -172,6 +226,10 @@ Apache-2.0.
|
|
| 172 |
|
| 173 |
## License
|
| 174 |
|
| 175 |
-
Apache-2.0
|
| 176 |
-
|
| 177 |
-
|
| 6 |
tags:
|
| 7 |
- time-series
|
| 8 |
- forecasting
|
| 9 |
+
- probabilistic-forecasting
|
| 10 |
- foundation-models
|
| 11 |
- pretrained-models
|
| 12 |
+
- covariates
|
| 13 |
+
- pytorch
|
| 14 |
- safetensors
|
| 15 |
- model_hub_mixin
|
| 16 |
- pytorch_model_hub_mixin
|
| 17 |
model-index:
|
| 18 |
+
- name: t0-alpha
|
| 19 |
results:
|
| 20 |
- task:
|
| 21 |
type: time-series-forecasting
|
|
|
|
| 39 |
<img src="https://www.theforecastingcompany.com/logo/logo_horizontal_pride_universal.png" alt="The Forecasting Company" width="280" />
|
| 40 |
</p>
|
| 41 |
|
| 42 |
+
# `t0-alpha`
|
| 43 |
|
| 44 |
+
`t0-alpha` is an open-weights time-series forecasting foundation model from [The Forecasting Company](https://theforecastingcompany.com/).
|
| 45 |
|
| 46 |
+
`t0` is a transformer-based model that produces probabilistic multi-horizon forecasts and natively operates on multiple covariates. `t0-alpha` is the first public iteration of the model.
|
| 47 |
+
|
| 48 |
+
You can use `t0` on [Retrocast](https://app.retrocast.com/), The Forecasting Company's platform for forecasting on your own data and comparing forecasts across open-weight models.
|
| 49 |
|
| 50 |

|
| 51 |
|
| 52 |
+
_`t0` forecasting French national electricity demand in Retrocast. Data: [Enedis open data](https://data.enedis.fr/)._
|
| 53 |
+
|
| 54 |
+
## Model Details
|
| 55 |
+
|
+- Model name: `t0-alpha`
+- Model family: `t0`
+- Developer: [The Forecasting Company](https://theforecastingcompany.com/)
+- Task: probabilistic time-series forecasting
+- Architecture: decoder-style patch transformer
+- Parameters: approximately 102M
+- License: Apache-2.0
+- Weights: https://huggingface.co/theforecastingcompany/t0-alpha
+- Source code: https://github.com/theforecastingcompany/tfc-t0
+- Issues: https://github.com/theforecastingcompany/tfc-t0/issues
+- Package: `tfc-t0`
|
| 67 |
+
|
| 68 |
+
`t0-alpha` is an alpha release intended for research, experimentation, and applied forecasting evaluation.
|
| 69 |
+
|
| 70 |
+
## Intended Use
|
| 71 |
|
| 72 |
+
`t0-alpha` is intended for probabilistic time-series forecasting. It can be used for univariate and multivariate forecasting, forecasting with historical or known-future covariates, and multi-horizon forecasting.
|
| 73 |
|
| 74 |
+
Known-future covariates can include calendar features, planned events, holidays, promotions, weather forecasts, or other external signals available over the forecast horizon.
|
| 75 |
+
|
| 76 |
+
Forecasts should be treated as probabilistic estimates, not guarantees.
|
| 77 |
+
|
| 78 |
+
## Forecasting With Covariates
|
| 79 |
+
|
| 80 |
+
`t0` uses covariate information, both historical and known-future when available, to improve its forecasts.
|
| 81 |
|
| 82 |
| Without covariates | With covariates |
|
| 83 |
| ----------------------------------------------------------------- | ----------------------------------------------------------- |
|
| 84 |
|  |  |
|
| 85 |
|
| 86 |
+
_Data: [Medic'AM](https://www.assurance-maladie.ameli.fr/etudes-et-donnees/medicaments-classe-atc-medicam), monthly drug reimbursements from the French national health insurance._
|
|
|
|
| 87 |
|
| 88 |
+
The [Quickstart](#quickstart) below shows the API for both a plain univariate forecast and a multivariate forecast that conditions on historical and known-future covariates.
|
| 89 |
|
| 90 |
+
## Installation
|
| 91 |
|
| 92 |
```bash
|
| 93 |
pip install tfc-t0
|
| 94 |
```
|
| 95 |
|
+Requirements:
+
+- Python `>=3.11,<3.14`
+- PyTorch `>=2.4,<3`
+
+Optional extras:
+
+```bash
+pip install "tfc-t0[evaluation]"
+pip install "tfc-t0[plot]"
+```
|
| 107 |
+
|
| 108 |
+
## Quickstart
|
| 109 |
+
|
| 110 |
The simplest path is a univariate forecast through `predict`:
|
| 111 |
|
| 112 |
```python
|
|
|
|
| 121 |
out.median # (4, 64)
|
| 122 |
```
|
| 123 |
|
| 124 |
+
`predict` accepts PyTorch tensors and NumPy arrays.
|
| 125 |
|
| 126 |
+
### Forecasting With Covariates
|
| 127 |
|
| 128 |
+
Anything known over the past goes in `context`. Extra variates passed alongside the target attend to it and are forecast together with it. Anything known over the future, such as calendar features, planned promotions, or weather forecasts, goes in `future_covariates`, shaped `[B, F, context + horizon]`; the model conditions on it but does not forecast it.
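As a quick sanity check on shapes, the sketch below works out the `future_covariates` shape implied by a given `context` shape and `horizon`. The helper names `promote_context_shape` and `expected_future_covariates_shape` are hypothetical and not part of `tfc-t0`; they only restate the shape rule described above.

```python
def promote_context_shape(shape):
    """Return the batched form of a context shape: (T,), (B, T) or (B, V, T)."""
    shape = tuple(shape)
    if len(shape) == 1:
        # A bare series (T,) is treated as a single-row batch (1, T).
        return (1, shape[0])
    if len(shape) in (2, 3):
        return shape
    raise ValueError(f"unsupported context shape: {shape}")


def expected_future_covariates_shape(context_shape, num_covariates, horizon):
    """Shape that `future_covariates` should have for a given context and horizon."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    batched = promote_context_shape(context_shape)
    batch_size, context_length = batched[0], batched[-1]
    # Known-future covariates span both the history and the forecast window.
    return (batch_size, num_covariates, context_length + horizon)


print(expected_future_covariates_shape((2, 3, 512), num_covariates=4, horizon=64))
# (2, 4, 576)
```

The last axis is `context + horizon` because the covariate values must cover the observed history as well as the steps being forecast.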
|
| 129 |
|
| 130 |
```python
|
| 131 |
import torch
|
|
|
|
| 146 |
out.median # (2, 64)
|
| 147 |
```
|
| 148 |
|
+## Input Contract
+
+- `context` may be shaped `(B, T)` for batched univariate forecasting.
+- `context` may also be shaped `(T,)`; it is promoted to a single-row batch.
+- `context` may be shaped `(B, V, T)` for multiple target variates.
+- `future_covariates`, when provided, should be shaped `(B, F, context + horizon)`.
+- NaN values are treated as missing observations.
+- `horizon` must be at least 1.
+- Requested quantiles must be non-empty, sorted ascending, unique, and in `(0, 1)`.
+- The model was trained to emit quantiles `0.1`, `0.25`, `0.5`, `0.75`, and `0.9`.
+- Requested quantiles are produced by inference-time interpolation when needed.
+- Horizons up to 1024 timesteps are decoded in one forward pass.
+- Longer horizons use autoregressive rollout.
+- Returned forecasts are finite `float32` tensors on the model's device.
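The quantile rules in the list above can be illustrated with a small standalone sketch. It is not the `tfc-t0` implementation: the function names are hypothetical, and the use of linear interpolation between neighbouring native levels (with clamping outside `0.1` to `0.9`) is an assumption, since the card only states that interpolation happens at inference time.

```python
NATIVE_LEVELS = (0.1, 0.25, 0.5, 0.75, 0.9)  # levels the model was trained to emit


def validate_quantiles(levels):
    """Check the contract: non-empty, strictly ascending, all inside (0, 1)."""
    levels = list(levels)
    if not levels:
        raise ValueError("at least one quantile level is required")
    if any(not 0.0 < q < 1.0 for q in levels):
        raise ValueError("quantile levels must lie strictly between 0 and 1")
    if any(b <= a for a, b in zip(levels, levels[1:])):
        raise ValueError("quantile levels must be sorted ascending and unique")
    return levels


def interpolate_quantile(level, native_values, native_levels=NATIVE_LEVELS):
    """Estimate one quantile value from the native ones.

    Assumption: linear interpolation, clamped to the outermost native levels.
    """
    if level <= native_levels[0]:
        return native_values[0]
    if level >= native_levels[-1]:
        return native_values[-1]
    for i in range(len(native_levels) - 1):
        q_lo, q_hi = native_levels[i], native_levels[i + 1]
        if q_lo <= level <= q_hi:
            weight = (level - q_lo) / (q_hi - q_lo)
            return native_values[i] + weight * (native_values[i + 1] - native_values[i])
    raise ValueError(f"could not place level {level}")


native_values = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up forecast values for one timestep
for q in validate_quantiles([0.25, 0.375, 0.5]):
    print(q, interpolate_quantile(q, native_values))
```

Requesting a level that the model emits natively returns the native value unchanged; only levels in between are interpolated.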
|
| 163 |
+
|
| 164 |
## Architecture
|
| 165 |
|
| 166 |
+
`t0` is a decoder-style patch transformer.
|
| 167 |
+
|
| 168 |
+
It encodes each patch from values, within-patch time index, and validity mask. The transformer alternates causal time-axis self-attention with variate-axis group self-attention. Time attention uses time-aware rotary embeddings. Variate attention lets variates in the same sample attend to one another. The stack uses pre-norm RMSNorm blocks, SwiGLU feed-forward layers, and a quantile head.
|
| 169 |
+
|
| 170 |
+
At inference, target and historical variates are normalized with causal running statistics. Future covariates use per-row global statistics.
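To make "causal running statistics" concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the scaler shipped in `tfc-t0`: the function name, the order of operations (standardize, then `arcsinh`), the population standard deviation, and the `eps` value are all assumed.

```python
import math


def causal_arcsinh_scale(values, eps=1e-6):
    """Scale each point using only statistics of the points seen so far.

    Assumptions (not confirmed by the model card): population std,
    standardization followed by arcsinh, and NaN inputs passed through
    without updating the running statistics.
    """
    scaled = []
    count, total, total_sq = 0, 0.0, 0.0
    for x in values:
        if math.isnan(x):
            # Missing observation: keep it missing and skip the update.
            scaled.append(math.nan)
            continue
        count += 1
        total += x
        total_sq += x * x
        mean = total / count
        variance = max(total_sq / count - mean * mean, 0.0)
        std = math.sqrt(variance)
        scaled.append(math.asinh((x - mean) / (std + eps)))
    return scaled


history = [1.0, 2.0, 3.0]
# Appending a later spike does not change the earlier scaled values.
print(causal_arcsinh_scale(history + [100.0])[:3] == causal_arcsinh_scale(history))
```

The point of the causal form is that a value at time `t` is never scaled with information from after `t`, so the normalized history is the same regardless of what comes later.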
|
| 171 |
+
|
+| Field | Value |
+| --- | --- |
+| Parameters | approximately 102M |
+| Layers | 24 |
+| Layer pattern | 2 time-attention layers, then 1 group-attention layer |
+| Time attention layers | 16 |
+| Group attention layers | 8 |
+| Embedding dim | 512 |
+| Feedforward dim | 2048 |
+| Attention heads | 8 |
+| Patch size | 32 |
+| Dropout | 0.1 |
+| Scaler | causal mean/std with `arcsinh` transform |
+| Native quantile levels | 0.1, 0.25, 0.5, 0.75, 0.9 |
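The layer counts in the table follow from the stated pattern. The sketch below rebuilds the sequence of layer types, assuming the "2 time, then 1 group" block simply repeats from the first layer to the last; `build_layer_pattern` is a hypothetical helper, not a function from `tfc-t0`.

```python
def build_layer_pattern(num_layers=24, time_layers_per_block=2):
    """List layer types for a stack that repeats [time, ..., time, group]."""
    block = ["time"] * time_layers_per_block + ["group"]
    if num_layers % len(block) != 0:
        raise ValueError("num_layers must be a multiple of the block length")
    return block * (num_layers // len(block))


pattern = build_layer_pattern()
print(pattern[:6])
# ['time', 'time', 'group', 'time', 'time', 'group']
print(pattern.count("time"), pattern.count("group"))
# 16 8
```

With 24 layers and a block of 3, the block repeats 8 times, which gives the 16 time-attention and 8 group-attention layers listed above.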
|
| 186 |
+
|
| 187 |
+
## Evaluation
|
| 188 |
+
|
| 189 |
+
`t0-alpha` is reported on the [GIFT-Eval leaderboard](https://huggingface.co/spaces/Salesforce/GIFT-Eval).
|
| 190 |
+
|
+| Benchmark | CRPS | MASE |
+| --- | ---: | ---: |
+| GIFT-Eval | 0.4941 | 0.7240 |
|
| 194 |
+
|
| 195 |
+
Users should also evaluate `t0-alpha` on their own historical backtests. Useful checks include quantile loss, CRPS, MASE, empirical quantile coverage, calibration, and breakdowns by frequency, horizon, domain, history length, and covariate availability.
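Two of the checks mentioned above, quantile (pinball) loss and empirical quantile coverage, are simple enough to compute by hand on a backtest. The sketch below uses plain Python lists and hypothetical function names; it is independent of the `tfc-t0[evaluation]` extra, whose API is not shown here.

```python
def pinball_loss(y_true, y_pred, level):
    """Mean quantile loss of predictions `y_pred` for quantile `level`."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by `level`, over-prediction by `1 - level`.
        total += max(level * diff, (level - 1.0) * diff)
    return total / len(y_true)


def empirical_coverage(y_true, y_pred):
    """Fraction of observations at or below the predicted quantile."""
    hits = sum(1 for y, q in zip(y_true, y_pred) if y <= q)
    return hits / len(y_true)


actuals = [10.0, 12.0, 9.0, 15.0]       # made-up held-out values
q90_forecast = [13.0, 13.0, 13.0, 13.0]  # made-up 0.9-quantile forecast
print(pinball_loss(actuals, q90_forecast, level=0.9))
print(empirical_coverage(actuals, q90_forecast))
```

For a well-calibrated 0.9 quantile, the empirical coverage over a long backtest should be close to 0.9; large gaps in either direction suggest the intervals are too narrow or too wide for that dataset.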
|
| 196 |
|
| 197 |
## 🧰 Public API
|
| 198 |
|
| 199 |
+
- `T0Forecaster`: the PyTorch module class.
|
| 200 |
+
- `Forecast`: return object encapsulating forecasted quantiles.
|
| 201 |
+
- `T0Config`: dataclass to configure the model.
|
| 202 |
+
|
| 203 |
+
## 🧬 Lineage and Attributions
|
| 204 |
+
|
| 205 |
+
`t0` builds on ideas from open-source forecasting models. We gratefully acknowledge:
|
| 206 |
+
|
| 207 |
+
- **Toto** by Datadog ([repo](https://github.com/DataDog/toto)) and **Chronos-2** by Amazon ([repo](https://github.com/amazon-science/chronos-forecasting)) for factorizing attention across the time and variate dimensions.
|
| 208 |
+
- **TiRex** by NXAI ([repo](https://github.com/NX-AI/tirex)) for contiguous patch masking.
|
| 209 |
+
|
| 210 |
+
Code-level attributions are listed in [`NOTICE`](NOTICE), all under Apache-2.0.
|
| 211 |
+
|
| 212 |
+
## Environmental Impact
|
| 213 |
+
|
| 214 |
+
Training compute and carbon emissions are not currently reported.
|
| 215 |
|
| 216 |
## Citation
|
| 217 |
|
|
|
|
| 226 |
|
| 227 |
## License
|
| 228 |
|
| 229 |
+
Apache-2.0. See [`LICENSE`](LICENSE) and [`NOTICE`](NOTICE).
|
| 230 |
+
|
| 231 |
+
## Contact
|
| 232 |
+
|
| 233 |
+
For issues, bug reports, or API questions, please use the GitHub issue tracker:
|
| 234 |
+
|
| 235 |
+
https://github.com/theforecastingcompany/tfc-t0/issues
|