Google's 200M-parameter time-series foundation model with 16k context

(github.com)

50 points | by codepawl 1 hour ago

7 comments

EmilStenstrom 1 hour ago
I somehow find the concept of a general time series model strange. How can the same model predict egg prices in Italy, and global inflation in a reliable way?
And how would you even use this model, given that there are no explanations that help you trust where the prediction comes from…
- teruakohatu 53 minutes ago
  What is not generally understood is that these models don’t predict egg prices or inflation in Italy.
  They decompose a time series into trends, seasonality and residuals. That’s what they are actually modelling.
  They cannot predict wars in the Middle East influencing inflation unless there are seasonal patterns.
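  For readers who haven't seen the decomposition idea before, here is a minimal sketch of a classical additive decomposition (trend + seasonality + residual) in plain Python. It illustrates the textbook technique only and makes no claim about what TimesFM does internally; the function name `decompose`, the odd-period restriction, and the toy data are all assumptions made up for this example.

```python
def decompose(xs, period):
    """Additive decomposition: xs[i] = trend[i] + seasonal[i] + residual[i].

    Simplification for this sketch: the period must be odd so a plain
    centered moving average of that width can be used as the trend.
    """
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    half = period // 2
    n = len(xs)

    # Trend: centered moving average over one full period (edges stay None).
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(xs[i - half:i + half + 1]) / period

    # Seasonality: average detrended value at each phase of the cycle.
    buckets = [[] for _ in range(period)]
    for i, t in enumerate(trend):
        if t is not None:
            buckets[i % period].append(xs[i] - t)
    means = [sum(b) / len(b) for b in buckets]
    centre = sum(means) / period  # force the seasonal component to sum to zero
    seasonal = [means[i % period] - centre for i in range(n)]

    # Residual: whatever trend and seasonality do not explain.
    residual = [None if t is None else xs[i] - t - seasonal[i]
                for i, t in enumerate(trend)]
    return trend, seasonal, residual


# Toy data: a linear trend plus a weekly pattern that sums to zero.
pattern = [3, 1, -2, -4, -1, 0, 3]
series = [0.5 * t + pattern[t % 7] for t in range(70)]
trend, seasonal, residual = decompose(series, period=7)
```

  On this toy series the residual is essentially zero because nothing unexpected happens in it. A one-off shock (the "war" case) would land in the residual, which is the part such a decomposition cannot forecast.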
  - cybrox 38 minutes ago
    Wars in the Middle East seem to have increasingly regular patterns tied to stock market opening hours, unfortunately.
  - visarga 42 minutes ago
    ARIMA and ARMA models
  - d--b 39 minutes ago
    The main issue is that people do use them to predict bitcoin prices intraday and that sort of thing.
    - nico 20 minutes ago
      Is it an issue because it works, or because it doesn’t? Or because it’s bitcoin?
      I genuinely want to know. Thank you
  - pasanhk 29 minutes ago
    [dead]
- lovelearning 37 minutes ago
  My understanding is that the synthetic training data helps capture abstract time-series patterns that are common in all domains.
  As they say in appendix 8:
  > We create the synthetic data to reflect common time-series patterns using traditional statistical models. We start with four simple times series patterns:
  > • Piece-wise linear trends (I), where the number of the piece-wise linear components is randomly chosen between 2 and 8.
  > • ARMA(p, q) (II), where 1 ≤ p, q ≤ 8 and the corresponding coefficients are generated from either a multivariate Gaussian or a uniform, then normalized.
  > • Seasonal patterns. In particular we create the sine (III) and the cosine (IV) waves of different random periods between 4 and max context length / 2 time-points and time delays.
  If there were no such underlying patterns in the class of all time-series data, then even the idea of traditional time-series models would be fundamentally misplaced.
  And since this is a transformer model, it also looks for patterns in the problem-specific input data at inference time, just like how the input context to an LLM influences its output's relevance.
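  To make the quoted recipe concrete, here is a rough sketch of generators for the four pattern types, using only the Python standard library. This is not the authors' code. The quoted ranges (2 to 8 linear pieces, 1 ≤ p, q ≤ 8, periods between 4 and max context length / 2) come from the passage above; everything else is an assumption for illustration, including the function names, the slope range, the uniform coefficient draw, the way the AR coefficients are normalized, and returning the four patterns separately rather than combining them.

```python
import math
import random


def piecewise_linear(n, rng):
    """Piece-wise linear trend with a random number of pieces in [2, 8]."""
    pieces = rng.randint(2, 8)
    cuts = sorted(rng.sample(range(1, n), pieces - 1))  # needs n >= 8
    bounds = [0] + cuts + [n]
    out, level = [], 0.0
    for start, end in zip(bounds, bounds[1:]):
        slope = rng.uniform(-1, 1)  # assumed slope range
        for _ in range(end - start):
            out.append(level)
            level += slope
    return out


def arma(n, rng):
    """ARMA(p, q) with 1 <= p, q <= 8 and Gaussian noise."""
    p, q = rng.randint(1, 8), rng.randint(1, 8)
    ar = [rng.uniform(-1, 1) for _ in range(p)]
    ma = [rng.uniform(-1, 1) for _ in range(q)]
    # Assumed normalization: scale AR coefficients so sum(|a|) = 0.9 < 1,
    # which is a sufficient condition for the process to stay bounded.
    scale = sum(abs(a) for a in ar) or 1.0
    ar = [0.9 * a / scale for a in ar]
    noise = [rng.gauss(0, 1) for _ in range(n)]
    out = []
    for t in range(n):
        value = noise[t]
        value += sum(a * out[t - 1 - i] for i, a in enumerate(ar) if t - 1 - i >= 0)
        value += sum(m * noise[t - 1 - j] for j, m in enumerate(ma) if t - 1 - j >= 0)
        out.append(value)
    return out


def seasonal(n, rng, max_context, wave):
    """Sine or cosine wave with a random period in [4, max_context / 2] and a random delay."""
    period = rng.uniform(4, max_context / 2)
    delay = rng.uniform(0, period)
    return [wave(2 * math.pi * (t - delay) / period) for t in range(n)]


rng = random.Random(0)
n = max_context = 64
patterns = {
    "piecewise_linear": piecewise_linear(n, rng),
    "arma": arma(n, rng),
    "sine": seasonal(n, rng, max_context, math.sin),
    "cosine": seasonal(n, rng, max_context, math.cos),
}
```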
- benob 46 minutes ago
  I would say:
  - decomposition: discover a more general form of Fourier transform to untangle the underlying factors
  - memorization: some patterns are recurrent in many domains such as power law
  - multitask: exploit cross-domain connections such as weather vs electricity
dash2 22 minutes ago
So the time series are provided with no context? It's just trained on lots of sets of numbers? Then you give it a new set of numbers and it guesses the rest, again with no context?
My guess as to how this would work: the machine will first guess from the data alone if this is one of the categories it has already seen/inferred (share prices, Google Trends cat searches, etc.). Then it'll output a plausible completion for the category.
That doesn't seem as if it will work well for any categories outside the training data. I would rather just use either a simple model (ARIMA or whatever) or a theoretically-informed model. But what do I know.
EmilStenstrom 1 hour ago
Here is the link to the blog post that actually describes what this is: https://github.com/google-research/timesfm?tab=readme-ov-fil...
- nels 53 minutes ago
  I think you meant to link this page: https://research.google/blog/a-decoder-only-foundation-model...
- refulgentis 56 minutes ago
  That takes me to the same content as the submission, a GitHub repo (Chrome on iOS)
  - rockwotj 53 minutes ago
    Probably the better link: https://research.google/blog/a-decoder-only-foundation-model...
    - akshayshah 52 minutes ago
      And https://arxiv.org/pdf/2310.10688 if you want the full paper.
  - Cyuonut 52 minutes ago
    I suppose they tried to link this: https://research.google/blog/a-decoder-only-foundation-model...
wiradikusuma 42 minutes ago
Also: https://github.com/Nixtla/nixtla and https://facebook.github.io/prophet/
ra 23 minutes ago
This has been around for a few months now. Has anyone built anything on it?
Foobar8568 1 hour ago
Somehow I missed that one. Are there any competitors to this?
I've always had difficulties with ML and time series; I'll need to try that out.
- rockwotj 49 minutes ago
  https://www.datadoghq.com/blog/datadog-time-series-foundatio...
  https://moment-timeseries-foundation-model.github.io/
  https://arxiv.org/abs/2403.07815
  A friend at work used one to predict when our CEO would post in Slack, which is very entertaining to see when it's correct.
jdthedisciple 23 minutes ago
Let me be blunt: Shannon would tell us that time forecasting is bullshit:
There is infinitely more entropy in the real world out there than any model can even remotely capture.
The world is not Minecraft.
- mikkom 11 minutes ago
  Yeah all weather forecasts are just magic