There is no single answer to how much data you need to build and train a machine learning model, but the suggestions below may help you avoid some common mistakes.
Before jumping into building time series machine learning models, ask yourself a few questions.
What's the granularity of my data, e.g., seconds, hours, years?
A year’s worth of data can imply 365 data points, 52 data points, 12 data points, or even a single data point depending on how the data was recorded, and all are equally valid.
What are my underlying assumptions about my data?
If you expect your data to be annually seasonal, make sure you have at least 365 days, 52 weeks, or 12 months of data, plus some additional data points for testing. Note how important the granularity of the data is in this scenario.
How far out am I trying to predict?
If you’re trying to predict 12 months into the future, you should have at least 12 months' worth of data (a data point for every month) to train on before you can expect trustworthy results.
A decent model should always have more observations than parameters. For time series, this means that the submitted data should have more observations than the period of the longest expected seasonality.
If you have daily sales data and you expect that it exhibits annual seasonality, you should have more than 365 (days in a year) data points to train a successful model. If you have hourly data and you expect your data exhibits weekly seasonality, you should have more than 7 (days in a week) multiplied by 24 (hours in a day) = 168 observations to train a model.
This is the bare minimum number of points needed to train a time series model, but if you want to test how accurately your model performs, you need more data. Read Training Set vs. Test Set to learn what is needed to test a model.
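The seasonality arithmetic above can be sketched in a few lines of Python. This is only an illustration: the lookup table and the function names (seasonal_period, has_enough_history) are hypothetical helpers invented for this example, not part of any product or library.

```python
# Hypothetical lookup: number of data points in one full seasonal cycle,
# keyed by (granularity of the data, expected seasonality).
POINTS_PER_CYCLE = {
    ("hourly", "daily"): 24,
    ("hourly", "weekly"): 7 * 24,
    ("daily", "weekly"): 7,
    ("daily", "annual"): 365,
    ("weekly", "annual"): 52,
    ("monthly", "annual"): 12,
}


def seasonal_period(granularity, seasonality):
    """Return how many observations make up one seasonal cycle."""
    return POINTS_PER_CYCLE[(granularity, seasonality)]


def has_enough_history(n_observations, granularity, seasonality):
    """True when the series is longer than one full seasonal cycle.

    This checks only the bare minimum for training; data held back
    for testing has to come on top of it.
    """
    return n_observations > seasonal_period(granularity, seasonality)


if __name__ == "__main__":
    print(seasonal_period("hourly", "weekly"))
    print(has_enough_history(365, "daily", "annual"))
    print(has_enough_history(400, "daily", "annual"))
```

With exactly 365 daily points the check fails, because the guidance asks for more than one full cycle, not exactly one.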
Our Data Science team suggests, as a general rule of thumb, that the number of observations should be proportional to 1/d^p, where p is the number of features and d is the maximum spacing between consecutive (neighboring) data points after each feature is scaled to the range 0 to 1. Another general rule of thumb is that a data set with N observations can support up to N-1 features if the features are uncorrelated. If the features are strongly correlated, you should have no more than √N features.
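To make these rules of thumb concrete, here is a rough Python sketch. It rests on one interpretation that the text does not spell out: d is taken as the largest gap between neighboring sorted values in any single feature, after min-max scaling that feature to the range 0 to 1. The function names are hypothetical, and the result of density_scale is only a proportionality factor, not an exact row count.

```python
from math import isqrt


def max_scaled_gap(columns):
    """Largest gap between neighboring sorted values across all columns,
    after min-max scaling each column to [0, 1] (assumed meaning of d)."""
    largest = 0.0
    for values in columns:
        lo, hi = min(values), max(values)
        if hi == lo:
            continue  # constant column: no spread to measure
        scaled = sorted((v - lo) / (hi - lo) for v in values)
        gap = max(b - a for a, b in zip(scaled, scaled[1:]))
        largest = max(largest, gap)
    return largest


def density_scale(d, p):
    """The quantity 1 / d**p that the number of observations should scale with."""
    if d <= 0:
        raise ValueError("d must be positive")
    return 1.0 / d ** p


def max_features(n_observations, strongly_correlated):
    """Upper bound on features: floor(sqrt(N)) if strongly correlated, else N - 1."""
    if strongly_correlated:
        return isqrt(n_observations)
    return n_observations - 1


if __name__ == "__main__":
    features = [[0, 5, 10], [1, 2, 5]]  # two made-up feature columns
    d = max_scaled_gap(features)
    print(d, density_scale(d, p=len(features)))
    print(max_features(100, strongly_correlated=False))
    print(max_features(100, strongly_correlated=True))
```

Because p appears as an exponent, the factor grows quickly as features are added, which is why sparse data with many features is hard to model.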
For the non-data scientists of the world, the more practical suggestion is to have 10 times as many observations as features.
For example, a data set with 3 feature columns, plus 1 additional target column that you are trying to predict, should contain approximately 30 rows (observations).
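The 10x rule is simple enough to check programmatically before training. The sketch below is illustrative only: recommended_rows and meets_rule_of_thumb are hypothetical names, and the code assumes the target column is counted among the columns of the data set but not among the features.

```python
def recommended_rows(n_features, rows_per_feature=10):
    """Approximate number of observations suggested by the 10x rule of thumb."""
    return n_features * rows_per_feature


def meets_rule_of_thumb(n_rows, n_columns, n_targets=1):
    """Check a data set's shape against the rule.

    n_columns counts every column, including the target(s), so the
    number of features is n_columns - n_targets.
    """
    n_features = n_columns - n_targets
    return n_rows >= recommended_rows(n_features)


if __name__ == "__main__":
    # 3 feature columns + 1 target column, as in the example above.
    print(recommended_rows(3))
    print(meets_rule_of_thumb(n_rows=30, n_columns=4))
    print(meets_rule_of_thumb(n_rows=29, n_columns=4))
```

The rows_per_feature parameter is exposed so a stricter or looser multiple can be tried, but 10 is the value suggested in the text.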