fix: bug causes incorrect lookups into seasonality vector #74

kevinroundy · 2025-04-23T20:38:11Z

The Predict function indexes the seasonality vector incorrectly. Whether or not the training data ended on a complete seasonal period, we predict as if we were at the beginning of the season.

Consider the following case where the code behaves correctly:

trainLen = 70
period = 7 (i.e., length of one season, here a week)

When we predict on the first day after the train window, day 71, we index the seasonals array as follows:
seasonals[(m-1)%period]
... where m=1, thus we index seasonals[0], or the first day of the week, and all is as it should be.

But if the training data length is not evenly divisible by the seasonality's period, we still start our predictions at seasonals[0]. For instance, if we have 71 days of training data from the same series and we predict day 72, we index the seasonals array with:
seasonals[(m-1)%period]
... where m=1, thus we index seasonals[0], or the first day of the week, but this is not correct. We should be looking at the second day of the week. In fact, no matter how long the training period was, Predict always starts its forecasts at the first element of the season, even if the training data cuts off in the middle of a season.

The solution is to add the length of the training period to the modulo calculation so that we index into the correct point in the season.
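
As a rough illustration of the arithmetic, here is a minimal Python sketch comparing the old and new index calculations. The exact expression used in the patch is not shown in this description, so `(train_len + m - 1) % period` is an assumed reading of "add the length of the training period to the modulo calculation". The function names are hypothetical, Python is used only for illustration, and the sketch assumes seasonals[0] lines up with the first training observation.

```python
def seasonal_index_old(m: int, period: int) -> int:
    """Index used before the fix: ignores where the training data ended."""
    return (m - 1) % period


def seasonal_index_new(train_len: int, m: int, period: int) -> int:
    """Assumed form of the fix: offset the forecast step by the training length."""
    return (train_len + m - 1) % period


if __name__ == "__main__":
    period = 7
    m = 1  # first forecast step after the training window
    for train_len in (70, 71):
        old = seasonal_index_old(m, period)
        new = seasonal_index_new(train_len, m, period)
        print(f"trainLen={train_len}: old index={old}, new index={new}")
```

With trainLen = 70 both calculations agree on index 0. With trainLen = 71 the old calculation still yields 0, while the offset version yields 1, the second day of the week.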

kevinroundy mentioned this pull request Apr 23, 2025

Please review PR's to fix Holt Winters bugs #75

Open
