fix: bug causes incorrect lookups into seasonality vector #74

Open: kevinroundy wants to merge 1 commit into master
kevinroundy commented Apr 23, 2025

The Predict function indexes the seasonality vector incorrectly: whether or not the training data ends on a complete season's period, we predict as if we were at the beginning of the season.

Consider the following case where the code behaves correctly:

trainLen = 70
period = 7 (i.e., length of one season, a week)

When we predict on the first day after the train window, day 71, we index the seasonals array as follows:
seasonals[(m-1)%period]
... where m=1, thus we index seasonals[0], the first day of the week. This is correct, because 70 days is exactly 10 full weeks.

But if the training data length is not evenly divisible by the seasonality's period, we still start our predictions at seasonals[0]. For instance, if we have 71 days of training data from the same series and we predict on day 72, we index the seasonals array with:
seasonals[(m-1)%period]
... where m=1, thus we index seasonals[0], the first day of the week, but this is not correct. Since 71 % 7 = 1, we should be looking at seasonals[1], the second day of the week. In fact, no matter how long the training data is, Predict always starts its forecasts at the first element of the season, even if the training data cuts off in the middle of a season.

The solution is to add the length of the training period to the modulo calculation so that we index into the correct point in the season.
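
To make the arithmetic concrete, here is a minimal Python sketch of the two index calculations. It is not the library's code: the function names and `train_len` are hypothetical, and it assumes that seasonals[0] lines up with the first training observation.

```python
# Illustrative sketch only: these helpers are hypothetical, not the library's API.
# Assumption: seasonals[0] corresponds to the first training observation.

def seasonal_index_original(m, period):
    """Index used before the fix: ignores where the training data ended."""
    return (m - 1) % period


def seasonal_index_fixed(train_len, m, period):
    """Index with the training length added to the modulo calculation."""
    return (train_len + m - 1) % period


if __name__ == "__main__":
    period = 7
    for train_len in (70, 71):
        m = 1  # first step after the training window
        print(
            f"train_len={train_len}: "
            f"original={seasonal_index_original(m, period)}, "
            f"fixed={seasonal_index_fixed(train_len, m, period)}"
        )
```

With 70 training days both calculations give index 0. With 71 training days the original still gives 0, while the adjusted one gives 1, the second day of the week.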
