fix: bug causes incorrect lookups into seasonality vector #74
The Predict function indexes the seasonality vector incorrectly: whether or not the training data ends exactly at the end of a full season, we predict as if we were at the beginning of a season.
Consider the following case where the code behaves correctly:
trainLen = 70
period = 7 (i.e., the length of one season, here a week)
When we predict on the first day after the train window, day 71, we index the seasonals array as follows:
seasonals[(m-1)%period]
... where m=1, thus we index seasonals[0], or the first day of the week, and all is as it should be.
But if the training data length is not evenly divisible by the seasonality's period, we still start our predictions at seasonals[0]. For instance, if we have 71 days of training data with the same period, and we predict on day 72, we will index the seasonals array with:
seasonals[(m-1)%period]
... where m=1, thus we index seasonals[0], or the first day of the week, but this is not correct. We should be looking at the second day of the week. In fact, no matter how long the training period was, Predict will always start its predictions at the first element of the season, even if the training data cuts off in the middle of a season.
The solution is to add the length of the training period to the modulo calculation so that we index into the correct point in the season.
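To make the two cases above concrete, here is a minimal sketch of the index arithmetic. It is not the code from this PR: the helper names `buggyIndex` and `fixedIndex` and the parameter name `trainLen` are hypothetical, and Go is assumed only because the function is called Predict. It also assumes the training data starts on the first day of a season.

```go
package main

import "fmt"

// buggyIndex mirrors the lookup described above. It ignores where the
// training data stopped within the season. (Hypothetical helper.)
func buggyIndex(m, period int) int {
	return (m - 1) % period
}

// fixedIndex adds the training length to the offset before taking the
// modulo, so the lookup continues from where training ended.
// (Hypothetical helper; trainLen is an assumed parameter name.)
func fixedIndex(trainLen, m, period int) int {
	return (trainLen + m - 1) % period
}

func main() {
	period := 7
	m := 1 // first step after the training window
	for _, trainLen := range []int{70, 71} {
		fmt.Printf("trainLen=%d buggy=%d fixed=%d\n",
			trainLen, buggyIndex(m, period), fixedIndex(trainLen, m, period))
	}
}
```

With 70 training days both lookups agree on index 0. With 71 training days the original lookup still returns 0, while the adjusted one returns 1, the second day of the week.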