
Intro

In the first post we prepared the data.

In the second post we created trajectories with varying window sizes, and then obtained the nearest neighbors under different distance functions.

Now we will use these nearest neighbors for forecasting. There are endless options for making forecasts with nearest neighbors. Some of them are:

  1. Taking the mean of the k closest neighbors

  2. Taking a weighted mean of the k closest neighbors. Weights can be determined
     - according to their order of closeness
     - according to their distance to the trajectory of interest

  3. Fitting a linear regression model. A single global linear regression would be too general (as we will show), so running regression models on subsets is another option.
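The first two options can be sketched as follows. This is a minimal illustration with hypothetical data, assuming that for each neighbor trajectory we have already looked up the value that followed it, ordered from closest to farthest:

```python
# Sketch of forecasting from the k nearest neighbors (options 1 and 2).
# `neighbor_next` holds, for each neighbor trajectory, the value that
# followed it, ordered from closest to farthest neighbor.

def mean_forecast(neighbor_next, k):
    """Option 1: plain mean of the k closest neighbors' next values."""
    closest = neighbor_next[:k]
    return sum(closest) / len(closest)

def weighted_mean_forecast(neighbor_next, distances, k, eps=1e-9):
    """Option 2: weight each neighbor by the inverse of its distance
    to the trajectory of interest, so closer neighbors count more."""
    closest = neighbor_next[:k]
    weights = [1.0 / (d + eps) for d in distances[:k]]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, closest)) / total

# Hypothetical example: next values of the 5 nearest neighbors and
# their distances to the current trajectory.
nxt = [10.0, 12.0, 11.0, 15.0, 9.0]
dst = [0.5, 1.0, 1.5, 2.0, 2.5]
print(mean_forecast(nxt, 3))             # mean of 10, 12, 11 -> 11.0
print(weighted_mean_forecast(nxt, dst, 3))
```

Weighting by rank instead of raw distance would simply replace the inverse-distance weights with, say, `1 / (rank + 1)`.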

Before making forecasts, we first need to establish benchmark results.

Benchmark Results

Naive

The most straightforward forecast is to use the previous observation as the forecast. This can also be called a random walk forecast, because it assumes every new step differs from the previous one only by noise.
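A minimal sketch of the naive forecast, with hypothetical values:

```python
# Naive (random walk) forecast: the forecast for every future step
# is simply the last observed value.

def naive_forecast(series, horizon=1):
    """Repeat the last observation `horizon` steps ahead."""
    return [series[-1]] * horizon

series = [3.2, 3.5, 3.4, 3.9]             # hypothetical observations
print(naive_forecast(series, horizon=3))  # -> [3.9, 3.9, 3.9]
```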

Autoregressive Models

Autoregressive models are the first that come to mind in time-series forecasting, so we fitted AR and ARIMA models as a performance benchmark. Going into their details is beyond the scope of this post; we only show some of the results.
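To illustrate the idea, here is a least-squares AR(p) fit in NumPy. This is only a sketch, not the routine actually used for the benchmarks, and the series below is a hypothetical, noise-free example:

```python
import numpy as np

# Sketch of an AR(p) model fitted by ordinary least squares:
# y_t = c + phi_1 * y_{t-1} + ... + phi_p * y_{t-p} + noise.

def fit_ar(series, p):
    """Return [intercept, phi_1, ..., phi_p] estimated by OLS."""
    y = np.asarray(series, dtype=float)
    n = len(y)
    # Column i holds lag i+1: y_{t-(i+1)} for t = p, ..., n-1.
    X = np.column_stack([y[p - i - 1:n - i - 1] for i in range(p)])
    X = np.column_stack([np.ones(n - p), X])   # add intercept column
    coeffs, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coeffs

def ar_one_step(series, coeffs):
    """One-step-ahead forecast from the last p observations."""
    p = len(coeffs) - 1
    lags = np.asarray(series[-p:][::-1], dtype=float)  # most recent first
    return coeffs[0] + lags @ coeffs[1:]

# Hypothetical deterministic example: y_t = 1 + 0.5 * y_{t-1}.
demo = [10.0]
for _ in range(19):
    demo.append(1 + 0.5 * demo[-1])
coeffs = fit_ar(demo, p=1)
print(coeffs)  # recovers [1.0, 0.5] up to floating point
```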

Results

Benchmark Results (MAE / MAPE)

Model           Station 1   Station 2   Station 3
Naive           1           1           1
AR (9)          1           1           1
ARIMA           1           1           1
Simpler ARIMA   1           1           1


Oddly, ARIMA turns out worse than the naive forecast even on the training set; we will look into this later.

Historical Similarity Forecasting

Mean

Analysing Results

Multi-step Results

We can see that in multi-step ahead forecasting, historical similarity methods substantially outperform the benchmark models.