Energy demand forecasting using time series

Accurate demand forecasting is crucial for businesses to make informed decisions, optimize inventory levels, and improve customer satisfaction. By leveraging time series models like ARIMA and Prophet, businesses can not only predict demand but also identify trends and patterns to develop proactive strategies.

Arima

To begin with, the dataset is prepared by loading the data and replacing the index with the corresponding date. Then, the dataset is split into training and testing subsets.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from prophet import Prophet
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.arima.model import ARIMA
plt.style.use('ggplot')
plt.style.use('fast')
df_arima = pd.read_csv("PJME_hourly.csv")
df_arima.Datetime = pd.DatetimeIndex(df_arima.Datetime)
df_arima = df_arima.reset_index(drop=True)
df_arima.index = pd.PeriodIndex(df_arima.Datetime, freq='H')
df_arima = df_arima.drop(df_arima.columns[0], axis=1)
df_arima
PJME_MW
Datetime
2002-12-31 01:00 26498.0
2002-12-31 02:00 25147.0
2002-12-31 03:00 24574.0
2002-12-31 04:00 24393.0
2002-12-31 05:00 24860.0
2018-01-01 20:00 44284.0
2018-01-01 21:00 43751.0
2018-01-01 22:00 42402.0
2018-01-01 23:00 40164.0
2018-01-02 00:00 38608.0
split_point = '1-Jan-2016'
train_arima = df_arima.loc[df_arima.index < split_point].copy()
test_arima= df_arima.loc[df_arima.index >= split_point].copy()

test_arima \
    .rename(columns={'PJME_MW': 'TEST SET'}) \
    .join(train_arima.rename(columns={'PJME_MW': 'TRAINING SET'}),
          how='outer') \
    .plot(figsize=(10, 5), title='Energy Consumption (MW)', style='.', ms=1)
plt.show()

train_test_arima

Now, the ARIMA model is fitted, defining the p,d, and q parameters:

model_arima = ARIMA(train_arima, order=(2,0,1))
model_arima_fit = model_arima.fit()
model_arima_fit.summary()

arima

The performance of the ARIMA model is evaluated on the testing set. The model’s predictive accuracy is then quantified using error metrics, resulting in a MAPE of 17%.

arima_forecast = model_arima_fit.predict(start=len(train_arima), end=len(df_arima)-1, dynamic=False)
arima_forecast = pd.DataFrame(arima_forecast)

# RMSE
np.sqrt(mean_squared_error(y_true=test_arima['PJME_MW'],
                   y_pred=arima_forecast['predicted_mean']))

6474.90208717094

#MAPE
def mape(y_true, y_pred): 
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

mape(y_true=test_arima['PJME_MW'],
                   y_pred=arima_forecast['predicted_mean'])

17.475612965761343

PROPHET

To enhance our predictions, we leverage the time series prediction tool, Prophet, developed by Facebook. To use Prophet, we need to format our columns to comply with the required naming conventions.

df = pd.read_csv("PJME_hourly.csv", index_col=[0],
                  parse_dates=[0])
df_train = df.loc[df.index < split_point].copy()
df_test = df.loc[df.index >= split_point].copy()

train_prophet = df_train.reset_index() \
    .rename(columns={'Datetime':'ds', 'PJME_MW':'y'})
test_prophet = df_test.reset_index() \
    .rename(columns={'Datetime':'ds', 'PJME_MW':'y'})

The model is instantiated, trained, and applied to make predictions on the testing set:

model = Prophet()
model.fit(train_prophet)
prophet_forecast = model.predict(test_prophet)
prophet_forecast.loc[:,['ds','yhat_lower','yhat_upper','trend']]
ds yhat_lower yhat_upper yhat trend
0 2016-01-01 00:00:00 25246.750720 34505.876842 30003.037051 31217.898672
1 2016-01-01 01:00:00 23586.638059 32278.979065 27986.257615 31217.867138
2 2016-01-01 02:00:00 21633.009369 31055.358356 26485.508878 31217.835604
3 2016-01-01 03:00:00 21036.939515 30026.875296 25613.179301 31217.804070
4 2016-01-01 04:00:00 21334.493617 30091.451316 25468.367381 31217.772536
22675 2018-08-02 20:00:00 23131.497863 62687.692651 41691.940934 30502.835532
22676 2018-08-02 21:00:00 21486.832495 61935.114453 40912.203071 30502.803998
22677 2018-08-02 22:00:00 20676.075026 61192.507108 39282.961743 30502.772464
22678 2018-08-02 23:00:00 17708.112800 58683.494409 37083.837882 30502.740930
22679 2018-08-03 00:00:00 15315.390736 57082.285941 34772.081648 30502.709396

Upon comparing the forecasts to the actual values, it can be observed that the model follows the behavior of the original time series. Furthermore, the components plot reveals the underlying patterns and trends for each of the main time units, including weekly, yearly, and hourly.

f, ax = plt.subplots(figsize=(15, 5))
ax.scatter(df_test.index, df_test['PJME_MW'], color='violet')
fig = model.plot(prophet_forecast, ax=ax)

prophet_prediction

fig = model.plot_components(prophet_forecast)
plt.show()

prophet_prediction

The evaluation of the Prophet model is performed using the RMSE and MAPE metrics. The model exhibits better performance compared to the ARIMA model, with a lower MAPE error of 15.8%. Indeed, a basic Prophet model is effective in predicting energy demand.

# RMSE
np.sqrt(mean_squared_error(y_true=df_test['PJME_MW'],
                   y_pred=prophet_forecast['yhat']))

6307.621116334434

#MAPE
mape(y_true=df_test['PJME_MW'],
                   y_pred=prophet_forecast['yhat'])

15.838310394639882

GitHub