Energy demand forecasting using time series

Accurate demand forecasting is crucial for businesses to make informed decisions, optimize inventory levels, and improve customer satisfaction. By leveraging time series models like ARIMA and Prophet, businesses can not only predict demand but also identify trends and patterns to develop proactive strategies.

Arima

To begin with, the dataset is prepared by loading the data and replacing the index with the corresponding date. Then, the dataset is split into training and testing subsets.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from prophet import Prophet
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.arima.model import ARIMA
plt.style.use('ggplot')
plt.style.use('fast')

df_arima = pd.read_csv("PJME_hourly.csv")
df_arima.Datetime = pd.DatetimeIndex(df_arima.Datetime)
df_arima = df_arima.reset_index(drop=True)
df_arima.index = pd.PeriodIndex(df_arima.Datetime, freq='H')
df_arima = df_arima.drop(df_arima.columns[0], axis=1)
df_arima

	PJME_MW
Datetime
2002-12-31 01:00	26498.0
2002-12-31 02:00	25147.0
2002-12-31 03:00	24574.0
2002-12-31 04:00	24393.0
2002-12-31 05:00	24860.0
…	…
2018-01-01 20:00	44284.0
2018-01-01 21:00	43751.0
2018-01-01 22:00	42402.0
2018-01-01 23:00	40164.0
2018-01-02 00:00	38608.0

split_point = '1-Jan-2016'
train_arima = df_arima.loc[df_arima.index < split_point].copy()
test_arima= df_arima.loc[df_arima.index >= split_point].copy()

test_arima \
    .rename(columns={'PJME_MW': 'TEST SET'}) \
    .join(train_arima.rename(columns={'PJME_MW': 'TRAINING SET'}),
          how='outer') \
    .plot(figsize=(10, 5), title='Energy Consumption (MW)', style='.', ms=1)
plt.show()

train_test_arima

Now, the ARIMA model is fitted, defining the p,d, and q parameters:

model_arima = ARIMA(train_arima, order=(2,0,1))
model_arima_fit = model_arima.fit()
model_arima_fit.summary()

arima

The performance of the ARIMA model is evaluated on the testing set. The model’s predictive accuracy is then quantified using error metrics, resulting in a MAPE of 17%.

arima_forecast = model_arima_fit.predict(start=len(train_arima), end=len(df_arima)-1, dynamic=False)
arima_forecast = pd.DataFrame(arima_forecast)

# RMSE
np.sqrt(mean_squared_error(y_true=test_arima['PJME_MW'],
                   y_pred=arima_forecast['predicted_mean']))

6474.90208717094

#MAPE
def mape(y_true, y_pred): 
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

mape(y_true=test_arima['PJME_MW'],
                   y_pred=arima_forecast['predicted_mean'])

17.475612965761343

PROPHET

To enhance our predictions, we leverage the time series prediction tool, Prophet, developed by Facebook. To use Prophet, we need to format our columns to comply with the required naming conventions.

df = pd.read_csv("PJME_hourly.csv", index_col=[0],
                  parse_dates=[0])
df_train = df.loc[df.index < split_point].copy()
df_test = df.loc[df.index >= split_point].copy()

train_prophet = df_train.reset_index() \
    .rename(columns={'Datetime':'ds', 'PJME_MW':'y'})
test_prophet = df_test.reset_index() \
    .rename(columns={'Datetime':'ds', 'PJME_MW':'y'})

The model is instantiated, trained, and applied to make predictions on the testing set:

model = Prophet()
model.fit(train_prophet)
prophet_forecast = model.predict(test_prophet)
prophet_forecast.loc[:,['ds','yhat_lower','yhat_upper','trend']]

	ds	yhat_lower	yhat_upper	yhat	trend
0	2016-01-01 00:00:00	25246.750720	34505.876842	30003.037051	31217.898672
1	2016-01-01 01:00:00	23586.638059	32278.979065	27986.257615	31217.867138
2	2016-01-01 02:00:00	21633.009369	31055.358356	26485.508878	31217.835604
3	2016-01-01 03:00:00	21036.939515	30026.875296	25613.179301	31217.804070
4	2016-01-01 04:00:00	21334.493617	30091.451316	25468.367381	31217.772536
…	…	…	…	…	…
22675	2018-08-02 20:00:00	23131.497863	62687.692651	41691.940934	30502.835532
22676	2018-08-02 21:00:00	21486.832495	61935.114453	40912.203071	30502.803998
22677	2018-08-02 22:00:00	20676.075026	61192.507108	39282.961743	30502.772464
22678	2018-08-02 23:00:00	17708.112800	58683.494409	37083.837882	30502.740930
22679	2018-08-03 00:00:00	15315.390736	57082.285941	34772.081648	30502.709396

Upon comparing the forecasts to the actual values, it can be observed that the model follows the behavior of the original time series. Furthermore, the components plot reveals the underlying patterns and trends for each of the main time units, including weekly, yearly, and hourly.

f, ax = plt.subplots(figsize=(15, 5))
ax.scatter(df_test.index, df_test['PJME_MW'], color='violet')
fig = model.plot(prophet_forecast, ax=ax)

prophet_prediction

fig = model.plot_components(prophet_forecast)
plt.show()

prophet_prediction

The evaluation of the Prophet model is performed using the RMSE and MAPE metrics. The model exhibits better performance compared to the ARIMA model, with a lower MAPE error of 15.8%. Indeed, a basic Prophet model is effective in predicting energy demand.

# RMSE
np.sqrt(mean_squared_error(y_true=df_test['PJME_MW'],
                   y_pred=prophet_forecast['yhat']))

6307.621116334434

#MAPE
mape(y_true=df_test['PJME_MW'],
                   y_pred=prophet_forecast['yhat'])

15.838310394639882

GitHub