Abstract

Electricity price forecasting holds a very important position in the electricity market. Inaccurate price forecasts may cause energy waste and management chaos in the electricity market. However, electricity price forecasting has always been regarded as one of the greatest challenges in the electricity market because the price exhibits high volatility, which makes it difficult to forecast. This paper proposes artificial intelligence optimization combination forecasting models based on preprocessed data, called "chaos particle swarm optimization (CPSO) weight-determined combination models." These models allow the weights of the combined models to take any real value, including negative values, without requiring the weights to sum to 1. In the proposed models, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is used to identify outliers, and the outliers are replaced by new data produced by a linear interpolation function. The proposed CPSO weight-determined combination models are then used to forecast the future electricity price. In the case study, electricity price data from South Australia are used for simulation. The results indicate that, with the weights of the combined models allowed to take any real value, the proposed combination models always provide adaptive, reliable, and comparatively accurate forecast results in comparison to traditional combination models.

1. Introduction

Electricity has been regarded as a unique commodity in the energy market under the reformation of the energy economy because it is hard to store and closely related to people's livelihoods. The electricity price plays an important role in the power market because it balances market demand and market supply [1]. The future price of electricity is an indicator of market demand for electricity generators as well as of the profitability of the electricity market. In addition, the nonstorable nature of electricity requires maintaining a constant balance between demand and supply. Therefore, an accurate electricity price forecasting method plays an important role in the electricity industry: not only can it help generators and decision-makers to generate more profit, but it can also help to reduce electricity waste. However, the price of electricity is affected by many factors, including market demand, network transmission, weather, and many other uncertain factors [2]. Thus, the price is highly volatile, and abnormal jump points produce abnormal price volatility. These abnormal price points are also called price spikes, and they make forecasting the price of electricity a challenge.

The common models used to forecast the price of electricity are classified as time series models and artificial intelligence based models [3]. Time series models, including the ARIMA and GARCH models, are among the most widely used techniques and are regarded as efficient ways to forecast the price of electricity [4]; some time series models have been developed specifically for electricity price forecasting [2]. Artificial intelligence based models, including neural network and data-mining models, are regarded as the most popular techniques applied in electricity price forecasting [5]. Fuzzy inference systems have also been applied in electricity price forecasting, and the desired results have been obtained [6]. Serinaldi introduced the application of generalized additive models for location, scale, and shape (GAMLSS), a dynamic distribution model that has proved to be flexible in electricity price forecasting [5].

However, the models listed above are all based on the original data without any preprocessing of outliers. Electricity prices can change suddenly and make large jumps to extreme levels; this phenomenon may be caused by unexpected increases in demand, an unexpected supply shortage, or failures in the transmission network [7]. The extreme price values are also called "price spikes," which negatively affect forecasting accuracy because many forecasting models are sensitive to extreme observations [5]. Thus, the detection of price spikes is a very important issue in practical situations. In this paper, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is used to identify and remove the electricity price spikes. A linear interpolation function is used to produce new points that replace the price spikes, yielding the preprocessed data set. Furthermore, forecasting models are established from both the original data and the preprocessed data, and the forecasting results are compared in this study. The simulation results show that the forecasts built using the preprocessed data perform better than those built using the original data.

It is widely accepted that combination forecasts can achieve desirable results [8, 9]. In several research studies, combination forecasts have generally been shown to have outstanding performance in comparison to single forecasts [10]. Because the main idea of a combination forecast is assigning weights to the combined forecasts according to their contributions, it is an important task to select a reasonable combination method. Traditional combination methods place restrictions on the weights: negative values are not allowed and the weights must sum to 1. As a result, traditional combination methods can have low forecasting accuracy. The CPSO weight-determined combination method allows the weights of the combined models to take any real value, including negative values, which overcomes the shortcoming of traditional combination models caused by the nonnegativity constraint on the weights. Furthermore, the CPSO weight-determined combination method inherits the advantages of an artificial intelligence algorithm in flexibly searching for the best solutions. Several combination models are compared, and their forecasting results are analyzed carefully in this paper. The results indicate that CPSO weight-determined combination models outperform traditional combination models.

This paper is organized as follows: Section 1 gives a brief introduction of this paper; Section 2 proposes theories of individual forecasts and combination methods; Section 3 illustrates the electricity price forecasting of South Australia (SA) and contains the analysis of the forecasting results and a comparison of the forecasting models. Lastly, Section 4 concludes with the contributions of this paper.

2. Materials and Methods

Three individual forecasting models, the autoregressive integrated moving average (ARIMA) model, the back propagation neural network (BPNN) model, and the Kalman filtering (KF) model, are chosen for this study. Each of these models has its own advantages in electricity price forecasting. The ARIMA model can analyze the linear relationship of variables and implement various exponential smoothing models [11]. The BPNN model is an efficient way to analyze nonlinear data [12], and the KF model provides an efficient computational solution using the least squares method with only a minor computational cost and with easy adaptation to any alteration of the observations [13]. However, the ARIMA model is not efficient for every data set [11]; the global solution determined using the BPNN model depends on the training of its nonlinear functions, which may cause inaccurate prediction results [14]; and the KF model cannot always produce desirable forecasting results [15]. As a widely accepted way to improve forecasting accuracy, combination methods are proposed in this paper to combine the individual forecasts.

2.1. Autoregressive Integrated Moving Average (ARIMA) Model

The ARIMA model, developed by Box and Jenkins and also called the "Box-Jenkins" model, has been widely used in many areas over the last several decades [11]. This model can capture the linear relationship between different variables in the real world [16]. An ARIMA($p, d, q$) model can be expressed as follows:

$$\phi(B)(1 - B)^d y_t = \theta(B)\varepsilon_t, \quad (1)$$

where $y_t$ is the time series value at time $t$, and $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$ are the coefficient polynomials in the lag operator (or back operator) $B$, often referred to as the autoregressive and moving average polynomials, respectively. In addition, $p$, $d$, and $q$ are integer numbers denoting the autoregressive order, the degree of differencing, and the moving average order. Finally, $\varepsilon_t$ are the random errors at time $t$, which are assumed to be independently and identically distributed with mean 0 and variance $\sigma^2$. In an ARIMA model, the future time series value is regarded as a linear function of several past observations and random errors [11]. The form of (1) can be written as (2), from which it is easy to obtain the future series from the past values:

$$y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}, \quad (2)$$

where $y_t$ denotes the (differenced, when $d > 0$) series. To obtain accurate forecasting results, three phases must be performed [16]:
(1) identify the structure of the model;
(2) estimate the parameters of the model;
(3) establish the model and forecast the future series.
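As a brief illustration of these three phases, the sketch below fits an ARIMA model to a short hypothetical half-hourly price series using Python's statsmodels library (the paper itself does not specify an implementation); the order (2, 1, 1) and the data are chosen purely for demonstration.

```python
# Minimal ARIMA forecasting sketch (illustrative only; order and data are hypothetical).
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical half-hourly electricity prices (training window).
prices = np.array([35.2, 36.1, 38.4, 41.0, 39.7, 37.5, 36.8, 40.2,
                   44.6, 43.1, 41.8, 39.9, 38.2, 37.0, 36.4, 35.9])

# Phases 1-2: identify the structure (p, d, q) and estimate its parameters.
model = ARIMA(prices, order=(2, 1, 1))
fitted = model.fit()

# Phase 3: forecast the next few half-hourly prices.
forecast = fitted.forecast(steps=4)
print(forecast)
```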

2.2. Back Propagation Neural Network (BPNN) Model

An artificial neural network is a nonlinear model composed of three layers: an input layer, one or more hidden layers, and one output layer. A large number of nodes connect these layers through different weights. The BPNN model is a type of multilayer feed-forward neural network with a wide variety of applications and is one of the artificial intelligence models. Input vectors and the corresponding reference vectors are used to train a network until it can approximate a function and associate input vectors with specific output vectors [12]. In this paper, the Hecht-Nielsen method is used to determine the number of hidden layer nodes [13], which means that when the number of input layer nodes is $n$, the number of hidden layer nodes is chosen as $2n + 1$. The structure of the BPNN is shown in Figure 1 and consists of three layers: the input layer, the hidden layer, and the output layer.

Before training the network, it is necessary to process the data. To ensure that the input vectors are compatible when there are significant differences in their magnitudes, the input vector is determined by normalizing each input value as follows [14, 17]:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \quad (3)$$

where $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the input data, respectively, and $x$ is the real value of each vector element. Subsequent steps are then required to train a BP network model for application in electricity price forecasting. Figure 1 shows the topology of the BP network.

The network is initialized by assigning the weights $w_{ij}$, $v_{jk}$, and the threshold values randomly. The input set is then preprocessed by (3), and the output of the hidden layer is calculated by the hidden layer function:

$$h_j = f\left(\sum_{i=1}^{n} w_{ij} x_i - \theta_j\right), \quad (4)$$

where $h_j$ is the output of the $j$th hidden layer node, $\theta_j$ is its threshold, and $f$ represents the activation function of a node, which is usually a sigmoid function.

The outputs of the output layer are calculated with the following form:

$$\text{output}_k = g\left(\sum_{j} v_{jk} h_j + b_k\right), \quad (5)$$

where $b_k$ represents the bias of the neuron, $\text{output}_k$ represents the output data of the network, and $g$ stands for the activation function of the output layer node [14].
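The following minimal NumPy sketch illustrates the normalization in (3) and a single forward pass through (4) and (5); the layer sizes, random weights, and input values are illustrative assumptions rather than the network actually used in the paper.

```python
# Minimal BPNN forward-pass sketch (NumPy); layer sizes, weights, and data are illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical input vector of past prices and min-max normalization, as in (3).
x = np.array([42.0, 39.5, 45.3, 41.1])
x_norm = (x - x.min()) / (x.max() - x.min())

n_in, n_hidden, n_out = x.size, 2 * x.size + 1, 1   # Hecht-Nielsen rule for hidden nodes
rng = np.random.default_rng(0)

# Random initialization of weights and thresholds, as described above.
W1, b1 = rng.standard_normal((n_hidden, n_in)), rng.standard_normal(n_hidden)
W2, b2 = rng.standard_normal((n_out, n_hidden)), rng.standard_normal(n_out)

hidden = sigmoid(W1 @ x_norm + b1)      # hidden-layer outputs, cf. (4)
output = sigmoid(W2 @ hidden + b2)      # output-layer value, cf. (5)
print(output)
```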

2.3. Kalman Filtering (KF) Model

Kalman filters (KF) were proposed by Kalman in 1960. These filters are dynamic procedures for optimal time series estimation: observations are recursively combined with recent forecast values using weights that minimize the corresponding biases [15]. The discrete procedure is stated by the following state equation (6) and output equation (7):

$$x_{t+1} = A x_t + w_t, \quad (6)$$

$$y_t = H x_t + v_t, \quad (7)$$

where $t$ stands for the time, $x_t$ is the actual state at time $t$, and $A$ and $H$ are the coefficient matrices that must be determined before the model is applied in the forecasting system. $y_t$ represents the output variable vector. $w_t$ and $v_t$ are the process noise series and measurement noise series, respectively, and they are assumed to be white Gaussian noise [18].

The KF model gives a dynamic estimation of the state $x_t$ based on the observation value $y_t$ along with time $t$. The present estimate of the state is based on the previous estimates $\hat{x}_{t-1|t-1}$ and $P_{t-1|t-1}$, and the relationship can be given as

$$\hat{x}_{t|t-1} = A \hat{x}_{t-1|t-1}, \quad (8)$$

$$P_{t|t-1} = A P_{t-1|t-1} A^{T} + Q. \quad (9)$$

As the new observation value $y_t$ becomes known, the estimate of the state is updated to be

$$\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \left(y_t - H \hat{x}_{t|t-1}\right), \quad (10)$$

where

$$K_t = P_{t|t-1} H^{T} \left(H P_{t|t-1} H^{T} + R\right)^{-1}. \quad (11)$$

The final estimate of the error covariance is

$$P_{t|t} = \left(I - K_t H\right) P_{t|t-1}. \quad (12)$$

Equations (8)–(12) provide the detailed updating procedure of the Kalman filter algorithm, where $K_t$ is the forecasting gain, $P$ is the forecasting covariance, and $Q$ and $R$ are the covariances of the process and measurement noise. The ARIMA($p, d, q$) model is used to estimate the state transition in this paper, and the observation equation and observation matrix are specified in the corresponding state-space form [15].
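A minimal scalar sketch of the recursion in (8)–(12) is given below; the coefficient values A, H, Q, and R and the price observations are illustrative assumptions, not the state-space model fitted in the paper.

```python
# Minimal scalar Kalman filter sketch; A, H, Q, R and the data are illustrative only.
import numpy as np

def kalman_filter(observations, A=1.0, H=1.0, Q=1e-2, R=1.0, x0=0.0, P0=1.0):
    """Recursively estimate the state from noisy observations (scalar case)."""
    x_est, P = x0, P0
    estimates = []
    for y in observations:
        # Prediction step, cf. (8)-(9).
        x_pred = A * x_est
        P_pred = A * P * A + Q
        # Gain and update step, cf. (10)-(12).
        K = P_pred * H / (H * P_pred * H + R)
        x_est = x_pred + K * (y - H * x_pred)
        P = (1 - K * H) * P_pred
        estimates.append(x_est)
    return np.array(estimates)

prices = np.array([35.2, 36.1, 38.4, 41.0, 39.7, 37.5])   # hypothetical data
print(kalman_filter(prices))
```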

2.4. Traditional Combination Method

In general, the forecasting output of a combination forecasting model based on $m$ component forecasts $\hat{y}_t^{1}, \hat{y}_t^{2}, \ldots, \hat{y}_t^{m}$ is given by the following form:

$$\hat{y}_t^{c} = f\left(\hat{y}_t^{1}, \hat{y}_t^{2}, \ldots, \hat{y}_t^{m}; W\right),$$

where $f$ is a combination function and $W = (w_1, w_2, \ldots, w_m)$ is a weight parameter vector [8]. In this study, two methods are chosen to determine the parameter vector $W$.

Let $\hat{y}_t^{i}$ denote the forecasting output of the $i$th individual forecasting model at time point $t$ for the time series $y_t$. The combination forecasting output at time $t$ can then be denoted in the following form:

$$\hat{y}_t^{c} = \sum_{i=1}^{m} w_i \hat{y}_t^{i},$$

where $\hat{y}_t^{c}$ is the combination model's forecasting output, $m$ is the number of component forecasting models, and $w_i$ is the weight of the $i$th individual forecasting model. Then, the constraints for $w_i$ must be considered. In conventional research studies, the constraints must meet the following requirements to ensure that the combined model is reasonable [19]:

$$\sum_{i=1}^{m} w_i = 1, \quad w_i \geq 0, \; i = 1, 2, \ldots, m.$$

To estimate the performance of the combined models, the key issue is to adjust the weights of the component forecasts. There are many methods for obtaining reasonable weight values, and the most commonly used is the weighted average (WA) method. Let $e_t^{i} = y_t - \hat{y}_t^{i}$ denote the residual of the $i$th individual forecasting model at time $t$. The residual of the combined model is then

$$e_t^{c} = y_t - \hat{y}_t^{c} = \sum_{i=1}^{m} w_i e_t^{i}.$$

Therefore, the WA method minimizes the mean absolute percentage error (MAPE) of the output of the combination forecast to obtain the optimal weights:

$$\min_{w_1, \ldots, w_m} \; \frac{1}{N} \sum_{t=1}^{N} \left| \frac{y_t - \hat{y}_t^{c}}{y_t} \right| \times 100\% \quad \text{s.t.} \quad \sum_{i=1}^{m} w_i = 1, \; w_i \geq 0.$$
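To make the WA method concrete, the sketch below estimates constrained weights by minimizing the MAPE of the combined forecast with scipy.optimize; the three component forecasts and the actual prices are hypothetical values used only for illustration.

```python
# Traditional weighted-average combination sketch; forecasts and prices are illustrative.
import numpy as np
from scipy.optimize import minimize

actual = np.array([40.0, 42.5, 39.8, 41.2, 43.0])
forecasts = np.array([[39.0, 43.0, 40.5, 40.8, 42.1],    # e.g., ARIMA output
                      [41.2, 42.0, 38.9, 41.9, 44.2],    # e.g., BPNN output
                      [40.5, 42.8, 39.5, 41.0, 42.7]])   # e.g., KF output

def mape(weights):
    combined = weights @ forecasts
    return np.mean(np.abs((actual - combined) / actual)) * 100

m = forecasts.shape[0]
constraints = {"type": "eq", "fun": lambda w: np.sum(w) - 1.0}   # weights sum to 1
bounds = [(0.0, 1.0)] * m                                        # no negative weights
result = minimize(mape, x0=np.full(m, 1.0 / m), bounds=bounds, constraints=constraints)
print(result.x, result.fun)
```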

2.5. The Proposed CPSO Weight-Determined Combination Method

Particle swarm optimization (PSO), developed by Kennedy and Eberhart, is a swarm intelligence algorithm inspired by the cooperation and communication of a swarm of birds or fish looking for food [20–22].

2.5.1. CPSO Algorithm

Chaos is a common phenomenon that consists of unstable dynamic behavior that is sensitive to initial conditions. However, it contains an inherent regularity as well as randomness, so the ergodicity and regularity of chaotic variables can be used to optimize a search. Recently, chaos series have replaced random series in the development of the chaos optimization algorithm (COA), which has achieved good results [23]. The logistic equation is a chaotic system, and its chaotic sequence is expressed as follows:

$$z_{k+1} = \mu z_k (1 - z_k), \quad z_k \in (0, 1),$$

where $\mu$ is the control parameter (the system is fully chaotic when $\mu = 4$) and $z_k$ is the value of the chaotic sequence at iteration $k$.
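A short sketch of the logistic chaotic sequence is shown below; the initial value and sequence length are arbitrary illustrative choices.

```python
# Logistic-map chaotic sequence sketch; mu = 4 gives fully chaotic behavior.
def logistic_sequence(z0=0.37, mu=4.0, length=10):
    z, seq = z0, []
    for _ in range(length):
        z = mu * z * (1.0 - z)
        seq.append(z)
    return seq

print(logistic_sequence())
```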

The chaos particle swarm optimization (CPSO) algorithm is developed based on the COA: it replaces the random series with a chaos series to avoid premature convergence [24]. The CPSO algorithm integrates the fast computation of the PSO algorithm and the strong ability of the COA to examine values beyond the local extrema. In addition, CPSO can avoid the shortcoming of the PSO algorithm, which easily falls into local extrema, by maintaining the diversity of the swarm [25]. The speed of each particle is $v_i$, which varies using the following form:

$$v_i(t + 1) = \omega v_i(t) + c_1 r_1 \left(p_i - x_i(t)\right) + c_2 r_2 \left(p_g - x_i(t)\right).$$

The position of each particle changes according to the following equation:

$$x_i(t + 1) = x_i(t) + v_i(t + 1),$$

where $t$ is the iteration time, $c_1$ and $c_2$ are acceleration coefficients, $\omega$ is the inertia factor, $r_1$ and $r_2$ are two independent random numbers uniformly distributed in the range $[0, 1]$, $p_i$ and $p_g$ are the personal and global best positions, and $i$ indicates the index of the particle. Many studies have demonstrated that better results can be achieved when $c_1 = c_2 = 2$ and when $\omega$ changes from 0.4 to 0.9 [26]. Table 1 shows the advantage of using CPSO compared to using basic PSO. Generally, the logistic chaos series performs better than the random series under the Sphere, Rosenbrock, Rastrigrin, and Schaffer f6 test functions.

2.5.2. Selecting Weights of Combination Models Using the CPSO Algorithm

In this paper, the CPSO algorithm is used to determine the weights of the combination models. The combination model takes the weighted-sum form given in Section 2.4, and the objective function is the MAPE of the combined forecast. However, to obtain better results, the constraints are relaxed to the following: each weight $w_i$ may take any real value (negative values are allowed), and the weights are not required to sum to 1. The weights are regarded as particles and are optimized by the CPSO algorithm; the best particle carries the optimal weights. The optimization process is expressed in the following steps [27].

Step 1. Initialize the parameters, including the particle population size, the acceleration coefficients $c_1$ and $c_2$, the inertia factor $\omega$, and the number of iterations.

Step 2. Let the particles fly, with the speeds and positions of the particles updated according to the velocity and position equations in Section 2.5.1, in which the random series is replaced by the chaos series.

Step 3. Denote the best particle as the $g$th particle and map each of its components $x_{g,i}$ into the interval $[0, 1]$ with the form $z_i = (x_{g,i} - a_i)/(b_i - a_i)$, $i = 1, \ldots, m$, to generate the chaos series by the logistic equation; then return the chaos series to the original solution space using the form $x_i^{k} = a_i + z_i^{k}(b_i - a_i)$, $i = 1, \ldots, m$, to obtain candidate solutions, where $[a_i, b_i]$ is the search range of the $i$th component.

Step 4. Calculate the fitness value (the MAPE in this study) of each candidate generated from the current best particle, and then find the best individual.

Step 5. Use the best individual to replace one particle in the current swarm.

Step 6. Check whether the optimal solution or the maximum number of iterations has been reached. If not, return to Step 2; otherwise, stop.
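The condensed Python sketch below illustrates Steps 1–6 under simplifying assumptions: the chaotic local search around the best particle (Step 3) is folded into an ordinary best-update, the logistic map supplies the stochastic terms of the velocity update, the weights are left unconstrained (they may be negative and need not sum to 1), and the component forecasts, swarm size, and iteration count are hypothetical.

```python
# Condensed CPSO weight-determination sketch (Steps 1-6); all settings are illustrative.
import numpy as np

actual = np.array([40.0, 42.5, 39.8, 41.2, 43.0])
forecasts = np.array([[39.0, 43.0, 40.5, 40.8, 42.1],
                      [41.2, 42.0, 38.9, 41.9, 44.2],
                      [40.5, 42.8, 39.5, 41.0, 42.7]])

def mape(weights):                       # fitness function (Step 4)
    combined = weights @ forecasts
    return np.mean(np.abs((actual - combined) / actual)) * 100

n_particles, dim, iters = 20, forecasts.shape[0], 100
c1 = c2 = 2.0
rng = np.random.default_rng(1)

# Step 1: initialize particles; weights are unconstrained (may be negative).
x = rng.uniform(-1.0, 1.0, (n_particles, dim))
v = np.zeros_like(x)
z = rng.uniform(0.1, 0.9, (n_particles, 2))          # logistic-map state per particle

pbest = x.copy()
pbest_fit = np.apply_along_axis(mape, 1, x)
gbest = pbest[pbest_fit.argmin()].copy()

for t in range(iters):
    omega = 0.9 - 0.5 * t / iters                     # inertia decreasing from 0.9 to 0.4
    z = 4.0 * z * (1.0 - z)                           # chaos series replaces random series
    # Step 2: velocity and position updates.
    v = omega * v + c1 * z[:, [0]] * (pbest - x) + c2 * z[:, [1]] * (gbest - x)
    x = x + v
    # Steps 3-5 (simplified): update personal and global bests by fitness.
    fit = np.apply_along_axis(mape, 1, x)
    improved = fit < pbest_fit
    pbest[improved], pbest_fit[improved] = x[improved], fit[improved]
    gbest = pbest[pbest_fit.argmin()].copy()

print("optimal weights:", gbest, "MAPE:", pbest_fit.min())   # Step 6 result
```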

3. A Case Study

The electricity price data used in this study is from South Australia (SA) (see Figure 2), which is located in the south of Australia [28].

3.1. Data Description

The available data were collected every 30 minutes from 1 March to 21 March 2009, and the collection time began at 00:30 and ended at 12:00 each day. The collected data are grouped into two sets: the “training set” was used to build the forecasting models, and the other set was the test set. The collected data are shown in Figure 3, which shows the high volatility of the data. In other words, there are many price spikes in the data.

3.2. Data Processing

Inaccurate forecasting results will arise if the training set used to build the forecasting model includes price spikes. Therefore, several preprocessing steps are taken in this paper. The density-based spatial clustering of applications with noise (DBSCAN) algorithm is used to identify the extreme data, and a linear interpolation function is used to generate normal values to replace the price spikes in the training set.

3.2.1. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) Algorithm

The DBSCAN algorithm was proposed in 1996 and is a simple and efficient clustering algorithm based on density [29]. The DBSCAN algorithm regards a cluster as a region and assumes that the objects in the region are dense. DBSCAN is well suited to identifying price spikes because it is able to discover clusters of arbitrary shape [30]. The DBSCAN algorithm includes two parameters, $Eps$ and $MinPts$: $Eps$ is the radius of a neighborhood, $MinPts$ is the minimum number of points required within that neighborhood, and $D$ denotes the given data set. The following concepts are introduced for understanding and implementing the algorithm. Figure 4 shows a brief sketch of the DBSCAN algorithm.

Definition 1 (core object). The $Eps$-neighborhood of a point $p$ is denoted by $N_{Eps}(p) = \{q \in D \mid \mathrm{dist}(p, q) \leq Eps\}$. A core object $p$ is a point whose $Eps$-neighborhood contains at least $MinPts$ points, namely, $|N_{Eps}(p)| \geq MinPts$.

Definition 2 (directly density-reachable). An object $p$ in the data set is directly density-reachable from an object $q$ if two conditions are satisfied: $p \in N_{Eps}(q)$ and $|N_{Eps}(q)| \geq MinPts$.

Definition 3 (density-connected). Objects $p$ and $q$ are density-connected in the data set $D$ if, for the given $Eps$ and $MinPts$, $p$ and $q$ are both directly density-reachable from the same object $o$.

Definition 4 (cluster). Clusters are the nonempty subsets of the data set $D$ that are density-connected from their core points.

Definition 5 (noise). Suppose that $C_1, \ldots, C_k$ are all the clusters of the data set $D$; the noise is the set of points in $D$ not belonging to any cluster. Thus, noise is treated as the set of points that satisfy $\text{noise} = \{p \in D \mid \forall i \colon p \notin C_i\}$.
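As a hedged illustration, the scikit-learn sketch below flags price spikes as DBSCAN noise (label -1); the Eps and MinPts values and the price series are illustrative assumptions, not the settings used in the case study.

```python
# DBSCAN spike-detection sketch using scikit-learn; eps and min_samples are illustrative.
import numpy as np
from sklearn.cluster import DBSCAN

prices = np.array([35.0, 36.2, 34.8, 37.1, 35.9, 300.0, 36.5, 34.9, 35.4, 250.0])
X = prices.reshape(-1, 1)                      # cluster on the price value itself

labels = DBSCAN(eps=5.0, min_samples=3).fit_predict(X)
spike_idx = np.where(labels == -1)[0]          # label -1 marks noise / price spikes
print("detected spikes at positions:", spike_idx)
```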

3.2.2. Interpolation

In Section 3.2.1, the DBSCAN algorithm was used to identify the anomalous data. However, elimination of the singular points leaves missing data, which is troublesome when the preprocessed data are used to build the forecasting models. In this study, a linear interpolation function is used to generate data to fill the holes left by the eliminated points. The interp1 function of MATLAB 2013a is applied for the interpolation.
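The paper performs this step with MATLAB's interp1; for consistency with the other sketches, the NumPy analogue below fills the positions flagged as spikes by linear interpolation over the remaining points (the data and spike indices are illustrative).

```python
# Linear interpolation of removed spike positions (NumPy analogue of MATLAB's interp1).
import numpy as np

prices = np.array([35.0, 36.2, 34.8, 37.1, 35.9, 300.0, 36.5, 34.9, 35.4, 250.0])
spike_idx = np.array([5, 9])                              # indices flagged by DBSCAN

idx = np.arange(prices.size)
good = np.setdiff1d(idx, spike_idx)                       # indices of normal points
cleaned = prices.copy()
cleaned[spike_idx] = np.interp(spike_idx, good, prices[good])   # fill the holes linearly
print(cleaned)
```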

The preprocessed data are shown in Figure 5.

3.3. Statistical Measures of Forecasting Performance

The forecasting performance of the individual forecasts and combination models is evaluated by three common criteria: the mean absolute percentage error (MAPE), the root mean square error (RMSE), and the mean absolute error (MAE). The accuracy of the forecasting results increases as these three error values decrease [31]:

$$\text{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \times 100\%,$$

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left(y_t - \hat{y}_t\right)^2},$$

$$\text{MAE} = \frac{1}{N} \sum_{t=1}^{N} \left| y_t - \hat{y}_t \right|,$$

where $y_t$ is the actual value at time $t$, $\hat{y}_t$ is the forecast value at time $t$, and $N$ is the number of forecast points in this study.
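Minimal implementations of the three criteria, consistent with the definitions above, might look as follows (the example arrays are arbitrary).

```python
# MAPE, RMSE, and MAE as defined above; the input arrays are illustrative.
import numpy as np

def mape(actual, forecast):
    return np.mean(np.abs((actual - forecast) / actual)) * 100

def rmse(actual, forecast):
    return np.sqrt(np.mean((actual - forecast) ** 2))

def mae(actual, forecast):
    return np.mean(np.abs(actual - forecast))

actual = np.array([40.0, 42.5, 39.8])
forecast = np.array([41.0, 41.9, 40.3])
print(mape(actual, forecast), rmse(actual, forecast), mae(actual, forecast))
```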

3.4. Simulation and Individual Models Performance

The electricity price data of SA are preprocessed and divided into two sets: one set contains the electricity price data from 1 March 2009 to 20 March 2009, which is used to build the forecasting models, and the remaining data form the other set, the out-of-sample data. The detailed model-building process is shown in Figure 6, and three individual models are chosen for the simulation. The forecasting results are shown in Figure 7, which suggests that each forecasting model is accurate only locally. Table 2 provides a more detailed description of Figure 7, in which the forecasting results of the three individual models are listed. Three criteria are chosen to evaluate the performance of the three individual forecasts, and the results are shown in Table 3. The data in Table 3 indicate that the KF model is the best of the three forecasting models because it produces the minimum values of the error indicators MAPE, RMSE, and MAE. The ARIMA model performs undesirably in comparison with the other two models.

3.5. Simulations and the Evaluation of the Combination Model’s Performance

The proposed combination method combines three individual forecasts. As a comparison, the traditional combination method is adopted to combine individual forecasts as well. Finally, the forecasting results are compared in this section. The simulation process can be seen in Figure 8.

Step 1. Combination simulation: the CPSO weight-determined combination method combines the three individual forecasts. As a comparison, the three individual models are also combined using the traditional combination method.

Step 2. Analysis of the results: the performance of the combination models based on the two combination methods is shown in Tables 4 and 5, respectively. A-B represents the combination model of ARIMA and BP; B-K represents the combination model of BP and KF; A-K represents the combination model of ARIMA and KF; A-B-K represents the combination model of ARIMA, BP, and KF; results for the traditional combination method and for the CPSO weight-determined combination method are reported separately. Table 4 shows the forecasting results of the CPSO weight-determined combination models for the forecast period from 00:30 to 12:00 on 21 March, and Table 5 shows the forecasting results of the traditional models in the same forecasting period, where the forecasting performance at every step can be observed easily. Comparing the forecasting results shown in Tables 4 and 5 reveals that, as the forecasting points move, the proposed CPSO weight-determined combination models always perform better than the traditional combination models. For the CPSO weight-determined combination models, the A-B, B-K, A-K, and A-B-K models have a minimum MAPE value of 0.00% at time points 04:30, 05:00, 08:30, and 05:30, respectively. However, for the traditional combination method based A-B, A-K, B-K, and A-B-K models, the MAPE values were 6.33%, 3.37%, 3.73%, and 2.89%, respectively. This illustrates the outstanding performance of the CPSO weight-determined combination models. The average evaluation criteria listed in Table 6 show that the A-B model based on the traditional combination method produced MAPE, RMSE, and MAE values of 25.19%, 9.74, and 7.72, respectively, all of which are higher than the corresponding values of the A-B model based on the CPSO weight-determined combination method. The same analyses were conducted for the B-K, A-K, and A-B-K models, and the results indicate that the combination models based on the CPSO weight-determined combination method perform better overall than those based on the traditional combination method because the CPSO algorithm searches for the weights of the combination models by means of artificial intelligence.
Table 6 also lists the comparison of the individual and combination models. On the whole, the combination models outperform the best of their combined individual models because the combination models generate smaller values of MAPE, RMSE, and MAE. The MAPE values of the A-B-K model are 21.12% (traditional combination method) and 20.79% (CPSO weight-determined combination method), the RMSE values of the A-B-K model are 8.19 (traditional) and 7.93 (CPSO weight-determined), and the MAE values are 6.32 (traditional) and 6.12 (CPSO weight-determined), all of which are smaller than the corresponding evaluation criteria of the A-B, B-K, and A-K models. Therefore, the A-B-K models perform better than the A-B, B-K, and A-K models regardless of whether they are combined by the traditional combination method or the CPSO weight-determined combination method.
Table 7 shows the detailed weight assignments of the combination models and compares the two combination methods. The comparison indicates that the CPSO weight-determined combination method assigns weights to the individual models flexibly, without the constraints that the weights sum to 1 and that no negative values are assigned, which makes it easier to obtain accurate forecasting results. The results in Table 7 suggest that, when the combined individual models perform undesirably, the traditional combination method assigns a weight of 1 to the best combined model and 0 to the others. However, the CPSO weight-determined combination method adjusts the weight assignment, free of the constraints that the weights sum to 1 and that negative values are prohibited, to find the best combination. The MAPE values indicate that the performance of the B-K model based on the traditional combination method is undesirable. Therefore, it is reasonable to improve the combination rule using the CPSO weight-determined combination method.

3.6. Comparison of the Forecasting Results of the Preprocessed Data and the Original Data

It is well known that the preprocessing of anomaly data is very useful in the building of forecasting models. In this section, the forecasting results of the forecasting models built by the preprocessed data (PF) and the forecasting models built by the original data (ODF) are compared in Table 8.

Table 8 shows that the PF based models perform better than the ODF based models for both individual and combination models. For the ARIMA models, the PF based model has a MAPE value of 20.05%, which is lower than the MAPE value of 37.01% for the ODF based model. The BP and KF models based on the preprocessed data are likewise superior to the models based on the original data. Although the combination models based on the original data (CODF) perform well compared with the individual models based on the original data (IODF), the CODF models are inferior to the combination models based on the preprocessed data (CPF).

4. Conclusion

As electricity price forecasting becomes increasingly important, this paper proposes the CPSO weight-determined combination method based on data preprocessed by the DBSCAN algorithm to improve electricity price forecasting. The traditional combination method has a nonnegativity constraint on the weights, and their sum must be 1, which limits the range of the weights and thus lowers forecasting accuracy. In addition, forecasts based on the original data do not account for the fact that many forecasting models are sensitive to outliers such as electricity price spikes, which leads to inaccurate forecasts. To overcome the limitations of traditional combination models built on the original data, the proposed CPSO weight-determined combination method allows the weights of the combined models to take any real value, and the original data are preprocessed by the DBSCAN algorithm. The electricity price data of South Australia were used to simulate the proposed preprocessed-data-based CPSO weight-determined combination models and to forecast the electricity price for the period from 00:30 to 12:00 on 21 March 2009, in SA. The preprocessed-data-based CPSO weight-determined combination models were observed to perform best in comparison with the traditional combination method and the original-data-based models in terms of three metrics: MAPE, RMSE, and MAE.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant no. 71171102).