Journal of Lanzhou University of Technology ›› 2021, Vol. 47 ›› Issue (1): 97-104.

• Automation Technique and Computer Technology •

Short-term load forecasting based on linear regression under MapReduce framework

WU Li-zhen, KONG Chun, CHEN Wei   

  1. College of Electrical and Information Engineering, Lanzhou Univ. of Tech., Lanzhou 730050, China
  • Received: 2019-11-11 Online: 2021-02-28 Published: 2021-03-11

Abstract: To address the slow computation and low prediction accuracy caused by the large volume and variety of data in load forecasting, this paper proposes a linear regression model trained with mini-batch stochastic gradient descent under the MapReduce parallel programming framework. First, to clean the duplicate and bad data generated by intelligent distribution terminals, an adaptive nearest neighbor sorting algorithm is proposed to remove duplicate records, and K-means clustering is then used to eliminate abnormal and incomplete data. Next, the F-test is used to check whether the data set can represent the load linearly, and the t-test is used to check the significance of the linear relationship between each feature vector and the load; feature vectors with a weak linear relationship to the load are eliminated. Based on these methods, a short-term load forecasting model is established and applied to the distribution network in Wuwei, Gansu Province. The results show that the mean absolute percentage error of the proposed model is about 2.043% and the root mean square error is about 3112.62. These errors meet the requirements of load forecasting, and the method not only greatly improves the speed of load calculation but also shortens the time required for load forecasting.
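The abstract does not give implementation details, so the following is only a minimal illustrative sketch of the general idea, not the authors' code. It trains a linear regression with mini-batch stochastic gradient descent, splitting each mini-batch into shards so that a map step computes partial gradients and a reduce step sums them, and then evaluates the two error metrics named above. The MapReduce pattern is simulated in a single process rather than on a real cluster; all function names, hyperparameters, and the synthetic data are assumptions made for illustration.

```python
import math
import random
from functools import reduce


def map_gradient(shard, w, b):
    """Map step: partial squared-error gradient sums over one data shard."""
    gw, gb = [0.0] * len(w), 0.0
    for x, y in shard:
        err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
        for j, xj in enumerate(x):
            gw[j] += err * xj
        gb += err
    return gw, gb, len(shard)


def reduce_gradients(left, right):
    """Reduce step: add two partial results element-wise."""
    gw = [p + q for p, q in zip(left[0], right[0])]
    return gw, left[1] + right[1], left[2] + right[2]


def fit(data, n_features, lr=0.5, epochs=500, batch_size=16, n_shards=4, seed=0):
    """Mini-batch SGD; each batch is sharded, mapped, and reduced."""
    rng = random.Random(seed)
    w, b = [0.0] * n_features, 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            shards = [batch[i::n_shards] for i in range(n_shards)]
            parts = [map_gradient(s, w, b) for s in shards if s]
            gw, gb, n = reduce(reduce_gradients, parts)
            # Average the summed gradient over the batch before stepping.
            w = [wi - lr * g / n for wi, g in zip(w, gw)]
            b -= lr * gb / n
    return w, b


def predict(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the units of the target."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Synthetic, noise-free data (an assumption): y = 2*x1 + 3*x2 + 5.
gen = random.Random(42)
samples = []
for _ in range(64):
    x = [gen.random(), gen.random()]
    samples.append((x, 2.0 * x[0] + 3.0 * x[1] + 5.0))

w, b = fit(samples, n_features=2)
actual = [y for _, y in samples]
predicted = [predict(x, w, b) for x, _ in samples]
print("weights:", w, "bias:", b)
print("MAPE (%):", mape(actual, predicted), "RMSE:", rmse(actual, predicted))
```

Because the map step returns only gradient sums and a sample count, the reduce step is a plain element-wise addition. This is the property that would let the same computation be distributed across worker nodes in a real MapReduce job.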

Key words: big data analysis, mini-batch stochastic gradient descent, short-term load forecasting, distributed parallel computing, MapReduce framework
