Journal of Lanzhou University of Technology, 2023, Vol. 49, Issue (5): 93-101.

• Automation Technique and Computer Technology •

Feature selection based on an improved Harris hawk optimization algorithm

ZHAO Xiao-qiang1,2,3, QIANG Rui-ru1   

  1. School of Electrical Engineering and Information Engineering, Lanzhou Univ. of Tech., Lanzhou 730050, China;
  2. Key Laboratory of Advanced Control of Industrial Processes of Gansu Province, Lanzhou Univ. of Tech., Lanzhou 730050, China;
  3. National Electrical and Control Engineering Experimental Teaching Center, Lanzhou Univ. of Tech., Lanzhou 730050, China
  • Received: 2021-11-23; Online: 2023-10-28; Published: 2023-11-07

Abstract: Feature selection is a machine learning task that aims to reduce the number of features by removing irrelevant and redundant ones while maintaining high classification accuracy. The Harris hawk optimization algorithm (HHO) cannot perform feature selection directly in a discrete feature space, and in its later stages its population diversity decreases, so it easily falls into local optima. To address these problems, a feature selection algorithm based on an improved HHO is proposed. First, chaotic mapping is used to diversify the initial population so that it is evenly distributed over the search space while retaining good population quality. Second, the position of the rabbit is re-updated by a Gaussian mutation operator to keep the algorithm from falling into a local optimum. Finally, a binary version of the improved algorithm is designed and applied to the wrapper feature selection problem based on the KNN classifier. Feature selection simulation experiments on 18 classic UCI data sets show that the proposed algorithm obtains better results than other mainstream algorithms in terms of fitness value, average classification accuracy, and average number of selected features. The proposed algorithm can therefore effectively extract feature subsets, classify data more accurately, and achieve higher optimization accuracy.
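
The three ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the HHO position-update rules are omitted. The abstract does not say which chaotic map, transfer function, or fitness function is used, so the logistic map, the sigmoid transfer function, the weighted fitness with alpha = 0.99, and all function names below are assumptions chosen for illustration.

```python
import math
import random


def logistic_map_population(n_hawks, n_features, seed=0.7, r=4.0):
    """Chaotic initialisation (assumed logistic map): x <- r * x * (1 - x)."""
    x = seed
    population = []
    for _ in range(n_hawks):
        row = []
        for _ in range(n_features):
            x = r * x * (1.0 - x)
            row.append(x)
        population.append(row)
    return population


def gaussian_mutation(position, rng, sigma=1.0):
    """Perturb the best ("rabbit") position: x' = x * (1 + N(0, sigma))."""
    return [x * (1.0 + rng.gauss(0.0, sigma)) for x in position]


def binarize(position, rng):
    """Assumed S-shaped transfer: select feature j with probability sigmoid(x_j)."""
    return [1 if rng.random() < 1.0 / (1.0 + math.exp(-x)) else 0
            for x in position]


def knn_error(mask, X, y, k=1):
    """Leave-one-out KNN error rate using only features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # an empty feature subset is treated as worst case
    errors = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample, nearest first.
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)
        errors += predicted != y[i]
    return errors / len(X)


def fitness(mask, X, y, alpha=0.99):
    """Assumed wrapper fitness: weighted error plus fraction of features kept."""
    return alpha * knn_error(mask, X, y) + (1.0 - alpha) * sum(mask) / len(mask)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X_toy = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y_toy = [0, 0, 0, 1, 1, 1]

rng = random.Random(0)
hawks = logistic_map_population(n_hawks=4, n_features=2)
masks = [binarize(gaussian_mutation(h, rng), rng) for h in hawks]
best = min(masks, key=lambda m: fitness(m, X_toy, y_toy))
print("best mask:", best, "fitness:", fitness(best, X_toy, y_toy))
```

On the toy data, keeping only the informative feature yields a lower fitness than keeping both, which is the trade-off a wrapper method exploits: the classifier's error drives the search, and the second term favours smaller subsets.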

Key words: wrapper feature selection, Harris hawk optimization algorithm, chaotic mapping, Gaussian mutation
