Journal of Lanzhou University of Technology ›› 2021, Vol. 47 ›› Issue (5): 76-84.

• Automation Technique and Computer Technology •

Gaze prediction algorithm based on hypercomplex wavelet convolutional network

LI Ce, ZHU Zi-zhong, XU Da-you, GAO Wei-zhe, JIN Shan-gang   

  1. College of Electrical and Information Engineering, Lanzhou Univ. of Tech., Lanzhou 730050, China
  • Received: 2020-01-10    Online: 2021-10-28    Published: 2021-11-18

Abstract: Gaze prediction algorithms have a wide range of applications in object recognition, video compression, object tracking and related tasks. Existing gaze prediction models suffer from low prediction accuracy due to the loss of detailed features, single-scale feature extraction, and severe interference from background information. This paper proposes a gaze prediction algorithm based on a hypercomplex wavelet convolutional network. First, to address the loss of detailed features, the hypercomplex wavelet transform is used to extract detailed image features in the frequency domain, and these are fused with the spatial features extracted by the convolutional network. Then, an atrous spatial pyramid pooling (ASPP) module fuses feature maps obtained from different receptive fields, effectively addressing the single-scale problem. Finally, the algorithm introduces a residual convolutional attention module that combines spatial and channel attention mechanisms to suppress background interference and improve prediction accuracy. On the SALICON dataset, the proposed algorithm achieves CC, sAUC and SIM scores of 0.8847, 0.7693 and 0.7780, respectively; on the CAT2000 dataset, the corresponding scores are 0.7355, 0.8701 and 0.6645. The experimental results show that the proposed algorithm predicts fixation points well.
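The abstract does not give implementation details, but the two fusion steps it names can be illustrated schematically. The sketch below is a minimal NumPy approximation, not the authors' method: the ASPP branches use fixed 3x3 dilated averaging in place of learned atrous convolutions, and the residual attention module gates features with simple pooled statistics, omitting the shared MLP and learned convolutions a full module would use. All function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W). Pool over spatial dims, then gate each channel.
    avg = feat.mean(axis=(1, 2))
    mx = feat.max(axis=(1, 2))
    return feat * sigmoid(avg + mx)[:, None, None]

def spatial_attention(feat):
    # Pool over the channel dim, then gate each spatial location.
    avg = feat.mean(axis=0)
    mx = feat.max(axis=0)
    return feat * sigmoid(avg + mx)[None, :, :]

def residual_attention(feat):
    # Channel attention, then spatial attention, with a residual connection,
    # mirroring the "residual convolutional attention module" described above.
    return feat + spatial_attention(channel_attention(feat))

def atrous_pool(feat, rate):
    # Average a 3x3 neighborhood sampled at the given dilation rate
    # (zero-padded), standing in for one atrous-convolution branch.
    C, H, W = feat.shape
    p = np.pad(feat, ((0, 0), (rate, rate), (rate, rate)))
    out = np.zeros_like(feat)
    for dy in (-rate, 0, rate):
        for dx in (-rate, 0, rate):
            out += p[:, rate + dy:rate + dy + H, rate + dx:rate + dx + W]
    return out / 9.0

def aspp(feat, rates=(1, 2, 4)):
    # Fuse branches with different receptive fields by channel concatenation.
    return np.concatenate([atrous_pool(feat, r) for r in rates], axis=0)
```

Both operations preserve the spatial resolution of the input; ASPP enlarges only the channel dimension, which is why the multi-scale branches can be fused by simple concatenation before later layers.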

Key words: gaze prediction, hypercomplex wavelet transform, spatial features, convolutional neural network
