Journal of Lanzhou University of Technology ›› 2020, Vol. 46 ›› Issue (6): 104-111.

• Automation Technology and Computer Technology •

Precise regression prediction method based on coarse-to-fine and feature selection and its application in second language acquisition

LIN Yu-ping1, LONG Hong2, SONG Pan-pan2, LI Xiao-mian1

  1. School of Foreign Studies, Xi'an Jiaotong University, Xi'an 710049, China;
  2. School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  • Received: 2020-07-14 Online: 2020-12-28 Published: 2021-01-07
  • About the author: LIN Yu-ping (born 1977), female, from Boli County, Heilongjiang Province, associate professor
  • Funding:
    2017 project of the Shaanxi Provincial Education Science "13th Five-Year Plan" (SGH17H003)

Abstract: Unevenly distributed data and a large number of influencing factors tend to make regression predictions inaccurate. To address this problem, this paper proposes a precise regression prediction method that combines a coarse-to-fine strategy with feature selection. First, because the data are unevenly distributed and the prediction range is wide, a direct prediction is difficult to fit accurately. A coarse-to-fine prediction method is therefore proposed: a decision tree performs a coarse classification that predicts the sub-interval containing the target, and a precise regression prediction is then made within that sub-interval. Second, a small amount of data combined with many feature factors can cause over-fitting, and redundant features can reduce the prediction accuracy of the model. A regression prediction method based on feature selection is therefore proposed to improve the prediction accuracy. Experiments on a data set of college students' English scores and their personality factors show that the coarse-to-fine and feature selection method is more accurate and more stable than traditional regression models. With the proposed regression model relating personality factors to English scores, reasonable training programs can be designed to compensate for weaknesses in students' personality factors and to improve their competitiveness, and thereby better promote English education in China.

Key words: decision trees, coarse-to-fine, feature selection, regression prediction method
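The pipeline summarized in the abstract (feature selection, coarse classification into a sub-interval, then regression inside that sub-interval) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: features are ranked by absolute Pearson correlation, the coarse classifier is a depth-1 decision tree (a stump) over two sub-intervals split at a hypothetical cut point `cut`, and the fine step is a single-feature least-squares line. The function names and the toy data are likewise hypothetical.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def select_features(X, y, k):
    """Indices of the k features with the largest |correlation| with y."""
    n_feat = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_feat)]
    return sorted(range(n_feat), key=lambda j: -scores[j])[:k]

def fit_stump(X, labels):
    """Depth-1 decision tree: (feature, threshold, flip) with fewest errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            err = sum(int(row[j] > t) != l for row, l in zip(X, labels))
            # flip = 1 means the side above the threshold maps to class 0
            err, flip = min((err, 0), (len(labels) - err, 1))
            if best is None or err < best[0]:
                best = (err, j, t, flip)
    return best[1:]

def fit_line(X, y, j):
    """Least-squares line y = a + b * x_j on a single feature."""
    xs = [row[j] for row in X]
    mx, my = mean(xs), mean(y)
    vx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (v - my) for x, v in zip(xs, y)) / vx if vx > 0 else 0.0
    return my - b * mx, b

def fit_coarse_to_fine(X, y, cut, k):
    """Select k features, learn the coarse stump, fit one line per sub-interval.

    Assumes both sub-intervals (y <= cut and y > cut) contain training rows.
    """
    keep = select_features(X, y, k)
    Xs = [[row[j] for j in keep] for row in X]
    labels = [int(v > cut) for v in y]
    lines = []
    for b in (0, 1):
        idx = [i for i, l in enumerate(labels) if l == b]
        # Fine step: regress on the top-ranked selected feature only
        lines.append(fit_line([Xs[i] for i in idx], [y[i] for i in idx], 0))
    return {"keep": keep, "stump": fit_stump(Xs, labels), "lines": lines}

def predict(model, row):
    """Coarse step picks the sub-interval; fine step applies that interval's line."""
    xs = [row[j] for j in model["keep"]]
    j, t, flip = model["stump"]
    a, b = model["lines"][int(xs[j] > t) ^ flip]
    return a + b * xs[0]

# Toy data (made up): feature 0 drives the target, feature 1 is constant,
# feature 2 is nearly irrelevant. The target jumps between two regimes.
X = [[float(i), 1.0, float(i % 2)] for i in range(10)]
y = [2.0 * i if i < 5 else 50.0 + 3.0 * i for i in range(10)]
model = fit_coarse_to_fine(X, y, cut=30.0, k=2)
print(model["keep"], predict(model, [2.5, 1.0, 0.0]), predict(model, [7.0, 1.0, 1.0]))
```

On this toy data a single global line would fit the jump between the two regimes poorly, whereas routing each sample to its sub-interval first lets each local line fit its regime closely. That is the intuition behind the coarse-to-fine step. A practical version would likely use a full decision tree over more than two sub-intervals and a multivariate regressor.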
