Journal of Lanzhou University of Technology ›› 2022, Vol. 48 ›› Issue (6): 96-103.

• Automation Technology and Computer Technology •

Essential protein identification algorithm based on the combination of network topology and multiple biological information

LU Peng-li*, CHEN Yun-tian

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, Gansu, China
  • Received: 2021-05-21 Online: 2022-12-28 Published: 2023-03-21
  • Corresponding author: LU Peng-li (born 1973), female, from Jiuquan, Gansu; Ph.D., professor, doctoral supervisor. Email: lupengli88@163.com
  • Funding:
    National Natural Science Foundation of China (11861045, 11361033)

Abstract: Identifying essential proteins helps clarify the basic requirements for cell survival and points to new approaches to disease treatment. However, proteins carry complex biological characteristics of their own, so their essentiality cannot be judged accurately from network topology alone. This paper therefore proposes a new method to improve the accuracy of essential protein identification. First, an indicator called SNC is defined that considers network topology together with the importance of proteins in different subcellular compartments. Second, an indicator called SIDC is defined from the characteristics of proteins in subcellular and protein complex information. Finally, by fusing network topology with multi-source biological information, the essential protein identification algorithm CTB is proposed. Experiments on the YDIP, YMIPS, and Krogan data sets, using precision-recall and other evaluation methods, show that the CTB algorithm improves the performance of essential protein identification.

Key words: protein interaction network, essential protein, subcellular localization, protein complex
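
The abstract reports a precision-recall evaluation but gives no detail on how it is computed. The following is a minimal Python sketch of how such an evaluation is commonly set up for a ranked list of proteins. It is an illustration under stated assumptions, not the authors' implementation: the function name `precision_recall_points`, the protein identifiers, and the scores are all hypothetical, and the scores stand in for the output of any ranking method such as CTB.

```python
from typing import Dict, List, Set, Tuple


def precision_recall_points(
    scores: Dict[str, float], essential: Set[str]
) -> List[Tuple[float, float]]:
    """Return (precision, recall) at every top-k cutoff of the ranking.

    scores    -- hypothetical essentiality score per protein (higher = more essential)
    essential -- set of proteins known to be essential (the gold standard)
    """
    if not essential:
        raise ValueError("the set of known essential proteins must not be empty")

    # Rank proteins by descending score; ties are broken by name for determinism.
    ranked = sorted(scores, key=lambda protein: (-scores[protein], protein))

    points: List[Tuple[float, float]] = []
    hits = 0  # number of true essential proteins among the top k
    for k, protein in enumerate(ranked, start=1):
        if protein in essential:
            hits += 1
        precision = hits / k               # fraction of the top k that is essential
        recall = hits / len(essential)     # fraction of essential proteins recovered
        points.append((precision, recall))
    return points


# Toy data (assumed for illustration only, not taken from YDIP, YMIPS, or Krogan).
toy_scores = {"P1": 0.9, "P2": 0.7, "P3": 0.4, "P4": 0.1}
toy_essential = {"P1", "P3"}

curve = precision_recall_points(toy_scores, toy_essential)
for k, (precision, recall) in enumerate(curve, start=1):
    print(f"top {k}: precision={precision:.3f}, recall={recall:.3f}")
```

Plotting precision against recall over all cutoffs gives one curve per method, so competing ranking methods can be compared on the same data set.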

CLC number: