Stage prediction of lung adenocarcinoma based on deep learning and multi-omics data (open access)

To improve the accuracy of cancer staging decisions, this study integrated three types of omics data from 452 lung adenocarcinoma patients: messenger ribonucleic acid (mRNA) transcript data, micro ribonucleic acid (miRNA) transcript data, and DNA methylation data. A random forest algorithm was then used to predict stage. First, the three omics datasets, obtained from The Cancer Genome Atlas (TCGA) database, were preprocessed, and the mRNA transcript data were matched with the DNA methylation data at the gene-locus level. Next, four different multi-omics integration strategies were applied to the preprocessed data. Finally, a random forest algorithm predicted stage from the integrated data, with accuracy, the Kappa coefficient, and the area under the curve (AUC) as evaluation metrics. The results show that multi-omics integration yields higher accuracy in stage prediction. The deep-learning-based integration strategy performed best, with accuracy, Kappa coefficient, and AUC values of 0.940, 0.931, and 0.986, respectively, and shows promise for future lung adenocarcinoma stage prediction.
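Two of the evaluation metrics named in the abstract, accuracy and the Kappa coefficient, can be illustrated with a minimal sketch. It assumes the Kappa coefficient is Cohen's kappa, which the abstract does not state explicitly. The stage labels and the function names `accuracy` and `cohen_kappa` are hypothetical and are not taken from the paper; AUC is omitted because it requires predicted class probabilities rather than hard labels.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted stage equals the true stage."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (the accuracy); p_e is the agreement
    expected by chance from the marginal label frequencies.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all labels fall in one class")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical stage labels for six patients (illustrative only, not study data).
y_true = ["I", "I", "II", "II", "III", "IV"]
y_pred = ["I", "I", "II", "III", "III", "IV"]

print(round(accuracy(y_true, y_pred), 3))     # 0.833
print(round(cohen_kappa(y_true, y_pred), 3))  # 0.778
```

Kappa discounts the agreement that would occur by chance, so it is lower than accuracy here: five of six predictions are correct, but a quarter of that agreement is expected from the label frequencies alone.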

Liu Dezhen (刘德真); Li Yuanyuan (李圆媛)

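School of Optical Information and Energy Engineering and School of Mathematics and Physics, Wuhan Institute of Technology, Wuhan 430205, Hubei, China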

Computer science and automation

staging of lung adenocarcinoma; deep learning; integration strategy; random forest algorithm

Journal of Wuhan Institute of Technology, 2024, Issue 2

Pages 190-196 (7 pages)

National Natural Science Foundation of China (12001408)

10.19843/j.cnki.CN42-1779/TQ.202307022
