
Please use this identifier to cite or link to this item: http://nchuir.lib.nchu.edu.tw/handle/309270000/153803

Title: Protein Quaternary Structure Classification: Using Bootstrapping for Model Selection
Author: Ho, Shao-yu (何紹瑜)
Contributors: 朱彥煒
Institute of Genomics and Bioinformatics
Keywords: polymer classification; bootstrap method; model selection; machine learning
Date: 2012
Issue Date: 2013-11-19 12:02:57 (UTC+8)
Publisher: Institute of Genomics and Bioinformatics
Abstract: Protein quaternary structure complexes, also known as multimers, play distinct and important roles in the cell. For example, dimeric transcription factors take part in gene regulation, while trimeric glycoproteins associated with viral infection are linked to the human immunodeficiency virus. Classifying protein quaternary structure complexes would therefore be of considerable help to proteomics research in the post-genomic era. Prediction systems that distinguish monomer and multimer sequences are not yet widely available, so this study designs a two-layer machine learning architecture and develops PClass, a classification and prediction system for protein quaternary structure complexes. Complexes are divided into five classes: monomer, dimer, trimer, tetramer, and a class for other subunit counts. The first layer proposes a new model selection method that combines the bootstrap method with a support vector machine: each class of complex is encoded with features derived from sequence composition, entropy, and accessible surface area; multiple feature models are generated; and the best-performing model is selected by evaluation as the feature model for that class. At this stage, performance reaches a Matthews correlation coefficient (MCC) of 70% or above. The second layer integrates the first-layer feature models and applies six machine learning methods to improve prediction performance, raising the result for every class by more than 10% in MCC. Finally, the system is validated on dimeric transcription factors and trimeric glycoproteins associated with viral infection.
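As a rough illustration of the first-layer idea (choosing, for one class, the feature encoding whose bootstrap-estimated MCC is highest), a minimal Python sketch follows. It is not the PClass implementation. The function names, the out-of-bag evaluation scheme, the one-vs-rest binary labels, and the toy data are all assumptions, and a simple nearest-centroid classifier stands in for the support vector machine so that the example needs only the standard library.

```python
import math
import random


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def fit_centroids(X, y):
    """Stand-in classifier (NOT an SVM): store the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroids(centroids, X):
    """Assign each sample to the class with the nearest centroid."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda c: dist2(centroids[c], x)) for x in X]


def bootstrap_mcc(X, y, n_rounds=50, seed=0):
    """Mean out-of-bag MCC over bootstrap resamples (hypothetical evaluation scheme)."""
    rng = random.Random(seed)
    n = len(X)
    scores = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]       # sample with replacement
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]   # samples never drawn
        y_train = [y[i] for i in idx]
        if not oob or len(set(y_train)) < 2:
            continue                                     # skip degenerate resamples
        model = fit_centroids([X[i] for i in idx], y_train)
        preds = predict_centroids(model, [X[i] for i in oob])
        scores.append(mcc([y[i] for i in oob], preds))
    return sum(scores) / len(scores) if scores else 0.0


def select_feature_set(feature_sets, y, **kwargs):
    """Score every candidate encoding and return the best name plus all scores."""
    scored = {name: bootstrap_mcc(X, y, **kwargs) for name, X in feature_sets.items()}
    best = max(scored, key=scored.get)
    return best, scored


# Toy one-vs-rest problem (e.g. "dimer" = 1, "not dimer" = 0); the data are made up.
y = [0] * 10 + [1] * 10
feature_sets = {
    "informative": [[0.01 * i] for i in range(10)] + [[1.0 + 0.01 * i] for i in range(10)],
    "uninformative": [[float(i % 2)] for i in range(20)],
}
best, scores = select_feature_set(feature_sets, y)
print(best, scores)
```

In a real setting the candidate feature sets would be the sequence composition, entropy, and accessible surface area encodings named in the abstract, and the stand-in classifier would be replaced by an SVM; the selection loop itself would stay the same.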
Appears in Collections: [By Document Type] Theses and Dissertations
