Tree-based Secondary Structure Interactions Predictor Method for Protein Contact Maps

Authors

  • Julio César Quintana-Zaez, University of Ciego de Ávila "Máximo Gómez Báez", Ciego de Ávila, Cuba
  • Reynaldo Molina-Ruiz, Central University "Marta Abreu" of the Villas, Santa Clara, Cuba
  • Cosme Ernesto Santiesteban-Toca, University of Ciego de Ávila "Máximo Gómez Báez", Ciego de Ávila, Cuba

DOI:

https://doi.org/10.13053/cys-22-4-2827

Keywords:

Contact maps, folding patterns, decision trees, long-range contacts

Abstract

Understanding protein folding is one of the most interesting research fields in bioinformatics. Contact maps constitute an intermediate step in predicting the 3D structure of proteins and make it possible to represent folding patterns. Currently, the methods used to predict contact maps achieve low precision: only about 25% of long-range (L/5) contacts are correctly predicted, and their knowledge base is not human-interpretable. In this paper, we propose an easy-to-implement multiple classifier for contact maps, which is based on patterns of interaction between secondary structures and employs decision trees as base classifiers. This method naturally reduces the level of imbalance between the contact and non-contact classes. In addition, a set of interpretable rules is extracted as a complement to the prediction. The validation of the method's performance shows that, on average, 45% of general contacts are correctly predicted. Moreover, a Z-score comparison of its long-range contact predictions (L/5) with the methods that participated in the CASP11 competition shows that it is competitive with state-of-the-art methods.
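
The long-range (L/5) measure cited above can be illustrated with a minimal Python sketch. It assumes the convention commonly used in CASP-style evaluations: a pair of residues counts as long-range when the two are at least 24 positions apart in the sequence, and precision is computed over the L/5 highest-scoring long-range pairs, where L is the sequence length. The abstract does not state these details, so the threshold, the function name, the dictionary-based input format, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
def long_range_top_l5_precision(scores, true_contacts, length, min_separation=24):
    """Precision over the top L/5 scored long-range residue pairs.

    scores         -- dict mapping (i, j) residue index pairs to a predicted score
    true_contacts  -- set of (i, j) pairs that are contacts in the native structure
    length         -- sequence length L
    min_separation -- assumed long-range threshold (24 is a common convention)
    """
    # Keep only pairs whose residues are far apart in sequence.
    candidates = [
        (score, pair)
        for pair, score in scores.items()
        if abs(pair[0] - pair[1]) >= min_separation
    ]
    # Rank by predicted contact score, highest first.
    candidates.sort(key=lambda item: item[0], reverse=True)

    # Evaluate at most L/5 predictions (at least one).
    top_k = max(1, length // 5)
    selected = [pair for _, pair in candidates[:top_k]]
    if not selected:
        return 0.0

    hits = sum(1 for pair in selected if pair in true_contacts)
    return hits / len(selected)


# Toy data (hypothetical): four scored pairs for a 50-residue protein.
toy_scores = {(0, 30): 0.9, (1, 40): 0.8, (2, 5): 0.95, (3, 45): 0.1}
toy_true = {(0, 30), (2, 5)}
toy_precision = long_range_top_l5_precision(toy_scores, toy_true, length=50)
```

In this toy case the pair (2, 5) is ignored despite its high score because it is a short-range pair, so only the three long-range pairs are evaluated. Evaluation protocols differ on details such as the denominator when fewer than L/5 long-range predictions are available; this sketch divides by the number of pairs actually selected.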

Author Biography

Cosme Ernesto Santiesteban-Toca, University of Ciego de Ávila "Máximo Gómez Báez", Ciego de Ávila, Cuba

University of Ciego de Ávila “Máximo Gómez Báez”, Bioplantas Research Center

Published

2018-12-30