Evaluation of the Influence of Computational Resources on Transcriptome de Novo Assembly Variability and Quality

Authors

  • Patricia Carvajal López, Instituto Politécnico Nacional, Centro de Investigaciones Biológicas del Noroeste S.C.
  • Fernando Daniel Von Borstel Luna, Instituto Politécnico Nacional, Centro de Investigaciones Biológicas del Noroeste S.C.
  • Joaquín Gutiérrez Jagüey, Instituto Politécnico Nacional, Centro de Investigaciones Biológicas del Noroeste S.C.
  • Humberto Mejía Ruíz, Instituto Politécnico Nacional, Centro de Investigaciones Biológicas del Noroeste S.C.
  • Gabriella Rustici, Cambridge University, Genetics Department
  • Eduardo Romero Vivas, Instituto Politécnico Nacional, Centro de Investigaciones Biológicas del Noroeste S.C.

DOI:

https://doi.org/10.13053/cys-22-4-2883

Keywords:

RNA, NGS sequencing, RNA-Seq, memory effect on assembly, HPC, assembly optimization

Abstract

RNA content is deciphered by randomly fragmenting the biomolecules, which generates millions of sequences. In the absence of a reference, these sequences are reconstructed with algorithms that make intensive use of computational resources. Numerous factors affect this process. This study explores, for the first time, how memory and core allocation during reconstruction influences assembly quality and variability. Multiple de novo assemblies for two model organisms were obtained on one monolithic platform and two High Performance Computers (HPCs). The low-memory monolithic platform showed greater variability (1.98 and 2.10 times greater than the HPCs); however, most of the contigs obtained (99.16% and 75.79%) mapped to the reference transcriptome, indicating good quality. Therefore, contrary to expectation, low-resource equipment outperforms HPCs on RNA discovery when assembly strategies that unify numerous assemblies are applied.

Author Biographies

Patricia Carvajal López, Instituto Politécnico Nacional, Centro de Investigaciones Biológicas del Noroeste S.C.

PhD Student, Bioinformatics Research Group

Fernando Daniel Von Borstel Luna, Instituto Politécnico Nacional, Centro de Investigaciones Biológicas del Noroeste S.C.

Researcher, Bioinformatics Research Group

Joaquín Gutiérrez Jagüey, Instituto Politécnico Nacional, Centro de Investigaciones Biológicas del Noroeste S.C.

Researcher, Bioinformatics Research Group

Humberto Mejía Ruíz, Instituto Politécnico Nacional, Centro de Investigaciones Biológicas del Noroeste S.C.

Researcher, Aquaculture Program

Gabriella Rustici, Cambridge University, Genetics Department

Bioinformatics Training Manager, Department of Genetics

Eduardo Romero Vivas, Instituto Politécnico Nacional, Centro de Investigaciones Biológicas del Noroeste S.C.

Researcher, Bioinformatics Research Group

Published

2018-12-30