Evaluation of the Influence of Computational Resources on Transcriptome de Novo Assembly Variability and Quality
Abstract
RNA content is deciphered by randomly fragmenting the biomolecules, which generates millions of sequences. In the absence of a reference, these sequences are reconstructed by algorithms that make intensive use of computational resources. Numerous factors affect this process. This study explores, for the first time, how the memory and cores allocated to the reconstruction process influence assembly quality and variability. Multiple de novo assemblies for two model organisms were obtained from one monolithic platform and two high-performance computers (HPCs). The low-memory monolithic platform showed greater variability (1.98 and 2.10 times greater than on the HPCs); however, most of the contigs obtained (99.16% and 75.79%) mapped to the reference transcriptome, indicating good quality. Therefore, contrary to expectations, low-resource equipment outperforms HPCs in RNA discovery when the assembly strategy combines numerous assemblies.
Keywords
RNA, NGS sequencing, RNA-Seq, memory effect on assembly, HPC, assembly optimization