More than 100,000 unknown viruses have been discovered using a new computer tool
Knowledge of viral diversity will reveal the origins of emerging pathogens and improve surveillance and mitigation of new pandemics
UPV
To carry out this analysis, the multidisciplinary team developed Serratus, a cloud computing infrastructure built on Amazon Web Services (AWS). Using a cluster of 22,500 processors (CPUs), it enabled massive searches for viral sequences across the millions of gigabytes (petabytes) of sequencing data available in public databases.
Detailed analysis of certain viral families led to the discovery of more than 30 new coronavirus species. These include interesting examples from aquatic vertebrates such as fish and amphibians, whose coronaviruses have a genome segmented into two fragments, a feature that has been described in other virus families but had not previously been detected in any coronavirus.
At the Institute for Plant Molecular and Cellular Biology (IBMCP), located in the Polytechnic City of Innovation, UPV scientists used Serratus to analyse the virus that causes human hepatitis D, a viral agent called Delta, of minimal genomic size and unknown origin. This allowed Marcos de la Peña Rivero, a CSIC researcher at the IBMCP, to detect similar viruses in a multitude of other animals, including not only mammals and other vertebrates but also invertebrates. "Surprisingly, these viruses were also found in environmental samples collected from lakes and soils all over the world, and their hosts are unknown for the time being," reveals De la Peña.
Evolutionary connection between human and plant viruses in the environment
Moreover, environmental samples containing hepatitis D-like viruses revealed novel viral forms with ultra-compact genomes of minute size (only 300 bases, the chemical units that make up the genetic material). "This discovery allows us to put forward a close evolutionary connection between viruses as distant as human hepatitis D and plant subviral agents called viroids," says the CSIC researcher.
Both the database of all the viruses obtained in the course of this study and the set of tools developed are freely and openly available (http://www.serratus.io). These tools can be of great use in characterising the diversity of all viruses on our planet and in preparing the world for possible new pandemics, whose devastating consequences we are now suffering with emerging viral diseases such as COVID-19, caused by the SARS-CoV-2 coronavirus.