Scientists decode mystery sequences involved in gene regulation
First-ever compendium of RNA sequences will be important guide to understanding the root of genetic diseases
Scientists know that much of what a gene does and produces is regulated after it is turned on. A gene first produces a molecule called RNA. Tiny proteins called RNA-binding proteins (RBPs) then bind to the RNA and control its fate. For instance, some of these proteins cut out parts of the RNA molecule so that it makes a particular protein, while other RBPs help destroy the RNA before it even produces a protein.
But these mechanisms are not well understood because the RNA sequences that the RBPs bind to have been so difficult to decipher. To fully understand gene regulation (and dysregulation, as in the case of disease), scientists have needed to employ advanced lab techniques and data analysis to identify the patterns of the RNA sequences.
This gap in knowledge motivated a team of researchers co-led by Senior Fellow Tim Hughes (University of Toronto and the Canadian Institute for Advanced Research) to produce the first-ever compendium of RNA-binding sequences, which was published in Nature.
"It took us a long time to generate and analyze the data," explains Hughes. "After spending years developing and perfecting a method, we started looking at all the proteins in humans, fruit flies and other complex organisms that look like they may bind RNA and found which sequences they like to bind to. Our compendium of RNA-binding sequences will become a resource for researchers in this field, and will be especially useful in human genetic analysis."
The team found that humans and fruit flies have similar RBPs, since they derive from a common ancestor, and that in many cases they essentially bind the same sequences. The researchers anticipate that this is the case for proteins in other organisms.
"We looked at just over 200 proteins in total, but can probably infer the preference for tens of thousands of proteins in many other organisms," says Hughes.
In addition, many of the sequences that were similar across species were at the end of the RNA transcript, a region associated with regulating RNA decay or the movement of the RNA to another part of the cell. "This indicates that there is probably more regulation of gene expression itself at the level of stability or destruction of RNA," explains Hughes.
One of the major insights that came out of the team's analyses was about a well-studied protein called RBFOX1, which was already known to have a function in regulating RNA splicing and to be decreased in autism. The team's findings suggest that RBFOX1 has a role in regulating the expression level of nervous-system-related genes in brains with autism, and that it does so by making RNA more stable.
The underlying causes of disease are more complicated than a single gene not working right, says Hughes. He anticipates that the team's compendium will be useful in human genetic analysis.
"What often happens is that scientists identify a genetic variation associated with a disease, but then they don't understand why it leads to the disease. What exactly do these sequence changes cause? If the sequence is in a regulatory region of the RNA, then with our compendium, other scientists will be able to see what protein binds to it. This will give them a better idea of what is being disrupted."