Researchers at Oregon State University have developed a new tool for annotating RNA secondary structure.

The software, bpRNA, is a big-data annotation tool for secondary structures in ribonucleic acids (RNA). RNA is essential for molecular scaffolding, gene regulation, and encoding proteins. The function of a particular RNA can usually be worked out from its secondary structure.

An example of the annotation provided by a new software tool for RNA secondary structure researchers (provided by David Hendrix, OSU College of Science).

Although more than 100,000 known RNA structures are spread across various databases, the existing meta-databases offer no automated way to annotate their structural features. bpRNA aims to fix that, as described in Nucleic Acids Research.

“It’s capable of parsing RNA structures, including complex pseudoknot-containing RNAs, so you end up with an objective, precise, easily-interpretable description of all loops, stems and pseudoknots,” said corresponding author David Hendrix. “You also get the positions, sequence and flanking base pairs of each structural feature, which enables us to study RNA structure en masse at a large scale.”
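To give a feel for what this kind of parsing involves, here is a minimal Python sketch. It is not bpRNA's actual algorithm or code. It assumes the structure is written in the widely used dot-bracket notation, where extra bracket types such as `[` and `]` mark pseudoknot pairs, and the function names (`parse_pairs`, `group_stems`, `hairpin_loops`) are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Closing bracket -> opening bracket. Bracket types beyond "()" are a
# common convention for writing pseudoknot pairs (an assumption here).
BRACKETS = {")": "(", "]": "[", "}": "{", ">": "<"}

Pair = Tuple[int, int]


def parse_pairs(structure: str) -> List[Pair]:
    """Return sorted 1-based (i, j) base pairs from a dot-bracket string."""
    stacks: Dict[str, List[int]] = {opener: [] for opener in BRACKETS.values()}
    pairs: List[Pair] = []
    for pos, char in enumerate(structure, start=1):
        if char in stacks:
            stacks[char].append(pos)
        elif char in BRACKETS:
            opener = BRACKETS[char]
            if not stacks[opener]:
                raise ValueError(f"unmatched '{char}' at position {pos}")
            pairs.append((stacks[opener].pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character '{char}' at position {pos}")
    for opener, stack in stacks.items():
        if stack:
            raise ValueError(f"unmatched '{opener}' at position {stack[-1]}")
    return sorted(pairs)


def group_stems(pairs: List[Pair]) -> List[List[Pair]]:
    """Group directly stacked pairs (i, j), (i+1, j-1), ... into stems."""
    stems: List[List[Pair]] = []
    for i, j in pairs:  # pairs must be sorted by i
        if stems and stems[-1][-1] == (i - 1, j + 1):
            stems[-1].append((i, j))
        else:
            stems.append([(i, j)])
    return stems


def hairpin_loops(sequence: str, structure: str, pairs: List[Pair]) -> List[dict]:
    """Report hairpin loops: unpaired runs enclosed by a single base pair."""
    loops = []
    for i, j in pairs:
        inner = structure[i:j - 1]  # 1-based positions i+1 .. j-1
        if inner and set(inner) == {"."}:
            loops.append({
                "start": i + 1,
                "end": j - 1,
                "sequence": sequence[i:j - 1],
                "closing_pair": (i, j),
            })
    return loops


if __name__ == "__main__":
    seq = "GGCAAAAGCC"
    struct = "(((....)))"
    pairs = parse_pairs(struct)
    print(group_stems(pairs))
    print(hairpin_loops(seq, struct, pairs))
```

For the toy hairpin above, the sketch reports one three-pair stem and one four-nucleotide loop, along with the loop's position, sequence and closing base pair. That is the same kind of information Hendrix describes, although the real tool also handles bulges, internal loops, multiloops and pseudoknot annotation.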

As discussed in this article back in November, noncoding RNAs are anything but ‘junk’. Hendrix is equally keen to point this out: “There are plenty of examples of disease-associated mutations in noncoding RNAs that probably affect their structure, and in order to statistically analyze why those mutations are linked to disease we have to automate the analysis of RNA structure. RNA is one of the fundamental, essential molecules for life, and we need to understand RNAs’ structure to understand how they function.”

So what actually is bpRNA? Is it just a meta-database? “To be fair it’s a meta-database, but our special sauce is the tool to annotate everything. Before there was no way of saying where all the structural features were in an automated way. We provide a color-coded map of where everything is. These annotations will enable us to identify statistical trends that may shed light on RNA structure formation and may open the door for machine learning algorithms to predict secondary RNA structure in ways that haven’t been possible.”

It is a positive and exciting development. The tool has already been tested on over 100,000 structures of varying complexity and has performed exceptionally well.