Rfam 12.1 has been released

April 27, 2016

We are happy to announce a new release of Rfam. Version 12.1, based on the same sequence dataset as Rfam 12.0, features over 20 new families, a new clan competing algorithm, a publicly accessible MySQL database, and many website fixes.

Improved algorithm for clan competing

One of the main improvements of this release is the removal of redundancy between families belonging to the same clan. Rfam clans (introduced in Rfam 10.0) group together related Rfam families; for example, large ribosomal subunits from Bacteria, Archaea, and Eukaryota are all found in the LSU clan. Families from the same clan often match the same sequence region, so in Rfam 12.0 a ‘clan competing’ procedure was introduced to keep only the best match. However, there were cases where clan competing did not work well (or simply was not applied), and some sequences were annotated with multiple Rfam families belonging to the same clan. For example, some human sequences appeared in both the Protozoa and Metazoa SRP families. The clan competing algorithm has now been revised, and the redundant matches have been eliminated from the database.

As a result, release 12.1 shows a significant drop in the number of annotated ncRNA regions, down from ~19 million in release 12.0 to ~9 million in 12.1. This drop is primarily due to the removal of redundant annotations between the largest families found in clans, such as rRNA subunits and SRPs.
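The idea behind clan competing can be illustrated with a minimal sketch. This is not the actual Rfam implementation: the Hit record, the clan_compete function, the "any overlap" criterion, and the use of bit scores to pick the winner are all assumptions made for illustration, and the coordinates, clan labels, and scores in the example are invented.

```python
from collections import namedtuple

# Hypothetical hit record; field names are illustrative, not the Rfam schema.
# Coordinates are assumed to be 1-based with start <= end.
Hit = namedtuple("Hit", "seq_acc start end family clan bit_score")


def overlaps(a, b):
    """Return True if two hits are on the same sequence and share a position."""
    return a.seq_acc == b.seq_acc and a.start <= b.end and b.start <= a.end


def clan_compete(hits):
    """Among overlapping hits from the same clan, keep only the best-scoring one.

    Assumption: any overlap counts as a conflict. Hits whose clan is None
    (families outside any clan) never compete with anything.
    """
    kept = []
    # Visit hits from best to worst so a stronger hit claims its region first.
    for hit in sorted(hits, key=lambda h: h.bit_score, reverse=True):
        conflict = any(
            hit.clan is not None and hit.clan == k.clan and overlaps(hit, k)
            for k in kept
        )
        if not conflict:
            kept.append(hit)
    return kept


# Invented example: two SRP families from one clan match the same region.
hits = [
    Hit("seqA", 100, 400, "Metazoa_SRP", "clan_SRP", 120.5),
    Hit("seqA", 105, 395, "Protozoa_SRP", "clan_SRP", 80.2),
    Hit("seqA", 1000, 1070, "tRNA", "clan_tRNA", 60.0),
]

for hit in clan_compete(hits):
    print(hit.family, hit.seq_acc, hit.start, hit.end)
```

In this sketch the lower-scoring SRP match is dropped because it overlaps a better match from the same clan, while the tRNA hit survives because it belongs to a different clan and a different region.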

Pseudoknots are back

Previously pseudoknots were removed from Rfam consensus secondary structures for technical reasons, but thanks to Eric Nawrocki pseudoknots have been restored to families. You can get a file with seed alignments, including those with pseudoknots, from our FTP archive.

New families

We added 23 new families (Rfam identifiers RF02545 to RF02567), bringing the total number of families to 2473. We always welcome suggestions for new Rfam families, so feel free to get in touch.

Public MySQL Database

To make it easier to query the data in ways that are not supported by the website, we have created a public MySQL database with the latest Rfam data. This replaces the retired BioMart interface. Now you can explore the data using SQL queries from your favourite MySQL client or programmatically using custom scripts. The MySQL database will be updated with each Rfam release. For more information about how to access the database, along with example queries, please have a look at the database documentation.
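As a rough sketch of scripted access, the snippet below assembles a command line for the standard mysql client. The connection details (user rfamro, host mysql-rfam-public.ebi.ac.uk, port 4497, database Rfam) are the ones quoted in the comments on this post. The build_mysql_command helper is hypothetical, and the table and column names in the example query (family, rfam_acc, rfam_id) are assumptions that should be checked against the database documentation.

```python
import shlex

# Connection details as quoted in the comments on this post.
# Note the capital "R" in the database name.
RFAM_DB = {
    "user": "rfamro",
    "host": "mysql-rfam-public.ebi.ac.uk",
    "port": 4497,
    "database": "Rfam",
}


def build_mysql_command(query, params=RFAM_DB):
    """Build an argument list for the mysql command-line client (hypothetical helper)."""
    return [
        "mysql",
        "--user", params["user"],
        "--host", params["host"],
        "--port", str(params["port"]),
        "--database", params["database"],
        "--execute", query,
    ]


# Assumed table and column names; verify them in the database documentation.
query = "SELECT rfam_acc, rfam_id FROM family LIMIT 5;"
command = build_mysql_command(query)

# Print a shell-safe version of the command for copying into a terminal.
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command) on a machine with the mysql client installed. Building the arguments as a list rather than a single string avoids shell quoting problems with the SQL text.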

R-chie diagrams

Thanks to those of you who reported problems with R-chie diagrams on the Rfam website. These visualisations are useful for exploring seed alignments together with consensus secondary structures, but in many Rfam diagrams alignment columns did not match the corresponding secondary structure arches, as can be seen in the interactive before/after comparison. For examples of updated R-chie diagrams, have a look at tRNAs or a pseudoknot from Yellow Fever virus.

New team

Release 12.1 is the first release produced by the new EMBL-EBI RNA Resources team, which is led by Anton Petrov and includes Ioanna Kalvari (Software Developer) and Joanna Argasinska (Biocurator). The team, housed within Rob Finn’s group and jointly coordinated by Alex Bateman, is responsible for the Rfam and RNAcentral databases, which will result in tighter coordination between the two resources in the future.

Plans for Rfam 13.0

We plan to release the next version of Rfam towards the end of 2016. The key feature of release 13.0 will be the new genome-centric organisation of the underlying sequence database. Instead of searching the WGS and STD datasets from ENA we will search a representative set of genomes based on reference proteomes computed by UniProt. This will ensure comprehensive annotation of most important genomes, enable a faster release cycle and decrease redundancy in sequence data. Work on release 13.0 is already underway so stay tuned!

Get in touch

We always welcome comments and feedback about Rfam, so feel free to get in touch by email, on Twitter, or by submitting a new GitHub issue.

7 Responses to “Rfam 12.1 has been released”

  1. Paul Gardner Says:

    :’-(

    $ mysql --user rfamro --host mysql-rfam-public.ebi.ac.uk --port 4497 --database rfam
    ERROR 1044 (42000): Access denied for user 'rfamro'@'%' to database 'rfam'

    • antonipetrov Says:

      The database name starts with a capital “R”, so this should work:
      $ mysql --user rfamro --host mysql-rfam-public.ebi.ac.uk --port 4497 --database Rfam

      Thank you for pointing this out! The documentation has been updated accordingly.

  2. Ivan Antonov Says:

    Hello, I have problems accessing Rfam from Russia. If I click on any link on the http://rfam.xfam.org/ webpage, I get redirected to http://ves-pg-b7.ebi.ac.uk/ and my browser says “This site can’t be reached”. I am wondering if anyone else is experiencing the same problem.

    Thanks,
    Ivan

    • Anton Petrov Says:

      Hi Ivan,

      Thank you for reporting the problem! We experienced a networking problem earlier today but it should be fixed now. Could you please refresh your browser (in case the incorrect version of the page is still cached) and try again? If this doesn’t work, I would be happy to look into this further.

      Anton

