Archive for the 'Rfam' Category

Rfam 12.3 is out

June 29, 2017

The new Rfam release (version 12.3) features 101 new families, unified search, and updated documentation.

New families

In this release, 101 new families were added to the database, including over a dozen Yersinia pseudotuberculosis RNA thermometers from a recent PNAS paper by Righetti et al. We would like to thank Zasha Weinberg for contributing the NiCo riboswitch, Type-P5 Twister, and several RAGATH RNAs (for example, RAGATH-5). You can browse the new families here.

Unified text search

Over the years, Rfam has developed many specialised ways of searching and exploring the data, such as Keyword search, Taxonomy search, browsing entries by type, and “Jump To” navigation. While these options work well, they may be confusing for new users, so we set out to unify all search functionality in a single text search.

The new search is available on the Rfam homepage or at the top of any Rfam page and is powered by EBI search. It lets you browse RNA families, clans, and motifs, or explore Rfam by category using facets. For example, you can view families with 3D structures or view all snoRNA families that match human sequences, and the resulting URLs can be bookmarked or shared.

The new search is a full replacement for the old search functionality except for taxonomy, because the new search can find species but not higher-level taxa. For example, one can search for Homo sapiens but not for Mammals. Stay tuned for future updates and use the old Taxonomy search in the meantime. We plan to retire all old search functionality once the new search is fully developed but until then the old and the new searches will coexist.

For more information about the new search, see Rfam documentation. If you have any feedback, please let us know in the comments below, on GitHub, by email, or on Twitter.

New home for Rfam documentation

Rfam help has been migrated to ReadTheDocs, a dedicated documentation hosting platform, and is now available at http://rfam.readthedocs.org.

The new system offers several advantages:

The source code of the documentation is available on GitHub, so if you notice a problem you can let us know by creating an issue, or help us fix it by editing the text on GitHub and sending a pull request.

Other updates

  • Clan competition for PDB entries: The 3D structure tab, the public MySQL database, and the FTP archive now show only the lowest E-value match when several RNA families from the same clan match a PDB chain. For example, chain 0 of PDB structure 1S72 (LSU rRNA from the archaeon Haloarcula marismortui) now matches only the Archaeal LSU family instead of all families from the LSU rRNA clan.
  • A new 5S rRNA clan, CL00113, includes the 5S rRNA and mtPerm-5S families.
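
The clan competition rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the Rfam pipeline itself: the `Hit` record, the `clan_compete` function, the family and clan labels, and the E-values are all hypothetical and invented for the example, and the sketch ignores match coordinates, which a real implementation may need to take into account.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One family match to one PDB chain (hypothetical record layout)."""
    pdb_id: str
    chain: str
    family: str
    evalue: float


def clan_compete(hits: Iterable[Hit], family_to_clan: Dict[str, str]) -> List[Hit]:
    """Keep only the lowest E-value hit per (PDB chain, clan).

    Hits to families that are not in any clan are kept unchanged.
    """
    best: Dict[Tuple[str, str, str], Hit] = {}
    kept: List[Hit] = []
    for hit in hits:
        clan = family_to_clan.get(hit.family)
        if clan is None:
            # No clan, so there is nothing to compete against.
            kept.append(hit)
            continue
        key = (hit.pdb_id, hit.chain, clan)
        if key not in best or hit.evalue < best[key].evalue:
            best[key] = hit
    return kept + list(best.values())


# Illustrative data only: labels and E-values are made up.
example_clans = {"Archaeal_LSU": "LSU_clan", "Bacterial_LSU": "LSU_clan"}
example_hits = [
    Hit("1S72", "0", "Archaeal_LSU", 1e-120),
    Hit("1S72", "0", "Bacterial_LSU", 1e-80),
    Hit("1S72", "0", "Unclanned_family", 1e-5),
]
winners = clan_compete(example_hits, example_clans)
```

With this input, the two LSU hits on chain 0 compete because they share a clan, and only the one with the lower E-value survives; the hit to the family without a clan is untouched.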

What’s next

This release will be the last “point release” for Rfam 12. In the next few months we will release Rfam 13.0, which will be based on a new sequence database. Previously, Rfam annotated the WGS and STD subsets of ENA, which grow very quickly and include many redundant sequences. Instead, we will use genomes from the UniProt reference proteome collection, a regularly updated, reduced-redundancy set of reference genomes. This will allow us to perform meaningful taxonomic comparisons and explore RNA families by taxonomy without sifting through thousands of versions of the same genome.

Get in touch

As always, we welcome comments and feedback about Rfam, so feel free to get in touch by email or by submitting a new GitHub issue.

Rfam 12.2 is live

January 25, 2017

We are happy to announce a new release of Rfam (version 12.2) which includes 115 new families, introduces R-scape secondary structure visualisations, and restores missing families to multiple Rfam clans.

New families

This release adds 115 new Rfam families, bringing the total number of families to 2,588. Notable additions include Pistol, Hatchet, Twister-sister and several other riboswitches contributed by Zasha Weinberg. We are always looking for new RNA families, so please feel free to get in touch with your suggestions.

Testing covariation with R-scape

R-scape is a new method for testing whether covariation analysis supports the presence of a conserved RNA secondary structure. To check the quality of Rfam structures, we ran R-scape on all Rfam seed alignments and added R-scape visualisations to the secondary structure galleries. For example, here is the R-scape analysis of the SAM riboswitch:

According to R-scape, the secondary structure from the Rfam seed alignment, shown on the left, has 19 statistically significant basepairs (highlighted in green). R-scape can also use statistically significant basepairs as constraints to predict a new secondary structure that is consistent with the seed alignment. Using this approach, R-scape increased the number of statistically significant basepairs from 19 to 27 while also adding 9 new basepairs that are consistent with the seed alignment (structure on the right). This visualisation gives an idea about the quality of the Rfam structure and indicates that in this case it may need to be updated. To find out more about R-scape have a look at a recent paper by Rivas et al.

Tip: R-scape visualisations are interactive, so you can pan and zoom the structures and get additional information by hovering over nucleotides and basepairs.

R-scape analysis suggests that many existing Rfam secondary structures can be improved (for example, FMN riboswitch or 5S rRNA). In other families, the secondary structures are not supported by the R-scape covariation analysis (for example, oxyS RNA), which indicates either that their seed alignments need to be expanded or that these RNA families do not have a conserved secondary structure. Lastly, there are also cases where the R-scape structures do not show a significant improvement over the current secondary structure (for instance, Metazoa SRP).

In future releases we will begin to improve existing Rfam seed alignments by using R-scape in the family building pipeline. In the meantime, Rfam users can get an indication of the quality of the structure using R-scape visualisations.

Recovering lost clan members

Since Rfam 10.0, related Rfam families have been organised into clans. The clans are manually curated and clan membership is checked using automated quality control steps (for example, to make sure that a family cannot belong to more than one clan). However, under certain circumstances these quality control procedures silently removed families from the clans. This bug was introduced in Rfam 11.0, and over time, more than 30 families were dropped from 20 clans, so that some clans did not have any families at all. The problem has now been fixed and proper clan membership has been restored using Rfam releases from the FTP archive. You can explore Rfam clans and let us know if you have any feedback.
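
As an illustration of the kind of check described above, here is a minimal Python sketch that reports problems instead of silently changing clan membership. It is not the actual Rfam quality control code: the function names and the (clan, family) pair layout are assumptions made for the example. The first function flags families assigned to more than one clan, and the second compares two releases to list families that have lost their clan.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Membership = Tuple[str, str]  # (clan accession, family name); hypothetical layout


def families_in_multiple_clans(memberships: Iterable[Membership]) -> Dict[str, List[str]]:
    """Return {family: sorted clans} for every family found in more than one clan."""
    clans_by_family = defaultdict(set)
    for clan, family in memberships:
        clans_by_family[family].add(clan)
    return {
        family: sorted(clans)
        for family, clans in clans_by_family.items()
        if len(clans) > 1
    }


def dropped_families(previous: Iterable[Membership], current: Iterable[Membership]) -> List[str]:
    """Return families that had a clan in the previous release but have none now."""
    before = {family for _, family in previous}
    after = {family for _, family in current}
    return sorted(before - after)
```

Running `dropped_families` on the membership lists of two consecutive releases would surface families that disappeared from their clans, so that a curator can decide whether the removal was intended.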

Other updates

How to access the data

In addition to the Rfam website, you can access the data in the FTP archive and via the API. There is also a public MySQL database introduced in the last release.

What’s next

As well as revisiting Rfam seed alignments, work is underway on the next major Rfam release (13.0) which will be based on a new sequence database built from complete genomes. We plan to make the new data available in late 2017.

Get in touch

We always welcome comments and feedback about Rfam, so feel free to get in touch by email or by submitting a new GitHub issue.

Rfam 12.1 has been released

April 27, 2016

We are happy to announce a new release of Rfam. Version 12.1, based on the same sequence dataset as Rfam 12.0, features over 20 new families, a new clan competition algorithm, a publicly accessible MySQL database, and many website fixes.

The Rfam Track Hub is back

May 14, 2015

We are pleased to announce the return of the Rfam Track Hub for the UCSC Genome Browser. The hub is available on our FTP site. At present it provides annotation for the most recent assemblies of eight species: Human (hg38), Mouse (mm10), C. elegans (ce10), Chicken (galGal4), C. intestinalis (ci2), Zebrafish (danRer7), Drosophila (dm6) and S. cerevisiae (sacCer3).

Rfam 12.0 is out

September 24, 2014

We are pleased to announce the release of Rfam 12.0!

Moving to xfam.org

May 1, 2014

Back in November 2012 we announced that the Xfam team in the UK was moving from the Wellcome Trust Sanger Institute to the European Bioinformatics Institute (EMBL-EBI), just next door on the Wellcome Trust Genome Campus. On Tuesday we completed that move by switching off the Pfam and Rfam websites inside Sanger and redirecting all traffic to our shiny new home at xfam.org. You can now find the Pfam and Rfam websites at pfam.xfam.org and rfam.xfam.org respectively.

Join Rfam, see the world

January 31, 2014

Rfam is recruiting! We are currently looking for an RNA informatician to join our team. We’re looking for someone who is really enthusiastic about RNA and interested in working with Rfam as we move to genome-based alignments and explore new technologies for the database and website.

If this is you, why not apply to join us as a Senior Bioinformatician?

We’ve moved, now the websites

January 30, 2014

In November 2012, we announced that the Xfam groups were moving the few tens of metres from the Wellcome Trust Sanger Institute to the European Bioinformatics Institute. We warned you then that the websites would also eventually move.

The Rfam NAR paper is now available!

November 23, 2012

For some light weekend reading, have a look at the latest Rfam paper, Rfam 11.0: 10 years of RNA Families. It’s part of the 2013 Nucleic Acids Research Database issue, and you’ll find all the latest developments to Rfam mentioned, including the sunbursts, the Biomart and an update on the Wikipedia annotation effort.

R-chie arc diagrams now available in our secondary structure galleries

November 19, 2012

We are pleased to announce the inclusion of R-chie arc diagrams in the Rfam family secondary structure galleries. We think these images are a beautiful and intuitive way of visualising complex RNA secondary structures, and we hope that you find them as useful as we do. You can find the R-chie tab in the secondary structure image gallery for each family; from there you can zoom in and out of the images, as well as view the image in a separate window. The majority of Rfam families have R-chie images; those which don’t are families without secondary structure. Have a look at the U1 spliceosomal RNA or tRNA for examples.

The R-chie diagrams are created using the R4RNA R package from Irmtraud Meyer’s group; be sure to check out the R-chie paper, as well as their own gallery of Rfam structures.