Posts Tagged ‘rfam’

Join the Rfam team!

March 19, 2021

We are looking for a Software Developer to join the Rfam team and contribute to the world’s largest database of RNA families. The post holder will be responsible for keeping Rfam up-to-date, developing Rfam Cloud, and improving the website. More information about the position can be found at

Apply now or help spread the word. Closing date: April 20th, 2021.

Rfam 14.5 is live

March 18, 2021

We are happy to announce a new Rfam release, version 14.5, featuring 112 updated microRNA families and 10 families improved using 3D structure information. Read on for details or explore 3,940 RNA families at

Updated microRNA families

As described in our most recent paper, we are in the process of synchronising microRNA families between Rfam and miRBase. In this release 112 of the existing microRNA families have been updated with new manually curated seed alignments from miRBase, new gathering thresholds, and new family members found in the Rfamseq sequence database. 

In total, 852 new microRNA families have been created (356 in release 14.3 and 496 in release 14.4) and 152 existing families have been updated (40 in release 14.3 and 112 in release 14.5). As the miRBase-Rfam synchronisation is about 50% complete, additional microRNA families will be made available in the upcoming releases. You can view a list of the 112 updated families or browse all 1,385 microRNA families on the Rfam website. 

Updating families using information from 3D structures

We are also reviewing the families with experimentally determined 3D structures in order to compare the Rfam annotations with the 3D models. Our goal is to incorporate the 3D information into Rfam seed alignments, as many families were created before the corresponding 3D structures became available. We manually review each PDB structure, verify base pair annotations from matching PDBs, and obtain a more consistent consensus secondary structure model.

In multiple cases we were able to add missing base pairs and pseudoknots. For example, in the SAM riboswitch (RF00162), we added two base pairs at the base of helix P2, corrected a base pair in P3, and added four base pairs in P4 (one at the base of the helix and three near the terminal loop). The updated consensus secondary structure presents a more accurate central core annotation with more structure in the four-way junction.

SAM riboswitch secondary structure before and after the updates

In another example, one base pair was added in P1 and another one in P3 of the SAM-I/IV variant riboswitch (RF01725). We also corrected a base pair in P3 and included a P4 stem loop that was not integrated before.

SAM-I/IV riboswitch secondary structure before and after the updates

The SAM-I/IV riboswitch has a SAM-binding core conformation similar to that of the SAM riboswitch, but it lacks the k-turn motif in P2 that is found in SAM riboswitches. The two families also have different pseudoknot interactions: the SAM riboswitch forms a pseudoknot between a P2 loop and the stem of P3, while the SAM-I/IV riboswitch contains a pseudoknot between a P3 loop and the 5′ region.

The first 10 families updated with 3D information include:

  1. RF00162 – SAM riboswitch
  2. RF01725 – SAM-I/IV variant riboswitch
  3. RF00164 – Coronavirus 3’ stem-loop II-like motif (s2m)
  4. RF00013 – 6S / SsrS RNA
  5. RF00003 – U1 spliceosomal RNA
  6. RF00015 – U4 spliceosomal RNA
  7. RF00442 – Guanidine-I riboswitch
  8. RF00027 – let-7 microRNA precursor
  9. RF01054 – preQ1-II (pre queuosine) riboswitch and
  10. RF02680 – preQ1-III riboswitch

We will continue reviewing the families with known 3D structure in future releases.

Other family updates

Initially reported by Aspegren et al. 2004, Class I (RF01414) and Class II (RF01571) RNAs were found in the social amoeba Dictyostelium discoideum and later investigated in more detail by Avesson et al. 2011. A new report from Kjellin et al. 2021 now presents a comprehensive analysis of the Class I RNA genes in dictyostelid social amoebas. Based on this study, we updated the Dicty Class I RNA family RF01414 with a new seed alignment and removed the family RF01571, thus merging both families into one. We thank Dr Jonas Kjellin (Uppsala University) for suggesting this update.

Goodbye Ioanna!

Rfam 14.5 is the last release prepared by Dr Ioanna Kalvari, who will be leaving the team at the end of March 2021. We would like to take the opportunity to thank Ioanna for her contributions over the last 5.5 years and wish her the best of luck in the future!

Get in touch

As always, we would be very happy to hear from you if you have any feedback or suggestions for Rfam. Please feel free to email us or get in touch on Twitter.

Rfam 14.4 is live

December 18, 2020

The last Rfam release of 2020 is now live! Rfam 14.4 contains 496 new microRNA families developed in collaboration with miRBase. Find out about the microRNA project in our new NAR paper and let us know if you have any feedback.

Rfam 14.3

September 15, 2020

Rfam 14.3 includes 356 new and 40 updated microRNA families, as well as 12 new and 2 updated Flavivirus RNAs. Find out the details in our new NAR paper and get in touch if you have any feedback.

Rfam Coronavirus Special Release

April 27, 2020

In response to the SARS-CoV-2 outbreak, the Rfam team prepared a special release dedicated to the Coronavirus RNA families. Release 14.2 includes 10 new and 4 revised families that can be used to annotate SARS-CoV-2 and other Coronavirus genomes.

View the data at ➡️

New Coronavirus Rfam families

In collaboration with the Marz group and the EVBC, we created 10 families representing the entire 5’ and 3’ untranslated regions (UTRs) of Alpha-, Beta-, Gamma-, and Deltacoronaviruses. A specialised set of alignments for the subgenus Sarbecovirus is also provided, including the SARS-CoV-1 and SARS-CoV-2 UTRs.

The families are based on a set of high-quality whole genome alignments produced with LocARNA and reviewed by expert virologists. Note that the Alpha-, Beta-, and Deltacoronavirus alignments and structures were refined based on the literature, while the Gammacoronavirus families are based on prediction alone due to the lack of experimental data.

 | 5’ UTR | 3’ UTR
Sarbecovirus and SARS-CoV-2 | Sarbecovirus-5UTR |

Previously, only fragments of the UTRs were found in Rfam. In particular, two families were superseded by the new whole-UTR alignments and removed from Rfam:

  • RF00496 (Coronavirus SL-III cis-acting replication element): This family represented a single stem that is now found in aCoV-5UTR and bCoV-5UTR families.
  • RF02910 (Coronavirus_5p_sl_1_2): This family represented two stems from aCoV-5UTR.

The new families are grouped into 2 clans: CL00116 and CL00117 for the 5’ and 3’ UTRs, respectively. The clans can be used with the Infernal cmscan program to automatically select the highest scoring match from a set of related families (see the Rfam chapter in CPB to learn more).
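The idea behind clan competition can be sketched in a few lines of Python. This is an illustration only and not how Infernal implements it: the `Hit` record, its field names, and all coordinates and bit scores are made up for this example, and only the clan accessions come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    family: str   # family name, e.g. "bCoV-5UTR"
    clan: str     # clan accession, e.g. "CL00116"
    seq: str      # name of the annotated sequence
    start: int    # 1-based inclusive start
    end: int      # 1-based inclusive end
    score: float  # bit score (values below are invented)


def overlaps(a, b):
    """True if two hits lie on the same sequence and share at least one position."""
    return a.seq == b.seq and a.start <= b.end and b.start <= a.end


def clan_competition(hits):
    """Keep only the best-scoring hit among overlapping hits from the same clan."""
    kept = []
    # Visit hits from highest to lowest score so that winners are placed first.
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        loses = any(k.clan == hit.clan and overlaps(k, hit) for k in kept)
        if not loses:
            kept.append(hit)
    return kept


hits = [
    Hit("bCoV-5UTR", "CL00116", "genome1", 1, 300, 120.5),
    Hit("Sarbecovirus-5UTR", "CL00116", "genome1", 1, 299, 310.2),
    Hit("Sarbecovirus-3UTR", "CL00117", "genome1", 29500, 29870, 280.0),
]
winners = clan_competition(hits)
print([h.family for h in winners])
```

In this toy input the two 5’ UTR hits overlap and belong to the same clan, so only the higher-scoring one survives, while the 3’ UTR hit is untouched because it belongs to a different clan.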

Revised Coronavirus families

We also reviewed and updated the existing Coronavirus Rfam families.

Family | What was updated? | Is it found in SARS-CoV-2?
RF00182 Coronavirus packaging signal | The seed alignment and consensus secondary structure were updated to include the 4 conserved repeat units. | This RNA element is found only in Embecovirus, so it is not present in SARS-CoV-2 and other Sarbecoviruses.
RF00507 Coronavirus frameshifting stimulation element | The seed alignment was expanded. | This RNA is present in SARS-CoV-2.
RF00164 Coronavirus s2m RNA | The seed alignment was expanded. There is a 3D structure for SARS-CoV-1 which can be used for understanding the s2m in SARS-CoV-2. | This RNA is present in the 3′ UTR of SARS-CoV-2.
RF00165 Coronavirus 3’-UTR pseudoknot | The seed alignment was expanded. The pseudoknot is annotated in the 3’ UTR families but since it is mutually exclusive with the 3’-UTR consensus structure, it is also provided as a separate family. | This RNA is present in the 3′ UTR of SARS-CoV-2.

Where to get the data 

You can download the covariance models, as well as seed alignments for the coronavirus families from the corresponding family pages or from a dedicated folder on the FTP archive.

How to use the data

You can download the covariance models and annotate viral sequences with these RNA models using Infernal. See Rfam help for examples.
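As a starting point, an annotation run could be scripted roughly as follows. This is a sketch and not the official protocol: the file names are placeholders, and `build_cmscan_command` and `run_cmscan` are hypothetical helpers written for this example. The flags are regular Infernal `cmscan` options, but it is worth checking `cmscan -h` for the version you have installed.

```python
import shlex
import shutil
import subprocess


def build_cmscan_command(cm_file, fasta_file, tblout, clanin=None):
    """Assemble a cmscan command line as a list of arguments."""
    cmd = [
        "cmscan",
        "--cut_ga",      # report hits above each family's gathering threshold
        "--rfam",        # use the fast filter settings intended for Rfam searches
        "--nohmmonly",   # never fall back to HMM-only mode
        "--fmt", "2",    # tabular output format that includes overlap annotation
        "--tblout", tblout,
    ]
    if clanin is not None:
        # With clan information, overlaps are only marked between families of the same clan.
        cmd += ["--clanin", clanin, "--oclan"]
    cmd += [cm_file, fasta_file]
    return cmd


def run_cmscan(cmd):
    """Run the search; needs Infernal on the PATH and a CM file indexed with cmpress."""
    if shutil.which(cmd[0]) is None:
        raise RuntimeError("cmscan was not found on the PATH")
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)


# Placeholder file names: replace them with your downloaded models and sequences.
cmd = build_cmscan_command("coronavirus.cm", "genome.fa", "genome.tblout",
                           clanin="Rfam.clanin")
print(shlex.join(cmd))
```

The command is only printed here; call `run_cmscan(cmd)` once the input files are in place.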

Inviting all Wikipedians to contribute

We revised the Wikipedia pages associated with each family, and we invite everyone to contribute to the following articles:


We would like to thank Kevin Lamkiewicz and Manja Marz (Friedrich Schiller University Jena) for providing the curated alignments for the new families as well as Eric Nawrocki (NCBI) for revising the existing Rfam entries. We also thank Ramakanth Madhugiri (Justus Liebig University Giessen) for reviewing the Coronavirus UTR alignments.

This work is part of the BBSRC-funded project to expand the coverage of viral RNAs in Rfam. More data on SARS-CoV-2 can be found on the European COVID-19 Data Portal.

Rfam 14.1 is out

January 28, 2019

We are happy to announce that a new Rfam release is now available! Rfam 14.1 includes 226 new families bringing the total number of Rfam families to 3,016. In addition, the R-scape visualisations have been updated to display pseudoknots, both manually annotated in seed alignments and predicted by R-scape (see below for details).

New families

The majority of the new families were contributed by Dr Zasha Weinberg (University of Leipzig) and were discovered by a systematic computational analysis of intergenic regions in Bacteria and metagenomic samples (see the NAR paper for more details). Many of the families come from environmental samples, so importing them into Rfam required a new procedure (described below).

This release features many families with statistically significant covariation (highlighted in green in the images below), for example Skipping-rope, Drum, and LOOT:

as well as a new unusually large, highly structured RNA called ROOL that is found in the Firmicutes, Fusobacteria, and Tenericutes phyla as well as in phages and cow rumen metagenomic samples:

Browse new families in Rfam

Analysing pseudoknots using R-scape

Developed by Dr Elena Rivas (Harvard University), R-scape is a program that detects covariation support for structural pairs in RNA alignments (see the 2017 paper by Rivas et al. in Nature Methods for more details). Starting with version 1.2.0, R-scape systematically identifies pseudoknots supported by covariation (Rivas & Eddy, in preparation). For example, here is a pseudoknot from the SAM riboswitch that is not yet annotated in the Rfam seed alignment (left) but is correctly predicted by R-scape (right):

The nucleotides forming the pseudoknot are labelled pk_1, pk_2, pk_3 and so on in the structural annotation. Each pseudoknot is shown as a separate stem in an inset, and the basepairs with significant covariation are coloured green, as in the other R-scape diagrams.

We are working on adding more pseudoknot annotations to the existing families based on the evidence from R-scape, 3D structures, and scientific literature. Please let us know if your favourite RNA is missing a pseudoknot.

Using RNAcentral identifiers in Rfam seed alignments

In previous releases, every sequence in every Rfam seed alignment was required to have an INSDC identifier assigned by a sequence archive like ENA or GenBank. However, when Rfam users submit their alignments to Rfam, they often include sequences that are not yet found in ENA or GenBank, especially if the sequences come from environmental samples. For example, sequence LV_Brine_h2_0102_1073789 from the MDR-NUDIX RNA does not exist in ENA so it does not have a stable identifier and is not associated with metadata such as NCBI taxid, description, or scientific literature.

In the past, Rfam replaced such sequences with closely related ones or removed them altogether, which required modifying the user-submitted alignments and could result in smaller, less informative seeds that lacked some of the covariation present in the originals. In this release we implemented a new procedure that accepts RNAcentral identifiers in Rfam seed alignments in order to preserve the manually curated alignments as much as possible.

We began by importing the sequences and metadata from the recently established ZWD (Zasha Weinberg Database) into RNAcentral, where each distinct sequence is assigned a stable identifier (URS id) and linked to an NCBI taxid, its parent ZWD alignment, and scientific literature. For example, sequence LV_Brine_h2_0102_1073789 is assigned RNAcentral id URS0000D661D6_12908 so that it can be easily tracked using RNAcentral search, API, public database, or bulk download files.

Next we replaced the ZWD identifiers with RNAcentral accessions and used the ZWD-RNAcentral alignments as seeds for new Rfam families:

Following the standard Rfam protocol, we manually selected bit-score thresholds for each family that allow reliable identification of sequences from the seed alignments and other homologs from the Rfam sequence database.

A small number of sequences still had to be removed from ZWD alignments in the following cases:

  1. If a covariance model built using the alignment could not find some of its own sequences, these unmatched sequences were removed from the alignment.
  2. If a sequence scored worse than a set of random sequences that serve as control when setting bit-score thresholds, such low-scoring sequences were also removed from the alignments.
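The two removal rules above can be written down as a small filter. This is a minimal sketch under stated assumptions, not the Rfam family-building pipeline: `filter_seed` is a hypothetical function, a score of `None` stands for a seed sequence that the covariance model failed to find, "worse than a set of random sequences" is read here as "no better than the best-scoring control", and all identifiers and scores are invented.

```python
def filter_seed(seed_scores, control_scores):
    """Split seed sequences into those kept and those removed, with a reason.

    seed_scores: dict mapping sequence id to bit score, or None if the model
                 built from the alignment did not find the sequence.
    control_scores: bit scores of the random control sequences.
    """
    best_control = max(control_scores)
    kept, removed = [], []
    for seq_id, score in seed_scores.items():
        if score is None:
            removed.append((seq_id, "not found by the covariance model"))
        elif score <= best_control:
            removed.append((seq_id, "scores no better than random controls"))
        else:
            kept.append(seq_id)
    return kept, removed


# Invented example data.
seed_scores = {"seq_1": 85.2, "seq_2": None, "seq_3": 21.0, "seq_4": 40.5}
control_scores = [18.3, 25.7, 22.1]

kept, removed = filter_seed(seed_scores, control_scores)
print(kept)
print(removed)
```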

In future releases we plan to expand the usage of RNAcentral identifiers in Rfam seed alignments.

Please note that any software that parses Rfam seed alignments and uses ENA or GenBank for metadata lookup will now need to handle RNAcentral identifiers as well, for example by querying the RNAcentral API. For more information or if you have any questions, please contact the RNAcentral team or Rfam help.
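For parsers that need to tell the two kinds of identifier apart, a check along the following lines may be enough. It is a sketch with several assumptions: seed sequence names are taken to have the form `accession/start-end`, the coordinates and the INSDC-style accession in the example are invented, the functions are hypothetical, and the API URL layout should be verified against the RNAcentral documentation before use.

```python
import re

# Species-specific RNAcentral ids: "URS" + 10 hexadecimal characters + "_" + NCBI taxid.
URS_PATTERN = re.compile(r"^(URS[0-9A-F]{10})_(\d+)$")


def parse_seed_name(name):
    """Split a seed sequence name of the form 'accession/start-end'."""
    accession, _, coords = name.partition("/")
    start, end = (int(x) for x in coords.split("-"))
    return accession, start, end


def metadata_source(accession):
    """Return the archive to query and, for RNAcentral ids, a candidate API URL."""
    match = URS_PATTERN.match(accession)
    if match:
        urs, taxid = match.groups()
        # Assumed URL layout for the RNAcentral REST API.
        return "RNAcentral", f"https://rnacentral.org/api/v1/rna/{urs}/{taxid}"
    # Anything else is treated as an INSDC (ENA or GenBank) accession.
    return "INSDC", None


for name in ["URS0000D661D6_12908/1-82", "AB012345.1/10-95"]:
    accession, start, end = parse_seed_name(name)
    print(accession, start, end, metadata_source(accession))
```

No network request is made here; the function only decides where a lookup would have to go.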

11 more families with 3D structure

There are 11 additional Rfam families that match 3D structures, bringing the total number of families with experimentally determined structures to 98 (compared with 87 in Rfam 14.0).

Rfam family | PDB structures
RF00009 (RNaseP_nuc) | 6agb and 6ah3 (yeast), 6ahr and 6ahu (human) [chains A]
RF00025 (Telomerase-cil) | 6d6v (chain B)
RF00027 (let-7) | 5zal (chain C), 5zam (chain C)
RF00080 (yybP-ykoY) | 6cc1 (chains A and B), 6cc3 (chains A and B)
RF00233 (Tymo_tRNA-like) | 6mj0 (chains A and B)
RF00250 (mir-TAR) | 6gml (chain P)
RF00390 (UPSK) | 6mj0 (chains A and B)
RF01727 (SAM-SAH) | 6hag (chain A)
RF01826 (SAM_V) | 6fz0 (chain A)
RF02348 (tracrRNA) | 6mcb (chain B), 6mcc (chain B)
RF02553 (YrlA) | 6cu1 (chain A)

Other updates

Two existing families were updated with new seed alignments from ZWD, including RF02440 (ldcC RNA) and RF02840 (Lacto-3 RNA). There is also a new clan DUF805 (CL00115) that includes DUF805 and DUF805b families.


The Rfam team would like to thank Dr Elena Rivas and Dr Zasha Weinberg for the new data, software, and feedback, as well as the organisers and participants of the 2018 Benasque RNA meeting. We would also like to thank BBSRC for funding Rfam between 2015 and 2018.

Get in touch

Follow Rfam on Twitter to find out about new Rfam families and don’t hesitate to raise a GitHub issue or email us if you have any questions.

Rfam 14.0 is out with over 100 new families and an expanded genome collection

August 8, 2018


We are happy to announce that the new release of Rfam, version 14.0, is now available! Rfam 14.0 is built using a set of over 14,000 non-redundant, representative, and complete genomes (~60% more than in Rfam 13.0). It includes 105 new families, a new genome browser hub, and ORCiD integration. Read on to find out more.

What’s new

Data updates

Rfam 14.0 has 60% more genomes than Rfam 13.0

The latest Rfam version comes on the heels of Rfam 13.0, a release that marked the transition to the genome-centric sequence database. In Rfam 13.0, the Rfam sequence database – Rfamseq – was composed of 8,364 non-redundant, representative and complete genomes derived from a genome collection maintained by UniProt. Now with the addition of 6,519 new species, the number of annotated genomes in Rfam 14.0 increased by ~60% to 14,434 genomes.

The majority of the genomes from Rfam 13.0 are also present in Rfam 14.0, although a small number (385, ~4.6%) were removed or replaced. The majority of the new genomes come from Bacteria and Viruses.

Since Rfamseq was updated, this is a major Rfam release (14.0). Expect a minor release (14.1) in Fall 2018 with new RNA families but no changes in Rfamseq.

More genomes, less redundancy

The switch to annotating complete genomes enabled us to resolve data redundancy at the levels of sequence and species. For instance, in Rfam 12.3 the cumulative length of all human sequences was eight times longer than the total length of the human genome assembly hg38 in 13.0 (note how the width of the green line of Rfam 12 narrows in Rfam 13).

Redundancy reduction at the species level relies on UniProt’s reference proteome collection, which is the result of manual curation and computational refinement. It includes species of high interest to the scientific community and well-studied model organisms, carefully selected to represent taxonomic diversity. Rfam uses the same collection of genomes for annotation with existing RNA families and for building new ones.

105 new families

The number of RNA families reached 2,791 with the addition of 105 new families from 8 RNA types. The number of new families per ncRNA type in release 14.0 is shown below:

  • 65   Gene; sRNA;
  • 17   Gene; antisense;
  • 11   Gene; snRNA; snoRNA; HACA-box;
  • 5     Gene; snRNA; snoRNA; CD-box;
  • 4     Cis-reg; thermoregulator;
  • 1     Cis-reg;
  • 1     Cis-reg; leader;
  • 1     Cis-reg; riboswitch;

Browse 105 new families

New 3D structures matching Rfam families

Two more Rfam families that did not match any 3D structures in the past now have experimentally determined 3D structures:

Rfam family | PDB structure
RF00382 DnaX ribosomal frameshifting element | 5UQ7, 5UQ8 – 70S ribosome complex with dnaX mRNA stemloop and E-site tRNA (“in” and “out” conformation)
RF00375 HIV primer binding site | 6B19 – Architecture of HIV-1 reverse transcriptase initiation complex core

Search for Rfam entries in PDBe

Rfam regularly updates the mapping between Rfam families and the experimentally determined 3D structures available in PDB. With PDBe’s Advanced Search release in May 2018, PDBe users can take advantage of these mappings by searching with Rfam family names or accessions. For instance, a search using tRNA accession RF00005 currently retrieves 502 entries.

Another powerful new feature is the interactive 3D visualization of the Rfam domains on PDBe entry pages using LiteMol. This is achieved by highlighting the RNA sequence on the corresponding structure, for example tRNA (RF00005) in structure 4UJD. Additional information can be found in the PDBe blog post.

Increased GO term coverage

Non-coding RNA functional annotation was improved with the addition of 133 GO terms to 81 families since the last release. The GO annotations are propagated to RNAcentral sequences and submitted to the GOA system, as described in GOREF:0000115.

Genome browser hub

The genome-centric sequence database enabled us to generate the genome browser track hub directly from the genome annotations without an additional mapping step. For now we have limited the species listed in the track hub to those supported by UCSC, although that number could grow if we incorporate all genomes with chromosome-level assemblies. Currently there are 14 species including human (hg38), chicken (galGal5), pig (susScr11) and mouse (mm10). Upon request, we will also be happy to provide .bed and .bigBed files for other genomes in our collection, depending on the level of the assembly.
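For readers who want to build similar tracks from their own annotations, converting a hit into a BED line mostly comes down to a coordinate shift. The sketch below is an assumption-laden example, not the code used to produce the Rfam track hub: `hit_to_bed` is a hypothetical function, hits are assumed to use 1-based inclusive coordinates with start greater than end on the reverse strand, and the coordinates and score are invented.

```python
def hit_to_bed(chrom, seq_from, seq_to, name, score):
    """Convert a hit with 1-based inclusive coordinates into a BED6 line."""
    # A reverse-strand hit is assumed to be reported with seq_from > seq_to.
    strand = "+" if seq_from <= seq_to else "-"
    start = min(seq_from, seq_to) - 1  # BED start positions are 0-based
    end = max(seq_from, seq_to)        # BED end positions are exclusive
    # The BED score column holds an integer between 0 and 1000.
    bed_score = max(0, min(1000, int(round(score))))
    return "\t".join([chrom, str(start), str(end), name, str(bed_score), strand])


print(hit_to_bed("chr1", 101, 200, "RF00005", 71.3))
print(hit_to_bed("chr1", 1200, 1101, "RF00005", 71.3))
```

Both example hits are 100 nucleotides long, which in BED terms means `end - start == 100`.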

Explore Rfam annotations in UCSC Genome Browser by clicking on these links:

or configure the track manually by editing the URL:

The track hub can also be attached to Ensembl using these instructions and the following URL:

Get credit for Rfam families using ORCiD

It is now possible for Rfam authors to get credit for their contributions by claiming family accessions directly to their ORCiD profiles. This new feature was enabled by the Claim to Orcid functionality provided by EBI Search. The process includes three simple steps. Users first log in to their ORCiD accounts and use their ORCiD id to search for associated entries. Following the search, one can manually select all or a subset of the listed entries and click on the Claim to ORCID button located at the top of the page. The example provided is of a snoRNA family (RF02725) claimed by the Rfam curator Joanna Argasinska directly to her ORCiD profile.

New Rfam paper

We recently published a new paper in Current Protocols in Bioinformatics covering a broad spectrum of Rfam use cases, including examples that use our website as well as Infernal to annotate nucleotide sequences. There is also a section dedicated to MySQL with tips and tricks on restoring previous versions of the database, along with useful examples of forming complex queries.

Get in touch

Follow our new Twitter account RfamDB to be the first to find out about new Rfam families and don’t hesitate to raise a GitHub issue or email us if you have any questions.

You can also meet the Rfam team in person at a hands-on tutorial at the upcoming ECCB 2018 conference in Athens.

Genome-centric Rfam is finally here!

September 15, 2017

We are pleased to announce the release of Rfam 13.0, the first major update since Rfam 12.0 went live in 2014. In this version we introduce a new genome-centric sequence database composed of non-redundant, representative, and complete genomes, as well as new website features, such as an updated text search.

Find out more about Rfam 13.0 in the NAR paper by Kalvari et al.: Rfam 13.0: shifting to a genome-centric resource for non-coding RNA families.

Rfam 12.3 is out

June 29, 2017


The new Rfam release (version 12.3) features 101 new families, unified search, and updated documentation.

New families

Rfam 12.3 featured families

In this release 101 new families were added to the database, including over a dozen Yersinia pseudotuberculosis RNA thermometers from a recent PNAS paper by Righetti et al. We would like to thank Zasha Weinberg for contributing NiCo riboswitch, Type-P5 Twister, and several RAGATH RNAs (for example, RAGATH-5). You can browse the new families here.

Unified text search

Rfam text search

Over the years Rfam developed many specialised ways of searching and exploring the data, such as Keyword search, Taxonomy search, browsing entries by type, and “Jump To” navigation. While these options work well, they may be confusing for new users, so we set out to unify all search functionality in a single text search.

The new search is available on the Rfam homepage or at the top of any Rfam page and is powered by EBI search. It lets you browse RNA families, clans, and motifs, or explore Rfam by category using facets. For example, one can view families with 3D structures or view all snoRNA families that match human sequences, and the URLs can be bookmarked or shared.

The new search is a full replacement for the old search functionality except for taxonomy, because the new search can find species but not higher-level taxa. For example, one can search for Homo sapiens but not for Mammals. Stay tuned for future updates and use the old Taxonomy search in the meantime. We plan to retire all old search functionality once the new search is fully developed but until then the old and the new searches will coexist.

For more information about the new search, see Rfam documentation. If you have any feedback, please let us know in the comments below, on GitHub, by email, or on Twitter.

New home for Rfam documentation

Rfam help has been migrated to a dedicated documentation hosting platform ReadTheDocs and is now available at

Rfam ReadTheDocs help

The new system offers several advantages:

The source code of the documentation is available on GitHub so if you notice a problem you can let us know by creating an issue or help us fix it by editing the text on GitHub and sending a pull request.

Other updates

  • Clan competition for PDB entries: The 3D structure tab, the public MySQL database, and the FTP archive now show only the lowest E-value match when several RNA families from the same clan match a PDB chain. For example, chain 0 of PDB structure 1S72 (LSU rRNA from the archaeon Haloarcula marismortui) now matches only the Archaeal LSU family instead of all families from the rRNA LSU clan.
  • New 5S rRNA clan CL00113 that includes 5S rRNA and mtPerm-5S families.
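The clan competition rule for PDB entries amounts to keeping one match per chain and clan. The following sketch shows the selection under stated assumptions: `best_match_per_clan` is a hypothetical function, the clan labels are placeholders rather than real accessions, and the E-values are invented.

```python
def best_match_per_clan(matches):
    """Keep the lowest E-value match for each (PDB id, chain, clan) combination.

    matches: iterable of dicts with keys pdb_id, chain, family, clan, evalue.
    """
    best = {}
    for m in matches:
        # A family without a clan competes only with itself.
        group = m["clan"] if m["clan"] is not None else m["family"]
        key = (m["pdb_id"], m["chain"], group)
        if key not in best or m["evalue"] < best[key]["evalue"]:
            best[key] = m
    return list(best.values())


# Invented E-values and placeholder clan labels.
matches = [
    {"pdb_id": "1S72", "chain": "0", "family": "LSU_rRNA_bacteria",
     "clan": "LSU_clan", "evalue": 1e-180},
    {"pdb_id": "1S72", "chain": "0", "family": "LSU_rRNA_archaea",
     "clan": "LSU_clan", "evalue": 1e-250},
    {"pdb_id": "1S72", "chain": "0", "family": "LSU_rRNA_eukarya",
     "clan": "LSU_clan", "evalue": 1e-120},
    {"pdb_id": "1S72", "chain": "9", "family": "5S_rRNA",
     "clan": "5S_clan", "evalue": 1e-20},
]

selected = best_match_per_clan(matches)
print([(m["chain"], m["family"]) for m in selected])
```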

What’s next

This release will be the last “point release” for Rfam 12. In the next few months we will release Rfam 13.0, which will be based on a new sequence database. Previously, Rfam annotated WGS and STD subsets of ENA, which grow very quickly and include many redundant sequences. We will take advantage of reference genomes from the UniProt reference proteome collection, which is a regularly updated, reduced-redundancy set of reference genomes. This will allow us to perform meaningful taxonomic comparisons and explore RNA families by taxonomy without sifting through thousands of versions of the same genome.

Get in touch

As always, we welcome comments and feedback about Rfam, so feel free to get in touch by email or by submitting a new GitHub issue.

Rfam 12.2 is live

January 25, 2017

We are happy to announce a new release of Rfam (version 12.2) which includes 115 new families, introduces R-scape secondary structure visualisations, and restores missing families to multiple Rfam clans.

New families

This release adds 115 new Rfam families bringing the total number of families to 2,588. Notable additions include Pistol, Hatchet, Twister-sister and several other riboswitches contributed by Zasha Weinberg. We are always looking for new RNA families, so please feel free to get in touch with your suggestions.

Testing covariation with R-scape

R-scape is a new method for testing whether covariation analysis supports the presence of a conserved RNA secondary structure. In order to check the quality of Rfam structures, we ran R-scape on all Rfam seed alignments and added R-scape visualisations to the secondary structure galleries. For example, here is the R-scape analysis of the SAM riboswitch:


According to R-scape, the secondary structure from the Rfam seed alignment, shown on the left, has 19 statistically significant basepairs (highlighted in green). R-scape can also use statistically significant basepairs as constraints to predict a new secondary structure that is consistent with the seed alignment. Using this approach, R-scape increased the number of statistically significant basepairs from 19 to 27 while also adding 9 new basepairs that are consistent with the seed alignment (structure on the right). This visualisation gives an idea about the quality of the Rfam structure and indicates that in this case it may need to be updated. To find out more about R-scape have a look at a recent paper by Rivas et al.

Tip: R-scape visualisations are interactive, so you can pan and zoom the structures and get additional information by hovering over nucleotides and basepairs.

R-scape analysis suggests that many existing Rfam secondary structures can be improved (for example, FMN riboswitch or 5S rRNA). In other families secondary structures are not supported by the R-scape covariation analysis (for example, oxyS RNA) which indicates that either their seed alignments need to be expanded or that these RNA families do not have a conserved secondary structure. Lastly, there are also cases where the R-scape structures do not show significant improvement compared to the current secondary structure (for instance, Metazoa SRP).

In future releases we will begin to improve existing Rfam seed alignments by using R-scape in the family building pipeline. In the meantime, Rfam users can get an indication of the quality of the structure using R-scape visualisations.

Recovering lost clan members

Since Rfam 10.0, related Rfam families have been organised into clans. The clans are manually curated and clan membership is checked using automated quality control steps (for example, to make sure that a family cannot belong to more than one clan). However, under certain circumstances these quality control procedures silently removed families from the clans. This bug was introduced in Rfam 11.0, and over time, more than 30 families were dropped from 20 clans, so that some clans did not have any families at all. The problem has now been fixed and proper clan membership has been restored using Rfam releases from the FTP archive. You can explore Rfam clans and let us know if you have any feedback.
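Both checks mentioned above, that a family belongs to at most one clan and that no family silently disappears from a clan between releases, are easy to express as set operations. This is a minimal sketch and not the Rfam quality control code: the function names, clan labels, and family names are all hypothetical.

```python
def families_in_multiple_clans(membership):
    """Return families that are listed in more than one clan.

    membership: dict mapping clan id to a set of family ids.
    """
    seen = {}
    for clan, families in membership.items():
        for family in families:
            seen.setdefault(family, set()).add(clan)
    return {family: clans for family, clans in seen.items() if len(clans) > 1}


def dropped_members(old, new):
    """Return families that were in a clan in an older release but are missing now."""
    lost = {}
    for clan, families in old.items():
        missing = families - new.get(clan, set())
        if missing:
            lost[clan] = missing
    return lost


# Hypothetical clan membership in two releases.
old_release = {"CL_example_1": {"fam_a", "fam_b", "fam_c"}, "CL_example_2": {"fam_d"}}
new_release = {"CL_example_1": {"fam_a"}, "CL_example_2": set()}

print(dropped_members(old_release, new_release))
print(families_in_multiple_clans(new_release))
```

In this toy example the second clan ends up with no families at all, which is the kind of situation described above.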

Other updates

How to access the data

In addition to the Rfam website, you can access the data in the FTP archive and via the API. There is also a public MySQL database introduced in the last release.

What’s next

As well as revisiting Rfam seed alignments, work is underway on the next major Rfam release (13.0) which will be based on a new sequence database built from complete genomes. We plan to make the new data available in late 2017.

Get in touch

We always welcome comments and feedback about Rfam, so feel free to get in touch by email or by submitting a new GitHub issue.