We are pleased to announce the release of Rfam 12.0!
This release contains some major changes compared with previous releases of Rfam, so please take a minute to read our release notes. Rfam 12.0 is the first version of Rfam to be based on Infernal 1.1, and many of the changes follow from that. In particular, the curator-defined thresholds have all been manually altered to ensure compatibility with Infernal 1.1. This means that many families have seen significant increases or decreases in family membership; hopefully the rethresholding, combined with Infernal 1.1, has resulted in better discrimination between true family members and negative matches. We have also added 329 new families, most of which are sRNAs. Important new families include those in the LSU clan: RF02540, RF02541, RF02542 and RF02543. Under the hood, we also have a shiny new production pipeline, which will hopefully lead to more frequent releases in future.
Big data, big changes
Another major change concerns the full alignments. For some of our largest families the full alignments have grown over recent releases to the point where they’ve become too unwieldy for many uses. For example, for one of the largest families, the bacterial small ribosomal subunit (RF00177), the full alignment contains around 3.7 million sequences and occupies several gigabytes of disk space. Serving that file in a timely fashion via the website is simply impossible. Furthermore, where our old pipeline generated full alignments as part of the release process, the new pipeline doesn’t need this step, so full alignments are no longer necessary, which helps streamline our pipeline and data management processes nicely.
After lengthy consideration, we have decided no longer to supply the full alignments through the website. You can, however, still download the sequences and covariance models to generate your own alignment. As always, feel free to contact us if you need help accessing our data.
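As a rough illustration of generating your own alignment, the sketch below builds a command line for cmalign, the alignment program in the Infernal suite. It assumes Infernal 1.1 is installed and that you have already downloaded a family's covariance model and sequences; the file names (RF00177.cm, RF00177.fa, RF00177.sto) and the helper function are hypothetical placeholders, not names used by Rfam.

```python
import shlex


def cmalign_command(cm_path, fasta_path, out_path):
    """Return the argument list for aligning sequences to a covariance model.

    All three paths are placeholders for files you have downloaded or
    want to create; nothing here is specific to the Rfam website.
    """
    # cmalign prints the alignment to stdout by default;
    # -o writes it to the named file instead.
    return ["cmalign", "-o", out_path, cm_path, fasta_path]


if __name__ == "__main__":
    cmd = cmalign_command("RF00177.cm", "RF00177.fa", "RF00177.sto")
    # Print the command so it can be inspected or pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually run it, the list can be passed to subprocess.run(cmd, check=True). For the largest families, expect the resulting alignment to be as large as the full alignments described above.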
Retirement of rfam_scan.pl
We are also announcing the retirement of rfam_scan.pl. The search functionality that rfam_scan.pl provided has now been implemented in cmscan, which is provided as part of the Infernal suite of software. You can download the curated covariance models for each family from the family-specific webpage, or all of the models at once from our FTP site. The rfam_scan.pl script implemented the BLAST pre-filters which were previously used to reduce the size of the search database prior to searching with the covariance models; these have been replaced by internal HMM filters in Infernal 1.1.
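For anyone moving from rfam_scan.pl, here is a minimal sketch of what an equivalent cmscan search might look like. It assumes Infernal 1.1 is installed and that the covariance model file has been indexed with cmpress beforehand; the file names (Rfam.cm, genome.fa, hits.tbl) and the helper functions are hypothetical, and the choice of options is one reasonable setup rather than an official recipe.

```python
import shlex


def cmpress_command(cm_db):
    """Return the argument list that indexes a CM file so cmscan can use it."""
    return ["cmpress", cm_db]


def cmscan_command(cm_db, fasta_path, tblout_path):
    """Return the argument list for searching sequences against a CM database."""
    return [
        "cmscan",
        "--cut_ga",               # use each model's gathering (GA) threshold
        "--rfam",                 # use the faster filter settings intended for large searches
        "--nohmmonly",            # always use the covariance model, never HMM-only mode
        "--tblout", tblout_path,  # write a tabular summary of hits to this file
        cm_db,
        fasta_path,
    ]


if __name__ == "__main__":
    for cmd in (cmpress_command("Rfam.cm"),
                cmscan_command("Rfam.cm", "genome.fa", "hits.tbl")):
        # Print each command so it can be inspected or pasted into a shell.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The --cut_ga option is what ties the search to the curator-defined thresholds mentioned above, so hits reported this way should correspond to family membership as defined in this release.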
As always, we’d appreciate your feedback!
Posted by Sarah.