Posts Tagged ‘rfam’

Rfam Release 14.9

November 15, 2022

We are happy to announce the release of Rfam 14.9. This release features 14 new miRNA families, 23 updated miRNA families, 10 families improved with their first 3D structure, 10 families updated with additional 3D structures, and comprehensive improvements using R-scape. Read on for more details.

Updated and new miRNA families

In this release, we have updated 23 miRNA families in Rfam and created 14 new families based on miRBase miRNA alignments. We estimate that this project is 80 percent finished. The remaining families are undergoing extensive curation.

New miRNA Families:

Rfam ID | Family
RF04223 | MIR2619
RF04224 | mir-9229
RF04225 | MIR7502
RF04226 | mir-9186
RF04227 | mir-9215
RF04228 | MIR6140
RF04229 | mir-9279
RF04230 | mir-9261
RF04231 | mir-9191
RF04232 | mir-9318
RF04233 | mir-680
RF04234 | mir-1421
RF04235 | mir-242_2
RF04236 | mir-1285
RF04236mir-1285

Updated miRNA Families:

Rfam ID | Family
RF00241 | mir-8/mir-141/mir-200
RF00456 | mir-34
RF00661 | mir-31
RF00672 | mir-190
RF00700 | mir-375
RF00702 | mir-182
RF00706 | mir-263
RF00713 | mir-239
RF00716 | mir-3
RF00717 | mir-315
RF00726 | mir-87
RF00727 | microRNA bantam
RF00728 | mir-81
RF00762 | mir-412
RF00837 | mir-251
RF00844 | mir-67
RF00848 | mir-61
RF00948 | mir-996
RF01045 | mir-544
RF01413 | miR-430
RF01924 | mir-2774
RF04088 | MIR812
RF04195 | MIR6217

3D families

We continue to review and update Rfam families with available 3D information. In release 14.9, 10 families received their first 3D structures and 10 families gained additional 3D structures. We have also added 9 pseudoknots to the 10 families with new 3D structures.

Families with new 3D structures added:

Rfam ID | Family | New
RF00037 | Iron response element I | 3SNP_C, 3SNP_D
RF00522 | PreQ1 riboswitch | 2L1V_A
RF01073 | Gag/pol translational readthrough site | 2LC8_A
RF01727 | SAM/SAH riboswitch | 6HAG_A
RF02253 | Iron response element II | 3SN2_B
RF02519 | ToxI antitoxin | 4ATO_G
RF02553 | Y RNA-like | 6CU1_A
RF02796 | Pab160 RNA | 3HJW_D, 3LWO_D, 3LWP_D, 3LWQ_D, 3LWR_D, 3LWV_D
RF03054 | Xanthine riboswitch/NMT1 RNA | 7ELP_A, 7ELP_B, 7ELQ_A, 7ELQ_A, 7ELR_A, 7ELR_B, 7ELS_A, 7ELS_B
RF04222 | Potato leafroll virus exoribonuclease-resistant RNA | 7JJU_A, 7JJU_B

Families with additional 3D structures:

Rfam ID | Family | Update
RF00015 | U4 spliceosomal RNA | 5GAP_V
RF00025 | Ciliate telomerase RNA | 7LMA_B, 7LMB_B
RF00050 | FMN riboswitch (RFN element) | 6WJR_X, 6WJS_X
RF00059 | TPP riboswitch (THI element) | 7TD7_A, 7TDA_A, 7TDB_A, 7TDC_A, 7TZR_X, 7TZR_Y, 7TZS_X, 7TZS_Y, 7TZT_A, 7TZU_A
RF00162 | SAM riboswitch (S box leader) | 7EAF_A
RF00174 | Cobalamin riboswitch | 6VMY_A
RF00442 | Guanidine-I riboswitch | 5U3G_B, 7MLW_F
RF01763 | ykkC-III Guanidine-III riboswitch | 5NWQ_B, 5NY8_A, 5NY8_B, 5NZ3_A, 5NZ3_B, 5NZD_A, 5NZD_B, 5O62_A, 5O62_B, 5O69_A, 5O69_B
RF01831 | THF riboswitch | 6Q57_A, 7KD1_A
RF02680 | PreQ1-III riboswitch | 6XKN_A, 6XKO_A
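
The structure identifiers in the tables above appear to follow a PDBID_CHAIN pattern (for example, 7LMA_B is chain B of PDB entry 7LMA). If you want to count distinct PDB entries per family rather than chains, a minimal Python sketch could look like the following. The function name group_chains is our own choice for illustration and is not part of any Rfam tool.

```python
from collections import defaultdict
from typing import Dict, List


def group_chains(chain_ids: str) -> Dict[str, List[str]]:
    """Group comma-separated 'PDBID_CHAIN' identifiers by PDB entry.

    Assumes every identifier has the form '<PDB id>_<chain>', as in the
    tables above.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item in chain_ids.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas or trailing whitespace
        pdb_id, _, chain = item.partition("_")
        grouped[pdb_id].append(chain)
    return dict(grouped)


# Guanidine-I riboswitch (RF00442) row from the table above
print(group_chains("5U3G_B, 7MLW_F"))
```

Counting the keys of the returned dictionary gives the number of PDB entries, while the list lengths give the number of chains per entry.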

We have also created rfam.org/3d which contains a table of all families with 3D structures and a link to download the seed alignments for those families. Additionally, there is a new file on our FTP site Rfam.3d.seed.gz which contains all seed alignments for these families. The page and file will be updated each release. Please reach out if you have any suggestions for improvements!
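
Once downloaded, a file like Rfam.3d.seed.gz can be inspected with a few lines of Python. The sketch below assumes the file is a gzipped concatenation of Stockholm alignments, as Rfam seed files generally are, with "#=GF AC" and "#=GF ID" annotation lines and "//" closing each alignment. The function names and the local file path are hypothetical.

```python
import gzip
from typing import Iterable, Iterator, List, Optional, Tuple


def iter_families(lines: Iterable[str]) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (accession, identifier) for each alignment in Stockholm text."""
    accession: Optional[str] = None
    identifier: Optional[str] = None
    for line in lines:
        if line.startswith("#=GF AC"):
            accession = line.split()[2]
        elif line.startswith("#=GF ID"):
            identifier = line.split()[2]
        elif line.startswith("//"):
            # '//' terminates one alignment in Stockholm format
            if accession is not None:
                yield accession, identifier
            accession = identifier = None


def families_in_gzip(path: str) -> List[Tuple[str, Optional[str]]]:
    """List the families in a gzipped seed file (path is an assumption)."""
    # errors="replace" guards against non-UTF-8 bytes in free-text fields
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return list(iter_families(handle))
```

For example, families_in_gzip("Rfam.3d.seed.gz") would return one tuple per family with a 3D structure, assuming the file has been saved in the working directory.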

Families updated with R-scape model

We worked with Elena Rivas to analyse all Rfam families with R-scape. We then updated models where R-scape was able to suggest a better alignment. This led to 26 families with improvements, listed below by number of additional covarying base pairs.

Additional covarying base pairs | Rfam ID | Family
24 | RF02033 | HEARO
14 | RF03065 | IS605-orfB-I
8 | RF03068 | RT-3
5 | RF03072 | raiA
4 | RF02969 | DUF3800-I
3 | RF01688 | Actino-pnp
3 | RF02004 | group-II-D1D4-5
3 | RF02005 | group-II-D1D4-6
3 | RF02913 | pemK
3 | RF03077 | RT-2
3 | RF03135 | L4-Archaeoglobi
3 | RF03144 | eL15-Euryarchaeota
2 | RF00062 | HgcC
2 | RF01731 | TwoAYGGAY
2 | RF01794 | sok
2 | RF02221 | sRNA-Xcc1
2 | RF02947 | cow-rumen-2
2 | RF03000 | LOOT
2 | RF03158 | L31-Actinobacteria
1 | RF01864 | plasmodium_snoR21
1 | RF01867 | CC2171
1 | RF02944 | c4-2
1 | RF02968 | DUF3800-IX
1 | RF02987 | GA-cis
1 | RF03019 | RT-16
1 | RF03046 | Pseudomonadales-1

Other updates

We have also updated 3 other families. We updated the secondary structure of the Sarbecovirus 5’ UTR (RF03120) to reflect the pairing reported in the paper “Correlated sequence signatures are present within the genomic 5′UTR RNA and NSP1 protein in coronaviruses”. Specifically, we modified the consensus secondary structure of stem loop 1 of the 5’ untranslated region to match the secondary structure in that paper. Additionally, we renamed the RF03054 and RF03071 families, which were first reported by Zasha Weinberg in a comparative analysis of intergenic regions in bacteria.

The Xanthine riboswitch (RF03054) was first reported in Proteobacteria as NMT1 non-coding RNA (ncRNA). It was later reported to be an ncRNA that recognises xanthine, and its structure was reported in PDB entries 7ELP, 7ELQ, 7ELR and 7ELS, which have been added as part of the seed alignment.

The Na+ riboswitch (RF03071) was first reported by Zasha Weinberg and called DUF1646 RNA. More recently, Neil White from the Breaker group identified it as a riboswitch that selectively senses Na+ and regulates the expression of genes related to sodium biology.

Migrating Rfam’s public SVN repository

Rfam and Pfam used to provide a public copy of their SVN repositories on xfamsvn.ac.uk. With the recent deprecation of Pfam’s website and its inclusion as part of InterPro, we have decided to move the Rfam SVN repository to http://svn.rfam.org. Users interested in a nightly updated version of Rfam can browse the repository at this new location.

Rfam release 14.8

May 30, 2022

We are happy to announce the release of Rfam version 14.8. This release includes 48 updated and 25 new microRNA families; 10 families updated based on 3D structure annotations; 4 new families and updates to 5 existing families for Hepatitis C virus; a new xrRNA family from Potato leafroll virus; and the integration of LitScan, a literature scanner powered by RNAcentral and Europe PubMed Central. Read on for details on these changes.

Updated microRNA families

Rfam, miRBase and RNAcentral have been working to synchronize miRNA families between all three resources. We are now happy to report that we have completed 77% of the current miRNA families, covering more than 1,300 of the 1,700 miRNA alignments provided to us by Professor Sam Griffiths-Jones at miRBase. The 400 remaining families need an extended review process, and we will be working on their integration in future releases.

Summary of the miRBase and Rfam synchronisation project. We estimate it is 78% completed.

Families updated with 3D structure information

Rfam has been updating families using 3D structure information. This project aims to improve Rfam families through the addition of pseudoknots, base pairs, and annotations of other structural elements by inspecting 3D structures. In this release we have updated 10 families:

  • Virus families:
    • RF00507-Coronavirus frameshifting stimulation element
    • RF01047-HBV RNA encapsidation signal epsilon
  • Riboswitch families:
    • RF01763-Guanidine-III riboswitch, also known as ykkC-III riboswitch
    • RF01734-Fluoride riboswitch
    • RF01704-Glutamine-II riboswitch, previously known as Downstream peptide
    • RF01750-ZMP/ZTP riboswitch
    • RF01739-Glutamine riboswitch
    • RF02683-NiCo riboswitch
    • RF01826-SAM-V riboswitch
  • Ribozyme family:

We have added pseudoknots to 7 of the 10 updated families and the updated secondary structure diagrams from 5 of these families are shown below.

Examples of families reviewed and updated with 3D information. Pseudoknot structures (pk) were added to each of these five families based on a review of the corresponding 3D structures.

New and updated families of Hepatitis C virus

In release 14.8 we have created 4 new families, updated 5 existing families, and deleted 4 virus families. These changes are the result of our ongoing collaboration with Professor Manja Marz of the European Virus Bioinformatics Center. The Marz group provided Rfam with a curated alignment of representative sequences covering the entire Hepatitis C virus genome. We used this alignment to create new Rfam families and to update or remove existing ones. The updated and new families are summarized in the table below. We have deleted RF00469, which was merged into RF00260 during review. We have also deleted families RF02585 to RF02588, which have no support in the genomic alignment.

Rfam ID | Name | Description
RF00061 | IRES_HCV | Hepatitis C virus internal ribosome entry site
RF00260 | HepC_CRE | Hepatitis C virus (HCV) cis-acting replication element (CRE)
RF00620 | HCV_ARF_SL | Hepatitis C alternative reading frame stem-loop
RF00468 | HCV_SLVII | Hepatitis C virus stem-loop VII
RF00481 | HCV_X3 | Hepatitis C virus 3’X element
RF04218 | HCV_5BSL1 | Hepatitis C virus stem-loop I
RF04219 | HCV_J750 | J750 non-coding RNA (containing SL761 and SL783)
RF04220 | HCV_SL588 | SL588 non-coding RNA
RF04221 | HCV_SL669 | SL669 non-coding RNA

As part of this project we have reviewed and updated Coronavirus, Flavivirus and HCV families, and we are working on adding RNA families from other viruses, such as Filoviridae (e.g. Ebolavirus) and Rhabdoviridae (e.g. Rabies viruses).

xRNAs in Potato virus

We want to thank Professor Quentin Vicens for sharing the alignment of Potato leafroll virus exoribonuclease-resistant RNA (PLRV-xrRNA). PLRV-xrRNA is a non-coding RNA that blocks the progression of 5′ to 3′ exoribonuclease using only a folded RNA element, and this family is described in RF04222.

LitScan

RNAcentral has recently developed LitScan, a tool to automatically connect non-coding RNA sequences, genes and families to the literature that discusses them. In this release we have integrated the LitScan widget into Rfam. The widget is now shown in the new ‘Publications’ tab on all Rfam families.

Example of LitScan for the mir-17 microRNA precursor family. Publications can be sorted by citation, journal, year of publication and other criteria.

Please reach out to us with feedback on the widget, or if you would like to use the LitScan widget on your site!

Rfam release 14.7

December 21, 2021

We are happy to announce the latest Rfam release, version 14.7. The release includes 121 updated microRNA families, 4 new families, and a redesigned Rfam-PDB mapping pipeline that provides weekly updates as new RNA 3D structures become available. Read on to find out more or explore the data in Rfam.

Updated microRNA families

As part of the Rfam-miRBase synchronisation project discussed in the Rfam 14 paper, we continue revising microRNA families in Rfam using the data provided by miRBase. This release includes 121 updated families, such as mir-6 (RF00143) and mir-22 (RF00653). The following five microRNA families have been deleted from Rfam as the corresponding entries were removed from miRBase due to lack of evidence: mir-1937 (RF01942), mir-1280 (RF02013), mir-353 (RF00800), mir-720 (RF02002), and mir-2973 (RF02096). We would like to thank Lisanne Knol (University of Edinburgh) for bringing the first two of these families to our attention.

We estimate that Rfam is now approximately 60% in sync with miRBase, with additional families to be released in future versions of Rfam. You can view the full list of updated families here or browse all microRNAs in Rfam.

New families

The release includes two new hairpin ribozyme families, Hairpin-meta1 (RF04190) and Hairpin-meta2 (RF04191), recently reported by the Weinberg lab. The hairpin ribozymes were discovered in metatranscriptome data and are proposed to occur in circular RNA genomes of as yet uncharacterised organisms. The new families join the original Hairpin ribozyme family (RF00173).

Based on a recent paper we also created two additional bacterial families, the icd-II ncRNA motif (RF04189) and the carA ncRNA motif (RF04192). We would like to thank Ken Brewer (Yale University) for providing the data.

Weekly updates of PDB structures matching Rfam families

For many years Rfam maintained a mapping between Rfam families and experimentally determined RNA 3D structures available in PDB. However, this mapping lagged behind the weekly PDB updates as it was only updated with Rfam releases. 

The newly implemented pipeline analyses the data every week and makes the data available on the Rfam website and in a new section of the FTP archive that contains a preview of the upcoming release. The new, up-to-date mapping also improves the ability to search PDBe using Rfam and is a key part of an ongoing project to review all Rfam families with known 3D structures. 

Currently there are 127 Rfam families with experimentally determined RNA 3D structures in the PDB. For example, a recent paper describing how a viral RNA hijacks host machinery produced several structural models (7SAM, 7SC6, and 7SCQ) showing the pseudoknot of the tRNA-like structure; these are now mapped to Rfam family RF01084. Follow Rfam on Twitter to be the first to hear when new RNA families are linked to 3D structures.

Other improvements

  • We continue improving the Gene Ontology (GO) terms associated with Rfam families, and in this release 412 families have been updated to use the latest GO terms. Keeping the GO terms up to date is important as Rfam is used for automatic assignment of GO terms in RNAcentral and other resources. The GO terms are shown in the Curation tab of each family and are also available in the rfam2go file.
  • The Rfam.seed_tree.tar.gz file hosted on the FTP archive has been fixed. We would like to thank Christian Anthon (University of Copenhagen) for reporting the problem.
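
If you want to use the rfam2go file programmatically, a small parser may be all you need. The Python sketch below assumes the file follows the common external2go layout, with comment lines starting with "!" and mapping lines such as "Rfam:RF00001 5S_rRNA > GO:structural constituent of ribosome ; GO:0003735". This layout is our assumption, so please check it against the actual file before relying on the code. The function name parse_rfam2go is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed line layout: Rfam:<accession> <family id> > GO:<term name> ; GO:<7 digits>
LINE_RE = re.compile(
    r"^Rfam:(RF\d{5})\s+(\S+)\s+>\s+GO:(.+?)\s*;\s*(GO:\d{7})\s*$"
)


def parse_rfam2go(lines: Iterable[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each Rfam accession to a list of (GO id, GO term name) pairs."""
    mapping: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header comments and blank lines
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines that do not fit the assumed layout
        accession, _family_id, term_name, go_id = match.groups()
        mapping[accession].append((go_id, term_name))
    return dict(mapping)
```

Passing an open file handle to parse_rfam2go works as well as passing a list of lines, since the function only iterates over its input.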

Get in touch

As always, we look forward to hearing from you if you have any feedback or suggestions for Rfam. Please feel free to email us or get in touch on Twitter.

Happy holidays from the Rfam team!

This is the first release produced by Emma Cooke and Blake Sweeney who have joined Rfam in the second half of 2021. The Rfam team wishes you a happy holiday season! We look forward to creating lots more families in 2022 and working towards the next major Rfam release, Rfam 15.0!

Welcome Blake as the new RNA Resources Project Leader

September 13, 2021

RNAcentral and Rfam recently completed the search for a project leader to succeed Anton Petrov. Blake Sweeney has been appointed and recently started in his new role. Some of you may know Blake as the current RNAcentral bioinformatician, a role in which he has been running the bioinformatic pipeline and speaking at conferences. With a PhD in RNA bioinformatics and a decade of experience developing RNA databases, including 4.5 years at RNAcentral, Blake is perfectly positioned to take the projects forward. The official handover date is in May, but until then Anton and Blake will be working together to ensure a smooth transition.


Additionally, we are now hiring a new RNAcentral bioinformatician. If you are interested in applying please see: https://bit.ly/38L86Xe. If you have any questions or comments about the RNA resources please contact Blake Sweeney at bsweeney@ebi.ac.uk.

Rfam 14.6 is out

July 27, 2021

We are happy to announce a new release of Rfam (14.6) that includes 121 new microRNA families, a new ribozyme family, 8 new small RNA families found in Bacteroides, as well as 10 additional families with updated secondary structures using 3D structural information. Read on for more information or explore the data in Rfam.

New microRNA families

The new release includes 121 new microRNA families bringing the total number of microRNA families in Rfam to 1,506. This work is part of the ongoing collaboration with miRBase that aims to synchronise microRNAs across miRBase, Rfam, and RNAcentral. Browse Rfam microRNAs or find out more about the microRNA project.

We also resolved an issue with 6 microRNA families that were missing a covariance model on the website and in the FTP archive. Many thanks to Dr Christian Anthon (University of Copenhagen) for pointing out this problem!

Updating families using information from 3D structures

Following on from Rfam 14.5, we updated the secondary structure of 10 additional families with 3D information, including 6 riboswitches, 1 ribozyme, 1 telomerase, 1 localization element and 1 microRNA precursor.

In some families, the updated structure is substantially changed. For example, the central part of the flavin mononucleotide (FMN) riboswitch is now organised by several additional base pairs and two pseudoknots (pK). As a result, the updated structure is more compact and more accurately reflects the experimentally determined 3D structures.

Seven of the updated families include newly annotated pseudoknots, which is an important improvement that helps better model long-distance non-nested interactions. We will continue reviewing and updating the families with 3D structure in future releases. The full list of the updated families can be found in the table below.

Family | PDB structures | New pK
RF00008 – Hammerhead ribozyme (type III) | 2QUS_A, 2QUS_B, 2QUW_A, 2QUW_B, 2QUW_C, 2QUW_D, 5DI2_A, 5DI4_A, 5DQK_A, 5EAO_A, 5EAQ_A | 1
RF00025 – Ciliate telomerase RNA | 6D6V | 2
RF00050 – FMN riboswitch (RFN element) | 3F2Q, 3F2T, 3F2W, 3F2X, 3F2Y, 3F3O | 2
RF00059 – TPP riboswitch (THI element) | 2CKY_A, 2CKY_B, 2GDI_X, 2GDI_Y, 2HOJ_A, 2HOK_A, 2HOL_A, 2HOM_A, 2HOO_A, 2HOP_A, 3D2G_A, 3D2G_B, 3D2V_A, 3D2V_B, 3D2X_A, 3D2X_B, 3K0J_E, 3K0J_F, 4NYA_A, 4NYA_B, 4NYB_A, 4NYC_A, 4NYD_A, 4NYG_A | -
RF00207 – K10 transport/localisation element (TLS) | 2KE6, 2KUR, 2KUU, 2KU, 2KUW | -
RF00174 – Cobalamin riboswitch | 4GMA, 4GXY | 1
RF00380 – ykoK leader / M-box riboswitch | 2QBZ_X, 3PDR_X, 3PDR_A | 1
RF01689 – AdoCbl variant RNA | 4FRN_A, 4FRN_B, 4FRG_X, 4FRG_B | 1
RF01831 – THF riboswitch | 3SD3, 3SUH, 3SUX, 3SUY, 4LVV, 4LVW, 4LVX, 4LVY, 4LVZ, 4LW0 | 1
RF02095 – mir-2985-2 microRNA precursor | 2L3J | -
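
To make the idea of a non-nested interaction concrete: two base pairs (i, j) and (k, l) cross, and therefore cannot both be written with a single level of nested brackets, when i < k < j < l. The Python sketch below finds such crossing pairs in a list of base pairs. It illustrates the definition only and is not the procedure used to curate the families. The function name crossing_pairs and the example positions are made up.

```python
from typing import List, Tuple

Pair = Tuple[int, int]


def crossing_pairs(pairs: List[Pair]) -> List[Tuple[Pair, Pair]]:
    """Return every pair of base pairs (i, j), (k, l) with i < k < j < l."""
    # Normalise so that the 5' position comes first, then sort by it
    normalised = sorted((min(i, j), max(i, j)) for i, j in pairs)
    crossings = []
    for a, (i, j) in enumerate(normalised):
        for k, l in normalised[a + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


# A plain hairpin is fully nested, so nothing crosses
hairpin = [(1, 10), (2, 9), (3, 8)]
# Adding a helix that pairs the hairpin loop with a downstream region
# creates crossing pairs, i.e. a pseudoknot
with_pseudoknot = hairpin + [(5, 15), (6, 14)]

print(crossing_pairs(hairpin))
print(len(crossing_pairs(with_pseudoknot)))
```

The quadratic loop is fine for the handful of base pairs in a typical family. For very long RNAs a sweep-based approach would scale better.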

Hovlinc ribozyme

A recent paper by Chen Y et al. 2021 describes Hovlinc, a new type of self-cleaving ribozyme found in humans and other hominids. Hovlinc was detected in a very long intergenic noncoding RNA in hominids (hominin vlincRNA-located) using a genome-wide approach designed to discover self-cleaving ribozymes. The functions of vlincRNA and the hovlinc ribozyme remain unclear. Hovlinc joins 3 known classes of small, self-cleaving ribozymes found in humans: (1) Mammalian CPEB3 ribozyme, (2) Hammerhead ribozyme and (3) B2 and ALU retrotransposons. We would like to thank Dr Fei Qi (Huaqiao University) for providing the Hovlinc alignment. View hovlinc family in Rfam.

New Bacteroides families

In a recent article by Ryan et al. 2020, the authors report a high-resolution transcriptome map of the model organism Bacteroides thetaiotaomicron, a common bacterium of the human gut. They recognize 269 non-coding RNA (ncRNA) candidates, of which nine were validated. Eight of these ncRNAs were integrated as new families:

  1. RF04177 – Bacteroides sRNA BTnc201
  2. RF04178 – Bacteroides sRNA BTnc005
  3. RF04179 – Bacteroides sRNA BTnc049
  4. RF04180 – Bacteroides sRNA BTnc231
  5. RF04181 – rteR sRNA
  6. RF04182 – GibS sRNA
  7. RF04183 – Bacteroidales small SRP
  8. RF04184 – Bacteroides sRNA BTnc060

In addition, the RF01693 – Bacteroidales-1 family was renamed to 6S-Bacteroidales RNA. Bacteroidales-1 was first reported by Weinberg et al. 2010 in a comparative genomics-based analysis of genome and metagenome sequences. It was identified downstream of L20 ribosomal subunit genes in the order Bacteroidales. Ryan et al. 2020 report that this sRNA is a 6S RNA homolog in Bacteroides thetaiotaomicron. Rfam also has two other 6S RNA families: RF00013 – 6S/SsrS RNA and RF01685 – 6S-Flavo. We would like to thank Dr Lars Barquist (University of Würzburg) for providing the data.

Welcome to Emma!

A few weeks ago Emma Cooke joined the Rfam team as Software Developer and is already busy working on new features. Emma has studied Genetics and Software Engineering, and her previous roles have focused on release verification pipelines, software testing, and developing for cloud environments. Please join us in welcoming Emma to the Rfam community and stay tuned for new announcements based on her work.

Get in touch

As always, we would be very happy to hear from you if you have any feedback or suggestions for Rfam. Please feel free to email us or get in touch on Twitter.

Join Rfam team!

March 19, 2021

We are looking for a Software Developer to join the Rfam team and contribute to the world’s largest database of RNA families. The post holder will be responsible for keeping Rfam up-to-date, developing Rfam Cloud, and improving the website. More information about the position can be found at https://bit.ly/rfam-software-developer

Apply now or help spread the word. Closing date: April 20th, 2021.

Rfam 14.5 is live

March 18, 2021

We are happy to announce a new Rfam release, version 14.5, featuring 112 updated microRNA families and 10 families improved using the 3D structure information. Read on for details or explore 3,940 RNA families at rfam.org.

Updated microRNA families

As described in our most recent paper, we are in the process of synchronising microRNA families between Rfam and miRBase. In this release 112 of the existing microRNA families have been updated with new manually curated seed alignments from miRBase, new gathering thresholds, and new family members found in the Rfamseq sequence database. 

In total, 852 new microRNA families have been created (356 in release 14.3 and 496 in release 14.4) and 152 existing families have been updated (40 in release 14.3 and 112 in release 14.5). As the miRBase-Rfam synchronisation is about 50% complete, additional microRNA families will be made available in the upcoming releases. You can view a list of the 112 updated families or browse all 1,385 microRNA families on the Rfam website. 

Updating families using information from 3D structures

We are also in the process of reviewing the families with the experimentally determined 3D structures in order to compare the Rfam annotations with the 3D models. Our goal is to incorporate the 3D information into Rfam seed alignments as many families have been created before the corresponding 3D structures became available. We manually review each PDB structure, verify basepair annotations from matching PDBs, and obtain a more consistent consensus secondary structure model. 

In multiple cases we were able to add missing base pairs and pseudoknots. For example, in the SAM riboswitch (RF00162), we added two base pairs in the base of helix P2, corrected a basepair in P3 and added four basepairs in P4 (one in the base of the helix and three near the terminal loop). The updated consensus secondary structure presents a more accurate central core annotation with more structure in the four-way junction.

SAM riboswitch secondary structure before and after the updates

In another example, one base pair was added in P1 and another one in P3 of the SAM-I/IV variant riboswitch (RF01725). We also corrected a base pair in P3 and included a P4 stem loop that was not integrated before.

SAM-I/IV riboswitch secondary structure before and after the updates

The SAM-I/IV riboswitch has a SAM binding core conformation similar to that of the SAM riboswitch, but it lacks the k-turn motif in P2 that is found in SAM riboswitches. These two families also have different pseudoknot interactions: the SAM riboswitch forms a pseudoknot between a P2 loop and the stem of P3, while the SAM-I/IV riboswitch contains a pseudoknot between a P3 loop and the 5′ region.

The first 10 families updated with 3D information include:

  1. RF00162 – SAM riboswitch
  2. RF01725 – SAM-I/IV variant riboswitch
  3. RF00164 – Coronavirus 3’ stem-loop II-like motif (s2m)
  4. RF00013 – 6S / SsrS RNA
  5. RF00003 – U1 spliceosomal RNA
  6. RF00015 – U4 spliceosomal RNA
  7. RF00442 – Guanidine-I riboswitch
  8. RF00027 – let-7 microRNA precursor
  9. RF01054 – preQ1-II (pre queuosine) riboswitch
  10. RF02680 – preQ1-III riboswitch

We will continue reviewing the families with known 3D structure in future releases.

Other family updates

Initially reported by Aspegren et al. 2004, Class I (RF01414) and Class II (RF01571) RNAs were found in social amoeba Dictyostelium discoideum and later on investigated in more detail by Avesson et al. 2011. Now a new report from Kjellin et al. 2021 presents a comprehensive analysis of the Class I RNA genes in dictyostelid social amoebas. Based on this study, we updated the Dicty Class I RNA family RF01414 with a new seed alignment and removed the family RF01571, thus merging both families into one. We thank Dr Jonas Kjellin (Uppsala University) for suggesting this update.

Goodbye Ioanna!

Rfam 14.5 is the last release prepared by Dr Ioanna Kalvari who will be leaving the team at the end of March 2021. We would like to take the opportunity to thank Ioanna for her contributions over the last 5.5 years and wish her best of luck in the future!

Get in touch

As always, we would be very happy to hear from you if you have any feedback or suggestions for Rfam. Please feel free to email us or get in touch on Twitter

Rfam 14.4 is live

December 18, 2020

The last Rfam release of 2020 is now live! Rfam 14.4 contains 496 new microRNA families developed in collaboration with miRBase. Find out about the microRNA project in our new NAR paper and let us know if you have any feedback.

Rfam 14.3

September 15, 2020

Rfam 14.3 includes 356 new and 40 updated microRNA families, as well as 12 new and 2 updated Flavivirus RNAs. Find out the details in our new NAR paper and get in touch if you have any feedback.

Rfam Coronavirus Special Release

April 27, 2020

In response to the SARS-CoV-2 outbreak, the Rfam team prepared a special release dedicated to the Coronavirus RNA families. Release 14.2 includes 10 new and 4 revised families that can be used to annotate SARS-CoV-2 and other Coronavirus genomes with RNA families.

View the data at rfam.org/covid-19 ➡️

New Coronavirus Rfam families

In collaboration with the Marz group and the EVBC, we created 10 families representing the entire 5’- and 3’- untranslated regions (UTRs) for Alpha-, Beta-, Gamma-, and Delta- coronaviruses. A specialised set of alignments for the subgenus Sarbecovirus is also provided, including the SARS-CoV-1 and SARS-CoV-2 UTRs. 

The families are based on a set of high-quality whole genome alignments produced with LocARNA and reviewed by expert virologists. Note that the Alpha-, Beta-, and Deltacoronavirus alignments and structures were refined based on the literature, while the Gammacoronavirus families are based on prediction alone due to the lack of experimental data.


Virus | 5’ UTR | 3’ UTR
Alphacoronavirus | aCoV-5UTR (RF03116) | aCoV-3UTR (RF03121)
Betacoronavirus | bCoV-5UTR (RF03117) | bCoV-3UTR (RF03122)
Sarbecovirus and SARS-CoV-2 | Sarbecovirus-5UTR (RF03120) | Sarbecovirus-3UTR (RF03125)
Gammacoronavirus | gCoV-5UTR (RF03118) | gCoV-3UTR (RF03123)
Deltacoronavirus | dCoV-5UTR (RF03119) | dCoV-3UTR (RF03124)

Previously, only fragments of the UTRs were found in Rfam. In particular, two families were superseded by the new whole-UTR alignments and removed from Rfam:

  • RF00496 (Coronavirus SL-III cis-acting replication element): This family represented a single stem that is now found in aCoV-5UTR and bCoV-5UTR families.
  • RF02910 (Coronavirus_5p_sl_1_2): This family represented two stems from aCoV-5UTR.

The new families are grouped into 2 clans: CL00116 and CL00117 for the 5’ and 3’ UTRs, respectively. The clans can be used with the Infernal cmscan program to automatically select the highest scoring match from a set of related families (see the Rfam chapter in CPB to learn more).
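
To illustrate what selecting the highest scoring match within a clan means, here is a simplified Python sketch of clan competition. It is not Infernal's implementation: the Hit class, the overlap rule (any shared position on the same sequence, with start <= end and strand ignored) and the scores are all assumptions made for the example. Only the clan and family accessions come from this post.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Hit:
    family: str   # Rfam accession of the matching family
    seq: str      # name of the annotated sequence
    start: int    # 1-based start, assumed <= end (strand ignored here)
    end: int
    score: float  # bit score; higher is better


def overlaps(a: Hit, b: Hit) -> bool:
    """True if two hits share at least one position on the same sequence."""
    return a.seq == b.seq and a.start <= b.end and b.start <= a.end


def clan_compete(hits: List[Hit], clan_of: Dict[str, str]) -> List[Hit]:
    """Keep the best-scoring hit among overlapping hits from the same clan."""
    kept: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        clan = clan_of.get(hit.family)
        beaten = clan is not None and any(
            clan_of.get(k.family) == clan and overlaps(hit, k) for k in kept
        )
        if not beaten:
            kept.append(hit)
    return kept


# bCoV-5UTR and Sarbecovirus-5UTR both belong to the 5' UTR clan CL00116
clans = {"RF03117": "CL00116", "RF03120": "CL00116", "RF03125": "CL00117"}
hits = [
    Hit("RF03117", "genome", 1, 290, 80.0),       # made-up scores
    Hit("RF03120", "genome", 1, 299, 150.0),
    Hit("RF03125", "genome", 29500, 29870, 200.0),
]
for hit in clan_compete(hits, clans):
    print(hit.family, hit.start, hit.end, hit.score)
```

In this toy input the Sarbecovirus-5UTR hit outscores the overlapping bCoV-5UTR hit from the same clan, so only the former is kept alongside the 3’ UTR hit.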

Revised Coronavirus families

We also reviewed and updated the existing Coronavirus Rfam families.

Family | What was updated? | Is it found in SARS-CoV-2?
RF00182 Coronavirus packaging signal | The seed alignment and consensus secondary structure were updated to include the 4 conserved repeat units. | This RNA element is found only in Embecovirus, so it is not present in SARS-CoV-2 and other Sarbecoviruses.
RF00507 Coronavirus frameshifting stimulation element | The seed alignment was expanded. | This RNA is present in SARS-CoV-2.
RF00164 Coronavirus s2m RNA | The seed alignment was expanded. There is a 3D structure for SARS-CoV-1 which can be used for understanding the s2m in SARS-CoV-2. | This RNA is present in the 3′ UTR of SARS-CoV-2.
RF00165 Coronavirus 3’-UTR pseudoknot | The seed alignment was expanded. The pseudoknot is annotated in the 3’ UTR families but since it is mutually exclusive with the 3’-UTR consensus structure, it is also provided as a separate family. | This RNA is present in the 3′ UTR of SARS-CoV-2.

Where to get the data 

You can download the covariance models, as well as seed alignments for the coronavirus families from the corresponding family pages or from a dedicated folder on the FTP archive.

How to use the data

You can download the covariance models and annotate viral sequences with these RNA models using Infernal. See Rfam help for examples.
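
As a rough illustration, the Python sketch below assembles a cmscan command line for annotating a genome with the downloaded models. The options shown (--cut_ga, --rfam, --nohmmonly, --tblout, --fmt 2, --clanin) are the ones commonly used for Rfam-style annotation with Infernal. The file names and the helper name cmscan_command are hypothetical, and the model file is assumed to have been indexed with cmpress beforehand.

```python
import shlex
from typing import List, Optional


def cmscan_command(
    cm_file: str,
    fasta_file: str,
    tblout_file: str,
    clanin_file: Optional[str] = None,
) -> List[str]:
    """Build an argument list for an Rfam-style cmscan run."""
    cmd = [
        "cmscan",
        "--cut_ga",      # use each family's gathering threshold
        "--rfam",        # filter settings intended for large sequence searches
        "--nohmmonly",   # always score hits with the full covariance model
        "--tblout", tblout_file,
    ]
    if clanin_file is not None:
        # Clan information is reported in the format-2 tabular output
        cmd += ["--fmt", "2", "--clanin", clanin_file]
    cmd += [cm_file, fasta_file]
    return cmd


# Hypothetical file names, shown only to illustrate the resulting command
cmd = cmscan_command("coronavirus.cm", "genome.fa", "genome.tblout", "Rfam.clanin")
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where Infernal is installed. Building the list separately keeps the command easy to inspect before running it.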

Inviting all Wikipedians to contribute

We revised the Wikipedia pages associated with each family, and we invite everyone to contribute to the following articles:

Acknowledgements

We would like to thank Kevin Lamkiewicz and Manja Marz (Friedrich Schiller University Jena) for providing the curated alignments for the new families as well as Eric Nawrocki (NCBI) for revising the existing Rfam entries. We also thank Ramakanth Madhugiri (Justus Liebig University Giessen) for reviewing the Coronavirus UTR alignments.


This work is part of the BBSRC-funded project to expand the coverage of viral RNAs in Rfam. More data on SARS-CoV-2 can be found on the European COVID-19 Data Portal.