Rfam now available in UCSC Genome Browser, and other genome news.

November 2, 2011

We are pleased to announce the arrival of the Rfam Track Hub for the popular UCSC Genome Browser. Rfam data has been available in the Ensembl browser for some time, with links back to the Rfam annotation, and the same functionality is now available for the UCSC Genome Browser.

The hub file is available on our FTP site. By following the instructions on the UCSC Genome Browser Custom Hub page, you can visualise Rfam annotations for the majority of species for which the UCSC Genome Browser provides genomes. Clicking on a match will give you exact start and stop positions, as well as links to the Rfam annotation page here at the Sanger. Bit scores and E-values for a given match aren’t yet available directly through the UCSC Genome Browser, though we’re working on it. Happy browsing!

Rfam types for Genome annotation

Xfam (in the forms of Sarah and Rob) attended the NIH Genome Annotation Workshop last week, and it was a great insight into the trials and tribulations of coming up with common standards that everyone’s happy with. It was also nice to hear that Rfam is being used extensively to annotate ncRNA features. However, there’s been some confusion amongst annotators when converting between Rfam types (such as CD-Box) and the ncRNA_classes required by INSDC under the ncRNA feature key. The ncRNA feature key is intended to describe non-coding RNAs that aren’t ribosomal or transfer RNAs; these use the rRNA and tRNA feature keys respectively.

To use the ncRNA feature key, annotators are required to supply an appropriate ncRNA_class, and this is where confusion arises, as there’s no perfect overlap between the Rfam entry types and the ncRNA classes. To reduce the confusion, here at Rfam we’ve put together a handy translation guide that tells you which ncRNA class to apply when you use an Rfam family to annotate a genome. There are also some cases where an INSDC type is more specific than the Rfam type; for example, we don’t have a specific telomerase RNA type, whereas there is an ncRNA_class called telomerase_RNA. Therefore any annotation to RF00025 can use the telomerase_RNA ncRNA_class category. You can find our table of Rfam types and their INSDC equivalents here.
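As a rough illustration of how such a translation could be applied in an annotation script, here is a minimal Python sketch. It is not the Rfam table itself: the dictionary holds only a few entries that we would expect to appear in it, the exact spelling of the Rfam type strings is an assumption, and the function name and the fallback to the "other" class are hypothetical choices made for this example. The RF00025 override is the telomerase case described above.

```python
# Illustrative subset only; the authoritative mapping is the table published by Rfam.
# The Rfam type strings below are assumed spellings, not copied from that table.
TYPE_TO_NCRNA_CLASS = {
    "Gene; snRNA; snoRNA; CD-box": "snoRNA",
    "Gene; snRNA; snoRNA; HACA-box": "snoRNA",
    "Gene; miRNA": "miRNA",
    "Gene; ribozyme": "ribozyme",
}

# Families whose INSDC class is more specific than their Rfam type.
FAMILY_OVERRIDES = {
    "RF00025": "telomerase_RNA",  # the telomerase example from the post
}


def normalise_type(rfam_type):
    """Collapse spacing and drop empty fields, e.g. a trailing semicolon."""
    parts = [p.strip() for p in rfam_type.split(";")]
    return "; ".join(p for p in parts if p)


def ncrna_class(rfam_acc, rfam_type):
    """Return an INSDC ncRNA_class for a hit to an Rfam family (hypothetical helper)."""
    if rfam_acc in FAMILY_OVERRIDES:
        return FAMILY_OVERRIDES[rfam_acc]
    # Falling back to "other" is an assumption of this sketch, not Rfam policy.
    return TYPE_TO_NCRNA_CLASS.get(normalise_type(rfam_type), "other")
```

The family-level override is checked first so that a more specific INSDC class wins over the general type-based lookup.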

You can also find out all you ever wanted to know about the feature tables used for genome annotation here, and here.


2 Responses to “Rfam now available in UCSC Genome Browser, and other genome news.”

  1. Mathias Walter Says:

    Maybe you can add a column ‘INSDC_type’ of type enum to the rfam table and fill the mapping there.
    An extra line in the cm file would help rfam_scan.pl to parse the INSDC type and add it to the GFF attribute line.

  2. J. Alves Says:

    The link for the table of Rfam types and their INSDC equivalents is obsolete. Well, actually the table was not copied to the current release’s directory (did it need an update? hopefully not…).

    Anyway, the location where I found the table is:

    ftp://ftp.sanger.ac.uk/pub/databases/Rfam/10.1/misc/Rfam2INSDC.types

