We have recently produced a new release of AntiFam, release 3.0. AntiFam has grown in size: release 3.0 contains 54 entries, compared to just 23 when we last blogged about AntiFam (release 1.1). Over 80% of these new entries arise from translations of non-coding RNAs, including several families from translations of rRNA, tmRNA and RNaseP.
New features
AntiFam has several new features, which we hope our users will find useful. In release 2.0 we introduced a DO line, the line in the AntiFam.seed file which begins “#=GF DO”. This line gives a recommended course of action to take when a predicted protein sequence is recognised by an AntiFam HMM. In most cases it recommends that the predicted protein is deleted.
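For anyone who wants to pull the recommended actions out programmatically, here is a minimal sketch in Python. It assumes that AntiFam.seed is laid out in Stockholm style, with one “#=GF AC” accession line per entry and “//” closing each entry; only the “#=GF DO” prefix comes from the description above. The function name read_do_lines, the sample accessions and the sample action texts are made up for illustration.

```python
def read_do_lines(seed_text):
    """Map each entry accession to the text of its '#=GF DO' line(s)."""
    actions = {}
    accession = None
    for line in seed_text.splitlines():
        if line.startswith("#=GF AC"):
            # Assumption: each entry carries its accession on a '#=GF AC' line.
            accession = line[len("#=GF AC"):].strip()
        elif line.startswith("#=GF DO") and accession is not None:
            text = line[len("#=GF DO"):].strip()
            # Join repeated DO lines for the same entry into one string.
            actions[accession] = (actions.get(accession, "") + " " + text).strip()
        elif line.strip() == "//":
            # Assumption: '//' marks the end of an entry.
            accession = None
    return actions


# Made-up sample data, not taken from the real AntiFam.seed file.
SAMPLE = """# STOCKHOLM 1.0
#=GF ID   Example_one
#=GF AC   ANF99901
#=GF DO   Delete the predicted protein.
seq1/1-10   MKLAVLLAAA
//
# STOCKHOLM 1.0
#=GF ID   Example_two
#=GF AC   ANF99902
#=GF DO   Check the prediction by hand.
seq2/1-10   MKTAVLLGAA
//
"""

actions = read_do_lines(SAMPLE)
for acc, action in sorted(actions.items()):
    print(acc, action)
```

To run this on the real file, read AntiFam.seed into a string and pass it to read_do_lines in place of SAMPLE, after checking that the layout assumptions hold for your copy.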
In release 3.0 we have introduced a TX line, the line in the AntiFam.seed file which begins “#=GF TX”. It indicates the superkingdoms in which predicted proteins recognised by the AntiFam HMM have been found. We have also produced superkingdom-specific sets of HMMs for this release. One AntiFam HMM may identify spurious proteins arising from multiple superkingdoms, so there is some overlap between these sets. There are 18 entries in the Eukaryota set, 47 in the Bacteria set, 4 in the Archaea set, 4 in the Virus set and 18 in the unidentified set; the last of these includes proteins from unclassified organisms, such as those from metagenomics studies.
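As a rough illustration of how the TX line relates to the superkingdom-specific sets, the sketch below groups entries by the superkingdoms named on their “#=GF TX” lines. It is not how we built the sets. It assumes a Stockholm-style layout with a “#=GF AC” accession line per entry and “//” between entries, and it assumes the TX line holds a semicolon-separated list of names; please check both against the real AntiFam.seed file. The function name and the sample data are made up.

```python
from collections import defaultdict


def entries_by_superkingdom(seed_text):
    """Group entry accessions by the superkingdoms on their '#=GF TX' lines."""
    groups = defaultdict(set)
    accession = None
    for line in seed_text.splitlines():
        if line.startswith("#=GF AC"):
            # Assumption: each entry carries its accession on a '#=GF AC' line.
            accession = line[len("#=GF AC"):].strip()
        elif line.startswith("#=GF TX") and accession is not None:
            # Assumption: superkingdom names are separated by semicolons.
            for name in line[len("#=GF TX"):].split(";"):
                name = name.strip()
                if name:
                    groups[name].add(accession)
        elif line.strip() == "//":
            # Assumption: '//' marks the end of an entry.
            accession = None
    return dict(groups)


# Made-up sample data, not taken from the real AntiFam.seed file.
SAMPLE = """# STOCKHOLM 1.0
#=GF AC   ANF99901
#=GF TX   Bacteria; Eukaryota;
seq1/1-10   MKLAVLLAAA
//
# STOCKHOLM 1.0
#=GF AC   ANF99902
#=GF TX   Bacteria;
seq2/1-10   MKTAVLLGAA
//
"""

groups = entries_by_superkingdom(SAMPLE)
for name in sorted(groups):
    print(name, sorted(groups[name]))

# An entry listed under two superkingdoms is counted in both sets.
total_memberships = sum(len(members) for members in groups.values())
print("set memberships:", total_memberships)
```

Because an entry can belong to more than one set, the set sizes add up to more than the number of entries. The same is true of the release 3.0 figures: 18 + 47 + 4 + 4 + 18 = 91 set memberships across 54 entries.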
Availability
AntiFam 3.0 is available to download from our FTP site at ftp://ftp.sanger.ac.uk/pub/databases/Pfam/AntiFam/
AntiFam is made freely available under the Creative Commons Zero (CC0) licence:
http://creativecommons.org/publicdomain/zero/1.0/
We welcome any suggestions for new families, and can be contacted by email at ruthe@ebi.ac.uk
Posted by Ruth
November 14, 2012 at 9:28 pm
Great job Ruth. AntiFam is a fantastic resource. Hopefully it gets widely used and all the predicted proteins smothering ncRNAs get cleaned up.
On a more cautionary note, there are legitimate cases of translated ncRNAs. For example, Tar1p which my esteemed colleagues Austen Ganley & Mark Walker at Massey Albany assure me is real — in spite of my intense skepticism. There is literature to support them too [1-4].
[1] Coelho PS, et al. (2002) A novel mitochondrial protein, Tar1p, is encoded on the antisense strand of the nuclear 25S rDNA. Genes Dev 16(21):2755-60
[2] Bonawitz ND, et al. (2008) Expression of the rDNA-encoded mitochondrial protein Tar1p is stringently controlled and responds differentially to mitochondrial respiratory demand and dysfunction. Curr Genet 54(2):83-94
[3] http://www.yeastgenome.org/cgi-bin/locus.fpl?locus=Tar1
[4] http://www.ncbi.nlm.nih.gov/nuccore/296146686?report=genbank
November 15, 2012 at 11:25 am
There’s Ribin too [5].
[5] Ribin, a protein encoded by a message complementary to rRNA, modulates ribosomal transcription and cell proliferation.
Kermekchiev M, Ivanova L.
Mol Cell Biol. 2001 Dec;21(24):8255-63.
http://www.ncbi.nlm.nih.gov/pubmed/11713263
December 10, 2012 at 7:27 pm
Another anti-AntiFam is ANF00030. For most of its seed proteins (the ones ending “AA”), the last 10-11 amino acids are the legitimate translation product of tmRNA.
December 10, 2012 at 8:19 pm
Good point Kelly, thanks! There are definitely a few other Rfam families that are legitimately translated. They are worth checking too. If we had full-length group I & II intron models then these would be a concern too (fortunately we don’t).
December 11, 2012 at 3:39 am
WordPress seems to have dropped the link I added:
http://en.wikipedia.org/wiki/NcRNA#Bifunctional_RNA