We have recently produced a new release of AntiFam, release 3.0. AntiFam has grown in size: release 3.0 contains 54 entries, compared with just 23 when we last blogged about AntiFam (release 1.1). Over 80% of the new entries arise from translations of non-coding RNAs, including several families from translations of rRNA, tmRNA and RNaseP.
AntiFam has several new features, which we hope our users will find useful. In release 2.0 we introduced a DO line: the line in the AntiFam.seed file that begins “#=GF DO”. It gives the recommended course of action to take when a predicted protein sequence is recognised by an AntiFam HMM. In most cases, it recommends that the predicted protein be deleted.
In release 3.0 we have introduced a TX line: the line in the AntiFam.seed file that begins “#=GF TX”. It indicates the superkingdoms in which predicted proteins recognised by the AntiFam HMM have been found. We have also produced superkingdom-specific sets of HMMs for this release. A single AntiFam HMM may identify spurious proteins from more than one superkingdom, so there is some overlap between these sets. There are 18 entries in the Eukaryota set, 47 in the Bacteria set, 4 in the Archaea set, 4 in the Virus set and 18 in the unidentified set; the last of these includes proteins from unclassified organisms, such as those from metagenomics studies.
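For anyone who wants to read these annotations programmatically, here is a minimal Python sketch that pulls the “#=GF DO” and “#=GF TX” lines out of a Stockholm-format seed file. It is an illustration, not an official AntiFam tool. The sample entries, their accessions and their annotation values are invented, and the semicolon-separated layout of the TX value is an assumption, so check it against the real AntiFam.seed file before relying on it.

```python
from collections import defaultdict

# Hypothetical excerpt in Stockholm format. The identifiers, accessions,
# sequences and annotation values below are invented for illustration only.
SAMPLE_SEED = """\
# STOCKHOLM 1.0
#=GF ID   Example_one
#=GF AC   EX00001
#=GF DO   Delete
#=GF TX   Bacteria; Archaea
seq1/1-10   MKTAYIAKQR
//
# STOCKHOLM 1.0
#=GF ID   Example_two
#=GF AC   EX00002
#=GF DO   Delete
#=GF TX   Eukaryota
seq2/1-10   MSLLTEVETY
//
"""


def parse_gf_records(text):
    """Return one dict per Stockholm entry, mapping GF tags to their text."""
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("#=GF "):
            parts = line.split(None, 2)  # ["#=GF", tag, value]
            tag = parts[1]
            value = parts[2].strip() if len(parts) > 2 else ""
            # If a tag is repeated within an entry, join the values with a space.
            current[tag] = f"{current[tag]} {value}" if tag in current else value
        elif line.strip() == "//":  # "//" terminates one entry
            if current:
                records.append(current)
            current = {}
    if current:  # tolerate a final entry without a terminator
        records.append(current)
    return records


def group_by_superkingdom(records):
    """Map each superkingdom named on a TX line to the accessions listing it."""
    groups = defaultdict(list)
    for rec in records:
        # Assumption: superkingdoms on the TX line are separated by semicolons.
        for name in rec.get("TX", "").split(";"):
            name = name.strip()
            if name:
                groups[name].append(rec.get("AC", rec.get("ID", "?")))
    return dict(groups)


records = parse_gf_records(SAMPLE_SEED)
actions = {rec["AC"]: rec.get("DO", "") for rec in records}  # accession -> DO text
by_kingdom = group_by_superkingdom(records)
print(actions)
print(by_kingdom)
```

To run this against a downloaded file, replace `SAMPLE_SEED` with the contents of AntiFam.seed. An entry whose TX line names several superkingdoms appears in several groups, which mirrors the overlap between the superkingdom-specific HMM sets.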
AntiFam 3.0 is available to download from our FTP site at ftp://ftp.sanger.ac.uk/pub/databases/Pfam/AntiFam/
AntiFam is made freely available under the Creative Commons Zero (CC0) licence: http://creativecommons.org/publicdomain/zero/1.0/
We welcome any suggestions for new families, and can be contacted by email at email@example.com
Posted by Ruth