Website update

October 29, 2009

We’ve just updated the Pfam website again. This update comes fairly soon after the major Pfam 24.0 release and is intended to fix some of the more annoying bugs and omissions that we’ve found in the last week or so. Read the rest of this entry »


Pfam release 24.0

October 13, 2009

We have just released the latest update to Pfam. Release 24.0 contains a total of 11,912 families, with 1,808 new families and 236 families killed since the last release. 75.15% of all proteins in Pfamseq contain a match to at least one Pfam domain. 53.18% of all residues in the sequence database fall within Pfam domains. Read the rest of this entry »


Imminent Release of Pfam 24.0

October 2, 2009

We are now on the brink of releasing Pfam 24.0. This will be a landmark release, as it will be the first to be built using the new version of the HMMER package, HMMER3. We are well aware that we have been claiming this release as imminent for some time, but we are now at the point of flicking the big switch. There are numerous changes that users need to know about and we will briefly summarise them here. Read the rest of this entry »


pfam_scan.pl – part II

September 11, 2009

Back in May we wrote a blog post about the new version of pfam_scan.pl. We asked if there was anyone out there who was willing to help us test our new script, and we were pleasantly surprised at the number of people who got in contact with us – so a big thank you to all those who have helped. Since releasing the alpha version of pfam_scan.pl to our testers we have made some internal changes to the script that are worth mentioning: Read the rest of this entry »


Rfam help documentation and demise of the old website

June 15, 2009

We are proud to announce that we have added help documentation to the Rfam website. The new pages can be accessed via the ‘Help’ link in the header of every page in the site.

Read the rest of this entry »


pfam_scan.pl

May 21, 2009

We’re currently working on a new version of one of our core scripts, ‘pfam_scan.pl’. This script searches a set of protein sequences (in FASTA format) against Pfam’s library of HMMs. The original code was written nearly a decade ago but, since then, features have been added, bugs have been fixed and the code has evolved into something that is far from elegant. The re-write is something that we’ve been planning to do for a while and, as the code needs updating to use the new HMMER3 software, now seems like the perfect time to do it. Read the rest of this entry »


DUFs: families in need of function

April 20, 2009

Domains of Unknown Function, or DUFs, are a large set of families found in the Pfam database. Examples would be “DUF26” or “DUF282”. The DUF naming scheme was introduced by Chris Ponting, through the addition of DUF1 and DUF2 to the SMART database. These two domains were found to be widely distributed in bacterial signalling proteins. Subsequently, the functions of these domains were identified and they have since been renamed as the GGDEF and EAL domains respectively (structures shown in Figures 1 and 2). These families were added to Pfam in 1997, and little did Chris know that he was starting a trend that would see thousands of uncharacterised families being added to the domain databases. Read the rest of this entry »


HMMER3 migration: resolving overlaps

March 19, 2009

It has been a little quiet on the Pfam blog recently, but behind the scenes we’ve been working hard on the migration to HMMER3.

We have built HMMER3 models for all of the Pfam alignments, and searched them against the sequence database. This part was super quick, as HMMER3 is ~100 times faster than HMMER2. Due to the increased sensitivity of HMMER3, many of our Pfam families have grown in size, and we have found that ~80,000 sequences in the sequence database now have overlapping matches to more than one Pfam family.

Within Pfam we have a rule that states that our families should not overlap; this means that any one amino acid can belong to only a single Pfam family. The exception to this rule applies to families within a clan – clans are Pfam’s collections of related families – where overlaps between clan members are allowed. Over the last few weeks we’ve been working through and resolving the list of 80,000 overlaps. Read the rest of this entry »
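
To make the overlap rule concrete, here is a minimal sketch of how overlapping matches could be flagged. It is an illustration only, not the code Pfam uses: the function name `find_overlaps`, the tuple layout of a match, the family names and the clan identifier are all assumptions made for this example. Two matches on the same sequence count as an overlap if they share at least one residue and belong to different families that are not members of the same clan.

```python
from collections import defaultdict
from itertools import combinations


def find_overlaps(matches, clan_of):
    """Return (seq_id, family_a, family_b) for each disallowed overlap.

    matches: iterable of (seq_id, family, start, end), with inclusive
             residue coordinates (hypothetical layout for this sketch).
    clan_of: dict mapping family -> clan id; families that are missing
             from the dict are treated as not belonging to any clan.
    """
    # Group the matched regions by sequence.
    by_seq = defaultdict(list)
    for seq_id, family, start, end in matches:
        by_seq[seq_id].append((family, start, end))

    overlaps = []
    for seq_id, regions in by_seq.items():
        for (fam_a, s_a, e_a), (fam_b, s_b, e_b) in combinations(regions, 2):
            if fam_a == fam_b:
                # Simplification: repeated matches of one family are ignored.
                continue
            # Two inclusive intervals share a residue if each starts
            # no later than the other ends.
            shares_residue = s_a <= e_b and s_b <= e_a
            clan_a = clan_of.get(fam_a)
            same_clan = clan_a is not None and clan_a == clan_of.get(fam_b)
            if shares_residue and not same_clan:
                overlaps.append((seq_id, fam_a, fam_b))
    return overlaps


# Made-up example data: FamB and FamC belong to the same (hypothetical) clan.
matches = [
    ("seq1", "FamA", 10, 120),
    ("seq1", "FamB", 100, 200),
    ("seq1", "FamC", 150, 260),
    ("seq2", "FamA", 5, 90),
    ("seq2", "FamC", 91, 180),
]
clan_of = {"FamB": "CL_X", "FamC": "CL_X"}

result = find_overlaps(matches, clan_of)
print(result)
```

In this example only the FamA/FamB pair on seq1 is reported. FamB and FamC also share residues on seq1, but they sit in the same clan, so that overlap is allowed; on seq2 the two regions are adjacent and share no residue.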


Rfam plans and Infernal 1.0

February 23, 2009

We recently held our annual Rfam Next Big Things (NBT®) meeting. This is the meeting where we decide what the big changes for our various projects will be. Since Sean asked for it (and I think it’s a good idea), I thought I would discuss the biggest NBT® here. This is Rfam adopting Infernal 1.0 for release 10.0.
Read the rest of this entry »


Rfam, RNA Biology and Wikipedia in the news

February 20, 2009

Some of you may have noticed the attention that the unholy alliance between Rfam, RNA Biology and Wikipedia has been receiving recently. I thought it might be worthwhile posting a more detailed overview of how this happened, what we’re planning, and how we are dealing with the major criticisms.
Read the rest of this entry »