We’ve had a few helpdesk tickets in the last few months asking how to download all of the Pfam-A domains for a particular species. This information can be quite difficult to obtain: you either have to download and install a subset of the tables in our MySQL database, or search all of the sequences from the species of interest against Pfam, probably using our batch search.
We thought it would be useful to simplify the process and add the domain information directly to our proteome pages, so we’ve just done exactly that.
If you go to the proteome page for a particular species, for example Plasmodium falciparum, and click on the ‘Domain Composition’ tab, you’ll now find a link above the table that lets you download a text file listing all of the Pfam-A regions for that proteome. So far we’ve only added these links to the Pfam website at Sanger, but they’ll appear on the other Pfam sites soon. The data files are also all available directly from our FTP site, indexed by NCBI taxonomy ID.
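Once you have one of these files, a few lines of scripting are enough to summarise it. The sketch below is only an illustration: it assumes the file is tab-separated, that lines starting with ‘#’ are comments or headers, and that the Pfam-A accession sits in the sixth column. None of that is stated in this post, so check the column layout against the header of the file you actually download. The function name, the column constant and the sample rows are all made up for the example.

```python
import csv
import io
from collections import Counter

# ASSUMPTION: zero-based index of the column holding the Pfam-A accession.
# Verify this against the header line of the real file before relying on it.
PFAM_ACC_COL = 5


def count_domains(handle, acc_col=PFAM_ACC_COL):
    """Count how many regions each Pfam-A accession has in a regions file."""
    counts = Counter()
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and '#' comment/header lines.
        if not row or row[0].startswith("#"):
            continue
        # Skip rows that are too short to contain the accession column.
        if len(row) <= acc_col:
            continue
        counts[row[acc_col]] += 1
    return counts


# Invented rows in the ASSUMED layout; these are not real Pfam data.
SAMPLE = (
    "#seq_id\taln_start\taln_end\tenv_start\tenv_end\thmm_acc\thmm_name\n"
    "SEQ_A\t10\t120\t8\t122\tPF00001\texample_family_1\n"
    "SEQ_A\t150\t260\t148\t262\tPF00002\texample_family_2\n"
    "SEQ_B\t5\t110\t3\t112\tPF00001\texample_family_1\n"
)

counts = count_domains(io.StringIO(SAMPLE))
for accession, n_regions in counts.most_common():
    print(accession, n_regions)
```

To run this on a downloaded file, replace the io.StringIO(SAMPLE) call with an open file handle, for example open(path, newline=""), where path is wherever you saved the file.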
We hope you’ll find this feature useful.
Posted by Jaina and John.
June 25, 2012 at 4:50 am
Don’t you really want to use a BioMart for this sort of thing?
July 5, 2012 at 10:07 am
We do (and we will) use a BioMart to provide exactly this sort of data. Creating a BioMart for Pfam is high on our priority list. We’ll be sure to let you know when we have one available.
July 5, 2012 at 11:10 pm
Sounds promising!
July 24, 2012 at 4:56 pm
This is a great feature that was missing from Pfam. Thanks a lot for adding this. Definitely useful.
August 2, 2012 at 5:15 pm
Is there a version using the complete proteomes from UniProt, i.e. leaving out the TrEMBL entries and including only the reviewed proteins?
August 3, 2012 at 12:38 pm
Dear Niall,
The file contains all proteins in a proteome, including those from TrEMBL. As John said above, our longer-term plan is to have a BioMart to serve this sort of data. When we have this in place, you will be able to select the subset of proteins in a proteome that are from the Swiss-Prot section of UniProt.
Jaina