Getting all Pfam-A domains for a proteome

June 21, 2012

We’ve had a few helpdesk tickets in the last few months asking how to download all of the Pfam-A domains for a particular species. Until now this information has been quite difficult to obtain: you either had to download and install a subset of the tables in our MySQL database, or search all of the sequences from the species of interest against Pfam, probably using our batch search.

We thought it would be useful to simplify the process and add the domain information directly to our proteome pages, so we’ve just done exactly that.

If you go to the proteome page for a particular species, for example Plasmodium falciparum, and click on the ‘Domain Composition’ tab, you’ll now find a link above the table that lets you download a text file listing all of the regions for that proteome. We’ve only added these links in the Pfam website at Sanger so far, but they’ll appear in the other Pfam sites soon. The data files are also all available directly from our FTP site, indexed by NCBI taxonomy ID.
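As a rough illustration of what you might do with a downloaded file, here is a minimal Python sketch that counts how many regions each family has in a proteome. It assumes the file is tab-delimited, that comment lines start with `#`, and that the family accession sits in the sixth column; none of that is stated above, so check it against the file you actually download. The function name `count_domains`, the `family_col` parameter and the sample rows are all made up for this example.

```python
import csv
from collections import Counter
from typing import Iterable


def count_domains(lines: Iterable[str], family_col: int = 5) -> Counter:
    """Count regions per family in a tab-delimited region listing.

    family_col is the zero-based index of the column holding the family
    accession. The default of 5 is an assumption, not a documented
    layout, so adjust it to match the real file.
    """
    counts: Counter = Counter()
    # Skip blank lines and comment lines (assumed to start with '#').
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    for row in csv.reader(data_lines, delimiter="\t"):
        # Ignore rows that are too short to contain the family column.
        if len(row) > family_col:
            counts[row[family_col]] += 1
    return counts


# Made-up rows for illustration only; these are not real Pfam data.
sample = [
    "# hypothetical header line",
    "SEQ1\t10\t90\t8\t95\tFAM_A",
    "SEQ1\t120\t200\t118\t205\tFAM_B",
    "SEQ2\t5\t85\t3\t88\tFAM_A",
]
result = count_domains(sample)
```

To run it on a real download, open the file and pass the handle in directly, for example `with open(path, newline="") as fh: count_domains(fh)`, where `path` is wherever you saved the file.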

We hope you’ll find this feature useful.

Posted by Jaina and John.

This entry was posted on June 21, 2012 at 1:49 pm and is filed under Pfam.

Tags: pfam, proteome, website

6 Responses to “Getting all Pfam-A domains for a proteome”

Paul Says:

June 25, 2012 at 4:50 am
Don’t you really want to use a BioMart for this sort of thing?

- johntate Says:
  
  July 5, 2012 at 10:07 am
  We do (and we will) use a BioMart to provide exactly this sort of data. Creating a BioMart for Pfam is high on our priority list. We’ll be sure to let you know when we have one available.
  
  - ppgardne Says:
    
    July 5, 2012 at 11:10 pm
    Sounds promising!
Khader Shameer, Ph.D (@kshameer) Says:

July 24, 2012 at 4:56 pm
This is a great feature that was missing from Pfam. Thanks a lot for adding this. Definitely useful.

Niall Says:

August 2, 2012 at 5:15 pm
Is there a version using the complete proteomes from UniProt, i.e. leaving out the TrEMBL stuff and just including the reviewed proteins?

jainamistry Says:

August 3, 2012 at 12:38 pm
Dear Niall,

The file contains all proteins in a proteome, including those from TrEMBL. As John said above, our longer-term plan is to have a BioMart to serve this sort of data. When we have this in place, you will be able to select the subset of proteins in a proteome that are from the Swiss-Prot section of UniProt.

Jaina
