A week or so ago we updated the Pfam website at Sanger. There are no major changes to the site, just bug fixes and general improvements, so you may not have noticed any particular difference in the site. Behind the scenes, however, we’ve made some fairly significant changes to our infrastructure, which will hopefully allow us to distribute the website easily to anyone who wants to run it locally.
The Pfam website has always been hosted on the web-server infrastructure that runs most of the websites within the Sanger Institute, a set of centrally managed server machines. Recently, however, Sanger has been moving towards using virtual machines to run services like ours.
Wikipedia has a good article on virtual machines (VMs) but, briefly, virtual machines are programs that behave to all intents and purposes like real, physical computers. VMs are easy to commission and, once configured and working, can be cloned, archived and deployed easily within a virtualisation environment.
A VM can run more or less anything that a physical computer can run, and VMs are fairly well suited for use as web servers. Running Pfam within a VM or, in fact, a set of VMs, gives us better control over the servers themselves and the server processes, so that we can more easily fix problems or test updates. Most significantly, running Pfam as a virtual machine means that we should eventually be able to distribute the entire website as a self-contained package.
Although it’s always been possible to run the Pfam website at your own site, it’s previously been an exercise in frustration, requiring a lot of ancillary Perl modules and a lot of configuration. Once we’re able to package up the site as a VM, however, you should be able simply to download the machine image and import it into a virtualisation framework at your site. Many sites already have centrally managed virtualisation hardware, but there are also free solutions that you can run on any reasonably powerful PC.
As well as the web server itself, there are two other components of the Pfam website that you will need if you want to run it locally. Most importantly, you’ll need the Pfam database running within your local environment, which requires you to install and run MySQL and to import the Pfam database tables. If you want to run sequence searches on your own servers, you’ll also need to have the sequence search system running. This, like the website, has always been available for local installation, but it’s similarly difficult to install and recent changes have meant that the available documentation is woefully out of date. When we get to the point of releasing the website VM we’ll make sure the instructions for running the search system are up to date. If we get time, we may also look at bundling that system as another VM, though we don’t have plans to do that any time soon.
We’re still tweaking our setup and we’ve yet to figure out exactly how to package up the Pfam website virtual machine but, eventually, we hope to have VM images that we can simply upload to our FTP site for general use. If you have any questions about the VMs, or comments or suggestions on any of this, let us know. You can find our help desk email address at the bottom of every page on the Pfam website.
Posted by John.