We are pleased to announce the release of Dfam 1.3. This release adds 185 new repeat families and updates the underlying human genome to hg38.
New families, new seeds, new genome
We’ve added 185 new repeat families found in the human genome. Seed alignments for all families have been updated with an improved construction script, using only hg38 sequences. New HMMs were built from these seeds, and new gathering/trusted thresholds were computed for these updated HMMs. All hits to the human genome are now shown relative to hg38.
Behind the Scenes
A lot of the work that’s gone into this release isn’t readily visible. We (especially Robert) have put a lot of effort into reducing the computational burden of adding and modifying families. We’ve also laid some of the groundwork necessary for an expansion to additional organisms in the upcoming 2.0 release (e.g. DFAMSEQ now includes several new organisms), and for a looming move to infrastructure at the University of Montana. We’re excited about big upcoming changes. As always, if you have suggestions or would like to contribute models to the database, please get in touch.
Posted by Travis Wheeler and Robert Hubley