Dfam 3.7: ~3.4 million TE models across 2346 taxa

January 12, 2023

We at Dfam are pleased to announce the latest data release! Dfam 3.7 includes additional raw and curated datasets, resulting in a ~4.5x increase in the number of families over the previous Dfam 3.6 release, across a wide range of taxa. Please note the large size of this release and plan accordingly. It may be beneficial to use the API to filter and download only the data relevant to your project.
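As a rough illustration of filtering through the API rather than downloading the full release, the sketch below builds a query for family summaries restricted to one clade. The base URL, the `/families` endpoint, and the parameter names (`format`, `clade`, `clade_relatives`, `limit`) are assumptions taken from our reading of the public Dfam API documentation and should be checked there before use; the helper names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed base URL of the Dfam REST API; verify against the API documentation.
DFAM_API = "https://dfam.org/api"


def build_families_url(clade, limit=20, include_descendants=True):
    """Build a query URL for TE family summaries annotated to a clade.

    Parameter names are assumptions based on the public API docs.
    """
    params = {
        "format": "summary",  # assumed: compact per-family records
        "clade": clade,       # taxon name to filter on
        "limit": limit,       # page size, keeps the response small
    }
    if include_descendants:
        # assumed: also return families assigned to child taxa
        params["clade_relatives"] = "descendants"
    return f"{DFAM_API}/families?{urlencode(params)}"


def fetch_families(clade, limit=20):
    """Fetch one page of results as parsed JSON (requires network access)."""
    url = build_families_url(clade, limit)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print the query URL only; call fetch_families() to run the request.
    print(build_families_url("Chiroptera", limit=5))
```

Requesting small pages and a narrow clade keeps each response manageable, which matters given the size of this release.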

EBI dataset contributes to the quadrupling of the Dfam database 

Our continued collaboration with Fergal Martin and Denye Ogeh from the European Bioinformatics Institute (EBI) has provided an additional 771 assemblies and their associated TE models that are now a part of the DR records in Dfam. This brings the total contribution of genomic data from EBI to 1551 species. The new data expands taxa such as Viridiplantae (green plants) and Actinopterygii (bony fishes), and broadens Dfam coverage with the addition of Echinodermata (starfishes, sea urchins/cucumbers) and Petromyzontiformes (lampreys). 

Community submissions – adding diversity to Dfam

Taro (Colocasia esculenta) – a threatened food staple

One of the most ancient cultivated crops, taro is a food staple in the Pacific Islands and the Caribbean that is currently threatened by taro leaf blight (TLB). Some populations of taro are resistant to TLB, but the genetic basis for this resistance is unknown. As part of an effort to understand it, a de novo taro assembly was generated and its repetitive content was analyzed [1]. The high repetitive content (~82%) of this genome was positively correlated with genome size and may be linked to TLB resistance. Contributed by M. Renee Bellinger.

Gesneriaceae – understanding angiosperm morphological variation

A member of the plant family Gesneriaceae, the Cape Primrose Streptocarpus rexii has long been studied by evolutionary biologists for its unique morphology. Genetic resources are critical for studying the unique meristem evolution of this plant family. To that end, a genome annotation pipeline was developed to address current technical challenges in genome annotation. Part of this effort included generating repeat libraries not only for the Cape Primrose, but also for Dorcoceras hygrometricum and Primulina huaijiensis [2]. Providing these libraries to Dfam will enhance the resources available for future genomic characterization of this plant family. Contributed by Kanae Nishii.

Mosquito (Anopheles coluzzii) – a human malaria vector

The adaptive flexibility of Anopheles coluzzii, a primary vector of human malaria, allows it to escape efforts to control the mosquito population with insecticides. As TEs are integral to adaptive processes in other species, it was hypothesized that TEs could underlie the rapid resistance of A. coluzzii to classic methods of intervention. Analyzing six individuals from two African localities allowed the authors to provide a comprehensive TE library [3]. This effort enhances the resources available to study the genomic architecture and gene regulation underpinning the success of this malaria vector. Contributed by Carlos Vargas and Josefa Gonzalez.

Water flea (Daphnia pulicaria) – a model organism to study climate change

Due to their short lifespans and reproductive capabilities, water fleas are used as bioindicators to study the effects of toxins on an ecosystem, and are thus useful in studying climate change. A study of two ecological sister taxa (Daphnia pulicaria and Daphnia pulex) analyzed the roles of recombination and gene density in driving the differentiation and divergence of the two species [4]. TE content was analyzed as part of generating the new Daphnia pulicaria genome assembly. Contributed by Mathew Wersebe.

601 insects – transposable element influence on species diversity 

TEs are drivers of evolution in eukaryotes. However, in some underrepresented taxa, TE dynamics are less well understood. To this end, the TE content of 601 insect genomes over 20 Orders was analyzed to assess the variation between and among insect Orders [5]. This work highlights the need for community-submitted high-quality libraries. Contributed by John Sproul and Jacqueline Heckenhauer.

Analysis of six bat genomes – evolution of bat adaptations

Bats are an excellent example of complex adaptations, such as flight, echolocation, longevity and immunity. In order to enhance the genomic resources to study the development of complex traits, six high-quality genome assemblies were generated using long- and short-read technologies (Rhinolophus ferrumequinum, Rousettus aegyptiacus, Phyllostomus discolor, Myotis myotis, Pipistrellus kuhlii and Molossus molossus) [6]. As part of the effort to annotate these new genome assemblies, the TE content was analyzed. These six genomes displayed a wide range of diversity in TE content, perhaps contributing to their complex traits. Contributed by Kevin Sullivan and David Ray.

LTR7/ERVH – transcriptional regulation in the human embryo

The mechanism by which human endogenous retrovirus type-H (HERVH) exerts regulatory activities fostering self-renewal and pluripotency in the pre-implantation embryo is unknown. To elucidate this mechanism, the transcription dynamics and sequence signature evolution of HERVH were analyzed [7]. This study not only revealed previously undefined LTR7 subfamilies, but also provided a comprehensive phyloregulatory analysis of all the identified subfamilies against locus-specific regulatory data available in genome-wide assays of embryonic stem cells (ESCs), providing evidence for subfamily-specific promoter activity. The complex evolutionary history of LTR7 is mirrored in the transcriptional partitioning that takes place during early embryonic development. Contributed by Thomas Carter, Cédric Feschotte, and Arian Smit.

References

1. Bellinger, M. R., Paudel, R., Starnes, S., Kambic, L., Kantar, M. B., Wolfgruber, T., Lamour, K., Geib, S., Sim, S., Miyasaka, S. C., Helmkampf, M., & Shintaku, M. (2020). Taro Genome Assembly and Linkage Map Reveal QTLs for Resistance to Taro Leaf Blight. G3 (Bethesda, Md.), 10(8), 2763–2775. https://doi.org/10.1534/g3.120.401367

2. Nishii, K., Hart, M., Kelso, N., Barber, S., Chen, Y. Y., Thomson, M., Trivedi, U., Twyford, A. D., & Möller, M. (2022). The first genome for the Cape Primrose Streptocarpus rexii (Gesneriaceae), a model plant for studying meristem-driven shoot diversity. Plant Direct, 6(4), e388. https://doi.org/10.1002/pld3.388

3. Vargas-Chavez, C., Longo Pendy, N. M., Nsango, S. E., Aguilera, L., Ayala, D., & González, J. (2022). Transposable element variants and their potential adaptive impact in urban populations of the malaria vector Anopheles coluzzii. Genome Research, 32(1), 189–202. https://doi.org/10.1101/gr.275761.121

4. Wersebe, M. J., Sherman, R. E., Jeyasingh, P. D., & Weider, L. J. (2022). The roles of recombination and selection in shaping genomic divergence in an incipient ecological species complex. Molecular Ecology, 10.1111/mec.16383. Advance online publication. https://doi.org/10.1111/mec.16383

5. Sproul, J. S., Hotaling, S., Heckenhauer, J., Powell, A., Larracuente, A. M., Kelley, J. L., Pauls, S. U., & Frandsen, P. B. (2022). Repetitive elements in the era of biodiversity genomics: insights from 600+ insect genomes. bioRxiv, 2022.06.02.494618. https://doi.org/10.1101/2022.06.02.494618

6. Jebb, D., Huang, Z., Pippel, M., Hughes, G. M., Lavrichenko, K., Devanna, P., Winkler, S., Jermiin, L. S., Skirmuntt, E. C., Katzourakis, A., Burkitt-Gray, L., Ray, D. A., Sullivan, K. A. M., Roscito, J. G., Kirilenko, B. M., Dávalos, L. M., Corthals, A. P., Power, M. L., Jones, G., Ransome, R. D., … Teeling, E. C. (2020). Six reference-quality genomes reveal evolution of bat adaptations. Nature, 583(7817), 578–584. https://doi.org/10.1038/s41586-020-2486-3

7. Carter, T. A., Singh, M., Dumbović, G., Chobirko, J. D., Rinn, J. L., & Feschotte, C. (2022). Mosaic cis-regulatory evolution drives transcriptional partitioning of HERVH endogenous retrovirus in the human embryo. eLife, 11, e76257. https://doi.org/10.7554/eLife.76257
