Endogenous Retroviruses

Last updated: February 2, 2026

Endogenous retroviruses (ERVs) are viral sequences permanently integrated into germline DNA and inherited across generations. They constitute approximately 8% of the human genome [1] and serve as molecular fossils of ancient infections spanning tens of millions of years.

How ERVs Form

Retroviruses replicate by inserting a DNA copy of their RNA genome into host chromosomes. This occurs through a multi-step process:

  1. Reverse transcription: The viral enzyme reverse transcriptase converts viral RNA into double-stranded DNA in the cytoplasm.
  2. Nuclear import: The pre-integration complex enters the nucleus.
  3. Integration: Viral integrase inserts the proviral DNA into chromosomal DNA, creating 4-6 base pair target site duplications (TSDs) flanking the insertion [2].
  4. Germline transmission: If integration occurs in a germ cell (sperm or egg), the provirus becomes heritable.
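The target site duplication described in step 3 can be illustrated with a short check on the host sequence on either side of an insertion. This is a minimal sketch, not a published detection method: the function name `find_tsd` and the example flanking sequences are hypothetical, and real TSD calling has to tolerate mutations that accumulate after integration.

```python
from typing import Optional


def find_tsd(upstream: str, downstream: str,
             min_len: int = 4, max_len: int = 6) -> Optional[str]:
    """Return the longest 4-6 bp direct repeat that ends the upstream
    flank and begins the downstream flank, or None if there is none.

    `upstream` is host sequence immediately 5' of the provirus;
    `downstream` is host sequence immediately 3' of it (hypothetical inputs).
    """
    up, down = upstream.upper(), downstream.upper()
    # Try the longest candidate first so a 6 bp repeat wins over a 4 bp one.
    for k in range(max_len, min_len - 1, -1):
        if len(up) >= k and len(down) >= k and up[-k:] == down[:k]:
            return up[-k:]
    return None


# Made-up flanks sharing the 5 bp repeat ATGCA.
print(find_tsd("GGCTAATGCA", "ATGCATTCGA"))
```

An exact match is required here for simplicity; older insertions may need a mismatch allowance.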

Once integrated, ERVs evolve at the neutral mutation rate of the host genome (~2 × 10⁻⁹ substitutions per site per year) [3].
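A neutral rate of this kind is commonly used to estimate how long ago a provirus integrated. The sketch below rests on an assumption that is not stated in the text: the two LTRs of a provirus are identical at the moment of integration and afterwards diverge independently, so the age is roughly the LTR divergence divided by twice the rate. The function names are illustrative, and no correction for multiple substitutions at one site is applied.

```python
NEUTRAL_RATE = 2e-9  # substitutions per site per year, the figure quoted in the text


def ltr_divergence(ltr5: str, ltr3: str) -> float:
    """Fraction of mismatched positions between two aligned, equal-length LTRs."""
    if len(ltr5) != len(ltr3) or not ltr5:
        raise ValueError("LTR sequences must be aligned, non-empty, and equal in length")
    mismatches = sum(a != b for a, b in zip(ltr5.upper(), ltr3.upper()))
    return mismatches / len(ltr5)


def insertion_age_years(divergence: float, rate: float = NEUTRAL_RATE) -> float:
    """Approximate age: both LTRs accumulate changes, hence the factor of 2."""
    return divergence / (2 * rate)


# Example: 4% divergence between the two LTRs.
age = insertion_age_years(0.04)
print(f"Estimated insertion age: {age / 1e6:.1f} million years")
```

The estimate is only as good as the rate and the alignment. Gene conversion between LTRs, for example, would make an insertion look younger than it is.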

ERV Structure

A complete ERV contains the internal genes gag, pol, and env, flanked on each side by a long terminal repeat (LTR).

Most ERVs have accumulated mutations rendering them non-functional. Many exist as "solo LTRs", remnants left when recombination between the two LTRs deleted the internal genes [4].

ERVs in the Human Genome

Measure | Value | Source
Percentage of genome | ~8% | Lander et al. 2001 [1]
Total ERV insertions | ~98,000-100,000 | Multiple studies
HERV families identified | 26-31 | Vargiu et al. 2016 [5]
ERVs shared with chimpanzees | >99% | Chimpanzee Sequencing Consortium 2005 [6]

Major HERV Families

Evidence for Common Ancestry

1. Orthologous Insertions

Humans and chimpanzees share ERV insertions at identical genomic positions. The 2005 chimpanzee genome project identified 336 orthologous ERV-containing sequences between humans and chimps across syntenic chromosomes [6].

For any single ERV, the probability of independent insertion at the exact same location in two species is approximately:

P ≈ 1/(3×10⁹ × 0.01) ≈ 1 in 30 million

For tens of thousands of ERVs to have inserted independently at identical positions in both species, the joint probability becomes vanishingly small. The observed pattern is consistent with inheritance from a common ancestor.
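The arithmetic above can be checked with a few lines of Python. This is a rough sketch under the same simplifying assumptions as the formula: each insertion is treated as an independent event, and the 0.01 factor is taken directly from the formula without further interpretation. The count of 336 is the number of orthologous ERV-containing sequences cited earlier, and the function names are illustrative only.

```python
import math

GENOME_SIZE_BP = 3e9   # approximate genome size used in the formula
FACTOR = 0.01          # factor from the formula; taken as given


def single_site_probability(genome_size: float = GENOME_SIZE_BP,
                            factor: float = FACTOR) -> float:
    """Probability that one insertion lands at a given site in a second species."""
    return 1.0 / (genome_size * factor)


def log10_joint_probability(n_insertions: int, p_single: float) -> float:
    """log10 of p_single ** n_insertions, computed in log space to avoid underflow."""
    return n_insertions * math.log10(p_single)


p = single_site_probability()
print(f"Single insertion: about 1 in {1 / p:,.0f}")
print(f"336 independent matches: about 10^{log10_joint_probability(336, p):.0f}")
```

Working in log space matters here because the joint probability is far smaller than the smallest positive floating-point number. Real retroviral integration is not uniform across the genome, so these figures are an order-of-magnitude illustration rather than a precise estimate.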

2. Phylogenetic Congruence

ERV phylogenies match established species phylogenies. Johnson and Coffin (1999) demonstrated that trees constructed from HERV sequences are consistent with primate evolutionary relationships derived from morphology and other genetic data [7].

The distribution of ERVs reflects proviral age: older insertions appear in widely divergent species, while younger insertions are limited to closely related species.
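The link between an insertion's age and its species distribution can be sketched as a lookup on a species tree: if an insertion arose once and was never lost, it must predate the most recent common ancestor of every species that carries it. The tree below encodes the standard branching order of a few catarrhine primates. The internal node names and the `PARENT` mapping are labels invented for this example.

```python
# Child -> parent links for a small primate tree (node names are hypothetical labels).
PARENT = {
    "human": "human-chimp",
    "chimpanzee": "human-chimp",
    "human-chimp": "african-apes",
    "gorilla": "african-apes",
    "african-apes": "great-apes",
    "orangutan": "great-apes",
    "great-apes": "apes",
    "gibbon": "apes",
    "apes": "catarrhines",
    "macaque": "catarrhines",
}


def path_to_root(node):
    """List the node followed by each of its ancestors up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def mrca(species):
    """Most recent common ancestor of the species carrying an insertion."""
    paths = [path_to_root(s) for s in species]
    shared = set(paths[0]).intersection(*paths[1:])
    # The first shared node on the way to the root is the most recent one.
    return next(n for n in paths[0] if n in shared)


print(mrca(["human", "chimpanzee"]))          # young insertion, narrow distribution
print(mrca(["human", "gorilla", "macaque"]))  # old insertion, wide distribution
```

This single-origin, no-loss model is a simplification. Lineage-specific deletions and incomplete lineage sorting can both produce exceptions.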

3. Species-Specific Insertions

Some ERV insertions occurred after lineages diverged:

Lineage | Species-Specific HERV-K Insertions
Human | 73 (7 full-length, 66 solo LTRs) [6]
Chimpanzee | 45 (1 full-length, 44 solo LTRs) [6]

Twelve of the 29 human-specific HERV-K elements are polymorphic in modern human populations, indicating recent insertion [8].

4. The PtERV1 Example

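PtERV1 (Pan troglodytes endogenous retrovirus 1) has over 200 copies in the chimpanzee genome, more than half still full-length. PtERV1-like elements are present in chimpanzees, gorillas, and Old World monkeys such as the rhesus macaque and olive baboon.

PtERV1 is absent in humans, orangutans, and gibbons [6]. Because humans lack it while both African apes and more distantly related monkeys carry it, the pattern is better explained by separate germline invasions in those lineages after the human-chimpanzee divergence than by inheritance from a common ancestor of the African great apes, which would also have passed it to humans.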

Functional Co-option

Natural selection has repurposed some ERV elements for host functions.

Syncytins: Placental Development

In 2000, Mi et al. reported that Syncytin-1, derived from the HERV-W envelope gene, is involved in human placental development [9]. The protein mediates fusion of cytotrophoblasts into the syncytiotrophoblast layer.

Syncytin | Origin | Species | Integration Time
Syncytin-1 | HERV-W env | Catarrhine primates | >25 mya
Syncytin-2 | HERV-FRD env | Catarrhine primates | >40 mya
Syncytin-A, -B | Murine ERVs | Muridae (mice, rats) | ~20 mya
Syncytin-Car1 | Carnivore ERV | Carnivora (dogs, cats) | ~85 mya
Syncytin-Rum1 | Ruminant ERV | Ruminantia | ~30 mya

Syncytins have been independently captured at least 10 times across mammalian evolution, a striking example of convergent molecular evolution [10].

Arc: Neuronal Communication

The Arc (Activity-regulated cytoskeleton-associated) gene derives from a Ty3/gypsy retrotransposon Gag protein. Arc self-assembles into virus-like capsids that encapsulate RNA and mediate intercellular transfer between neurons [11].

Arc is essential for synaptic plasticity, long-term potentiation, and memory consolidation. It is present across all tetrapods, indicating co-option occurred hundreds of millions of years ago.

Regulatory Elements

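ERV LTRs contain transcription factor binding sites and have been co-opted as promoters and enhancers. Reported examples include HERV-H transcription in human embryonic stem cells [12] and species-specific ERV enhancers in the placenta [13].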

Contemporary Endogenization: Koala Retrovirus

The koala retrovirus (KoRV) provides a real-time example of endogenization. KoRV-A is estimated to have begun entering the koala germline sometime between about 50,000 and 120 years ago, making it the youngest known endogenizing retrovirus [14].

Geographic Distribution

Population | KoRV Status | Copy Number
Northern Australia (Queensland) | Endogenous in all individuals | ~70 copies/genome
Southern Australia (Victoria) | 25.8% positive | <1 copy average

A 2024 study analyzing 111 pedigreed koalas documented both elimination of ERV insertions from the population (714 integrations lost) and de novo germline integrations (21 new insertions absent in parents) [15]. This demonstrates ERV dynamics occurring on observable timescales.
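The logic of a pedigree comparison can be sketched with set operations on insertion sites. This is an illustration of the idea only, not the cited study's pipeline. Representing each insertion as a (chromosome, position) pair, the function names, and the example coordinates are all assumptions made here.

```python
def de_novo_insertions(child: set, mother: set, father: set) -> set:
    """Insertions seen in the offspring but in neither parent."""
    return child - (mother | father)


def untransmitted_insertions(child: set, mother: set, father: set) -> set:
    """Parental insertions that this particular offspring did not inherit."""
    return (mother | father) - child


# Made-up insertion sites for one trio.
mother = {("chr1", 1_200), ("chr2", 8_450), ("chr5", 300)}
father = {("chr1", 1_200), ("chr3", 9_990)}
child = {("chr1", 1_200), ("chr3", 9_990), ("chr7", 42)}

print(de_novo_insertions(child, mother, father))
print(untransmitted_insertions(child, mother, father))
```

An insertion missing from one offspring is not yet lost from the population. Calling a population-level loss requires tracking it across many individuals and generations, and real data also need sequencing-error and coverage filters.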

ERVs and Disease

Aberrant ERV expression has been implicated in several conditions:

ERV | Condition | Mechanism
HERV-W (MSRV) | Multiple Sclerosis | Envelope protein triggers inflammation [16]
HERV-K | Amyotrophic Lateral Sclerosis | Aberrant expression in motor neurons [17]
Various HERVs | Cancers | Reactivation in tumor cells, potential oncogenic roles [18]
KoRV | Koala lymphoma, leukemia | Insertional mutagenesis, immunosuppression

HERV-K: The Youngest Human ERVs

The HERV-K (HML-2) family contains the most recently active human ERVs [19].

Summary

ERVs provide multiple independent lines of evidence consistent with common ancestry:

  1. Shared insertions: Over 99% of human ERVs are found at identical positions in chimpanzees.
  2. Phylogenetic consistency: ERV distributions match species trees derived from other data.
  3. Molecular decay: Accumulated mutations correlate with estimated divergence times.
  4. Functional co-option: Independent capture of syncytins across mammalian lineages demonstrates ongoing evolutionary processes.
  5. Contemporary observation: KoRV documents endogenization in real time.

References

  1. Lander ES, et al. (2001). Initial sequencing and analysis of the human genome. Nature 409:860-921.
  2. Craigie R, Bushman FD. (2012). HIV DNA integration. Cold Spring Harbor Perspectives in Medicine.
  3. Nachman MW, Crowell SL. (2000). Estimate of the mutation rate per nucleotide in humans. Genetics 156:297-304.
  4. Hughes JF, Coffin JM. (2004). Human endogenous retrovirus K solo-LTR formation and insertional polymorphisms. PNAS 101:1668-1672.
  5. Vargiu L, et al. (2016). Classification and characterization of human endogenous retroviruses. Retrovirology 13:7.
  6. Chimpanzee Sequencing and Analysis Consortium. (2005). Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437:69-87.
  7. Johnson WE, Coffin JM. (1999). Constructing primate phylogenies from ancient retrovirus sequences. PNAS 96:10254-10260.
  8. Wildschutte JH, et al. (2016). Discovery of unfixed endogenous retrovirus insertions in diverse human populations. PNAS 113:E2326-2334.
  9. Mi S, et al. (2000). Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis. Nature 403:785-789.
  10. Lavialle C, et al. (2013). Paleovirology of 'syncytins', retroviral env genes exapted for a role in placentation. Philosophical Transactions of the Royal Society B 368:20120507.
  11. Pastuzyn ED, et al. (2018). The neuronal gene Arc encodes a repurposed retrotransposon Gag protein that mediates intercellular RNA transfer. Cell 172:275-288.
  12. Santoni FA, et al. (2012). HERV-H RNA is abundant in human embryonic stem cells and a precise marker for pluripotency. Retrovirology 9:111.
  13. Chuong EB, et al. (2013). Endogenous retroviruses function as species-specific enhancer elements in the placenta. Nature Genetics 45:325-329.
  14. Tarlinton RE, et al. (2006). Real-time reverse transcriptase PCR for the endogenous koala retrovirus reveals an association between plasma viral load and neoplastic disease. Journal of Virology 80:3401-3407.
  15. Hogg CJ, et al. (2024). Multi-generational pedigree analysis reveals ongoing endogenous retrovirus dynamics in koalas. Nature Communications.
  16. Kremer D, et al. (2019). The role of HERV-W in multiple sclerosis. Journal of NeuroVirology.
  17. Li W, et al. (2015). Human endogenous retrovirus-K contributes to motor neuron disease. Science Translational Medicine.
  18. Kassiotis G. (2014). Endogenous retroviruses and the development of cancer. Journal of Immunology.
  19. Subramanian RP, et al. (2011). Identification, characterization, and comparative genomic distribution of the HERV-K (HML-2) group of human endogenous retroviruses. Retrovirology 8:90.