Humans carry broken vitamin C genes

Overview

  • Humans and all other simian primates lack a functional copy of the GULO gene, which encodes the enzyme required for the final step of vitamin C biosynthesis. The remnant of this gene—a pseudogene—sits on human chromosome 8 with the same pattern of missing exons found in chimpanzees, gorillas, and orangutans.
  • The identical exon deletions and accumulated mutations shared across these species are best explained by inheritance from a common ancestor that lost GULO function approximately 40–65 million years ago, before the haplorrhine primates diverged.
  • Guinea pigs also cannot synthesize vitamin C, but their GULO pseudogene carries a different pattern of damage, indicating an independent loss event and arguing against the idea that the shared primate pattern is simply what any loss of GULO function would produce.

Most mammals produce their own vitamin C. Dogs, cats, cows, rats, and mice all manufacture L-ascorbic acid internally through a four-enzyme biochemical pathway, the last step of which is catalyzed by an enzyme called L-gulonolactone oxidase, encoded by the GULO gene.1 Humans, however, cannot perform this final step. We carry a broken copy of the GULO gene—a pseudogene designated GULOP—that is riddled with deletions and mutations rendering it incapable of producing a functional enzyme.2, 3 This is why humans must obtain vitamin C from their diet, and why prolonged deficiency causes scurvy, a disease that killed millions of sailors before its cause was understood.1

The broken GULO gene is far more than a metabolic curiosity. When researchers compared the pseudogene sequences of humans, chimpanzees, gorillas, orangutans, and macaques, they found the same pattern of missing exons and the same accumulated point mutations in every species examined.4, 5 This shared pattern of damage is one of the most compelling pieces of molecular evidence for the common ancestry of primates, because the probability of identical complex mutations arising independently in multiple lineages is vanishingly small.5, 6

The vitamin C biosynthesis pathway

Vitamin C (L-ascorbic acid) is an essential cofactor in numerous enzymatic reactions, including the hydroxylation of collagen, the biosynthesis of carnitine and certain neurotransmitters, and the regulation of gene expression. It also serves as a potent antioxidant.1 In most mammals, vitamin C is synthesized in the liver through a four-step pathway that converts D-glucose to L-ascorbic acid. The final and rate-limiting step of this pathway is the oxidation of L-gulonolactone to L-ascorbic acid, catalyzed by the enzyme L-gulonolactone oxidase (GULO).1, 2

The GULO gene in species that retain vitamin C synthesis—such as rats, mice, and most other mammals—contains 12 exons and encodes a protein of approximately 440 amino acids. The gene is highly conserved across vertebrates that maintain this function, with amino acid identity ranging from 64% to 95% among species with functional copies.7 This strong conservation reflects the importance of the enzyme: natural selection acts to preserve its sequence because organisms that lose it must obtain vitamin C externally or face serious physiological consequences.7

Despite its importance, the ability to synthesize vitamin C has been lost independently in several vertebrate lineages. In addition to haplorrhine primates (which include humans, apes, Old World monkeys, and New World monkeys), the capacity has been lost in guinea pigs, some bat species, and certain passerine birds.6, 8 In every case studied, the loss is due to mutations that disable the GULO gene specifically, leaving the upstream enzymes in the pathway intact.6

The human GULO pseudogene

In 1991, Morimitsu Nishikimi and Kunio Yagi at Aichi Medical University in Japan established the molecular basis for the human inability to synthesize vitamin C. By comparing human genomic DNA with the known rat GULO gene sequence, they demonstrated that humans possess a recognizable but nonfunctional homologue of the gene, a pseudogene, on the short arm of chromosome 8 (band 8p21.1).2 Three years later, Nishikimi and colleagues cloned and mapped this pseudogene in detail, confirming that it retains some exonic sequences from the ancestral gene but has lost several exons entirely.3

In 2003, Yumiko Inai, Yuri Ohta, and Morimitsu Nishikimi published the complete genomic structure of the human GULOP pseudogene. Compared to the 12 exons present in the functional rat GULO gene, the human pseudogene retains only five recognizable exons (corresponding to exons 4, 7, 9, 10, and 12 of the rat gene). The remaining seven exons—1, 2, 3, 5, 6, 8, and 11—have been deleted or degraded beyond recognition.9 The surviving exons, moreover, have accumulated numerous point mutations including premature stop codons, frameshift mutations, and nonconservative amino acid substitutions that would prevent the gene from producing a functional protein even if its missing exons were somehow restored.9

The human GULOP pseudogene spans approximately 17 kilobases of genomic DNA and is flanked by repetitive elements, including Alu sequences and other transposable elements that have inserted into the degraded gene over millions of years.9 The pseudogene is transcriptionally silent—it produces no mRNA and no protein—making it a true molecular fossil: a remnant of a once-functional gene preserved in the genome by the slow pace of neutral DNA deletion.2, 9

Shared mutations across primates

The most significant finding for evolutionary biology came when researchers compared the GULO pseudogene across multiple primate species. In 1999, Yuri Ohta and Morimitsu Nishikimi sequenced the remaining exonic regions of the GULO pseudogene in chimpanzees, orangutans, and macaques and compared them with the human sequence. They found that all four species share the same pattern of missing exons and, crucially, have accumulated many of the same point mutations in the surviving exonic sequences.5

Ohta and Nishikimi reported that the nucleotide substitution rate in the primate GULO pseudogene sequences was consistent with the neutral rate of molecular evolution—exactly what would be expected for a nonfunctional sequence freed from the constraints of natural selection.5 The pattern of substitutions followed the neutral theory prediction: transitions (purine-to-purine or pyrimidine-to-pyrimidine changes) outnumbered transversions (purine-to-pyrimidine or vice versa) at the ratio expected for randomly drifting DNA, and the rate of nonsynonymous substitutions equaled that of synonymous substitutions—a hallmark of sequences no longer encoding a functional protein.5
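The transition/transversion distinction described above can be made concrete with a short sketch. The two sequences below are hypothetical toy strings, not GULOP data, and the function names are illustrative only.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a: str, b: str) -> str:
    """Return 'transition' or 'transversion' for a change from base a to base b."""
    a, b = a.upper(), b.upper()
    if a == b:
        raise ValueError("identical bases are not a substitution")
    # A transition stays within one chemical class; a transversion crosses classes.
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_ts_tv(seq1: str, seq2: str) -> tuple[int, int]:
    """Count (transitions, transversions) between two aligned, equal-length sequences.

    Positions containing gaps or ambiguous bases are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if classify_substitution(a, b) == "transition":
            ts += 1
        else:
            tv += 1
    return ts, tv


# Hypothetical toy alignment (NOT real pseudogene sequence).
seq_a = "ACGTACGTAC"
seq_b = "GCGTATGTCC"
print(count_ts_tv(seq_a, seq_b))  # (2, 1): A>G and C>T are transitions, A>C is a transversion
```

A real analysis would work from a curated alignment and would also correct for multiple substitutions at the same site, which this sketch ignores.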

A 2024 study by Lachapelle and Bhatt extended this analysis, examining the GULO pseudogene region across a broader range of primates and incorporating the Neanderthal sequence from the high-coverage Altai genome. They confirmed that the same chromosome 8 inversion and exon mutations are conserved across haplorrhine primates, Neanderthals included, reinforcing the conclusion that the gene was inactivated once in the common ancestor of all haplorrhine primates and has been inherited as a shared broken sequence ever since.10

Exons retained in the GULO pseudogene across species3, 4, 9

Exon (rat numbering) | Human             | Chimpanzee        | Orangutan         | Guinea pig
1                    | Deleted           | Deleted           | Deleted           | Not identified
2                    | Deleted           | Deleted           | Deleted           | Present
3                    | Deleted           | Deleted           | Deleted           | Present
4                    | Present (mutated) | Present (mutated) | Present (mutated) | Present
5                    | Deleted           | Deleted           | Deleted           | Not identified
6                    | Deleted           | Deleted           | Deleted           | Present
7                    | Present (mutated) | Present (mutated) | Present (mutated) | Present
8                    | Deleted           | Deleted           | Deleted           | Present
9                    | Present (mutated) | Present (mutated) | Present (mutated) | Present
10                   | Present (mutated) | Present (mutated) | Present (mutated) | Present
11                   | Deleted           | Deleted           | Deleted           | Present
12                   | Present (mutated) | Present (mutated) | Present (mutated) | Present

The guinea pig comparison

Guinea pigs (Cavia porcellus) are among the few non-primate mammals that cannot synthesize vitamin C, a fact known since the early twentieth century and exploited in scurvy research for decades. In 1992, Nishikimi, Kawai, and Yagi cloned and characterized the guinea pig GULO gene and found that it, too, exists as a nonfunctional pseudogene—but one with a completely different pattern of damage from the primate version.4

The guinea pig pseudogene retains most of its exons but is riddled with point mutations throughout the coding sequence, including three premature stop codons and numerous nonconservative amino acid changes. The regions corresponding to exons 1 and 5 of the rat gene could not be identified, suggesting deletion or extreme degradation. Critically, the damage does not match the primate pattern: primates have lost seven exons, five of which (2, 3, 6, 8, and 11) are still present in the guinea pig, and the stop codons and nucleotide substitutions in the guinea pig sequence lie at different positions from those in the primate sequence.4, 9

This contrast is precisely what evolutionary theory predicts. If humans and guinea pigs lost GULO function independently—as the phylogenetic evidence strongly indicates, since the rodent and primate lineages diverged approximately 85–90 million years ago—then each lineage should have accumulated its own unique set of random mutations in the nonfunctional gene after inactivation.6, 7 Conversely, species that share a more recent common ancestor (such as humans and chimpanzees, which diverged roughly 6–7 million years ago) should share the same mutations inherited from that ancestor. This is exactly what the data show: humans and chimpanzees share the same broken GULO, while guinea pigs have a differently broken GULO.4, 5
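The exon comparison can be checked mechanically. The sketch below encodes the exon table above as Python sets (rat numbering) and compares the missing exons between species; the variable and function names are illustrative, and "Not identified" is treated as missing for the purpose of the comparison.

```python
ALL_EXONS = set(range(1, 13))  # the functional rat GULO gene has 12 exons

# Exons still recognizable in each species, taken from the exon table in this article.
retained = {
    "human": {4, 7, 9, 10, 12},
    "chimpanzee": {4, 7, 9, 10, 12},
    "orangutan": {4, 7, 9, 10, 12},
    "guinea pig": ALL_EXONS - {1, 5},  # exons 1 and 5 could not be identified
}


def missing(species: str) -> set[int]:
    """Exons of the 12-exon rat gene that are absent or unidentified in a species."""
    return ALL_EXONS - retained[species]


primates = ["human", "chimpanzee", "orangutan"]
identical = all(missing(s) == missing("human") for s in primates)
print("Primates share one set of missing exons:", identical)

# Exons lost in primates but still present in the guinea pig.
primate_only = missing("human") - missing("guinea pig")
print("Lost in primates, present in guinea pig:", sorted(primate_only))

for species, exons in retained.items():
    print(f"{species}: {len(exons)}/12 exons retained")
```

Exon presence is a coarse measure. The stronger evidence discussed in the text is the sharing of individual point mutations, which an exon-level comparison like this one does not capture.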

Number of original GULO exons retained in selected species (out of 12)4, 7, 9

Rat (functional): 12/12
Guinea pig: 10/12
Human: 5/12
Chimpanzee: 5/12
Orangutan: 5/12

Pseudogenes as molecular fossils

The GULO pseudogene is an example of a broader category of genomic evidence for common ancestry: shared pseudogenes. A pseudogene is a DNA sequence that resembles a known functional gene but has been rendered nonfunctional by mutations such as deletions, insertions, frameshifts, or premature stop codons. Because pseudogenes are no longer subject to the purifying force of natural selection (which normally removes harmful mutations from functional genes), they accumulate random changes at the neutral rate of mutation.11

The logic of shared pseudogenes as evidence for common ancestry is straightforward. When two species share not only the same pseudogene at the same chromosomal location but also the same disabling mutations within that pseudogene, the most parsimonious explanation is that the gene was inactivated once in a common ancestor and inherited by both descendant lineages. The alternative—that the exact same complex set of mutations arose independently in two lineages by chance—becomes increasingly improbable as the number of shared mutations increases.5, 6

To appreciate why, consider the mathematics. A single nucleotide position can mutate to any of three alternative bases. If two lineages each independently mutate at the same position, the probability that both change to the same base is roughly one in three, assuming each alternative is equally likely. For the same match to occur at ten positions, the probability drops to (1/3)^10, or approximately one in 59,000, and that is before accounting for the much smaller chance that both lineages would mutate at the same ten positions in the first place. The human and chimpanzee GULO pseudogenes share dozens of identical mutations in their surviving exonic sequences, making independent origin statistically untenable.5
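The arithmetic in this paragraph is easy to reproduce. The following sketch uses the same simplifying assumption as the text (three equally likely replacement bases at each site, sites independent); the function name is illustrative.

```python
from fractions import Fraction


def same_base_probability(n_sites: int, alternatives: int = 3) -> Fraction:
    """Probability that two lineages mutating at the same n sites independently
    arrive at the same replacement base at every one of those sites."""
    return Fraction(1, alternatives) ** n_sites


for n in (1, 10, 20, 30):
    p = same_base_probability(n)
    print(f"{n:>2} shared sites: 1 in {p.denominator:,}")
```

For ten sites this gives 1 in 59,049, the "one in 59,000" figure quoted above. Real mutation is not uniform (transitions are more common than transversions), so the exact numbers would shift, but the exponential decline with the number of shared sites is the point of the argument.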

Other shared pseudogenes

The GULO pseudogene is far from the only shared pseudogene that provides evidence for primate common ancestry. The human genome contains approximately 20,000 pseudogenes, many of which are shared with other primates in patterns consistent with the known phylogeny.11 Several of the best-studied examples reinforce the same evolutionary conclusion.

The olfactory receptor (OR) gene family offers a particularly striking case. The human genome contains roughly 800 OR genes, but approximately 60% of them are pseudogenes, a much higher proportion than in other great apes, where roughly 28–32% are pseudogenes.12, 13 Yoav Gilad and colleagues at the Max Planck Institute compared OR genes between humans and other primates and found that many of the same OR genes are pseudogenized in humans and chimpanzees but functional in more distantly related species. The pattern of shared OR pseudogenes tracks the known phylogenetic tree: humans and chimpanzees share the most pseudogenized OR genes, with progressively fewer shared with gorillas, orangutans, and Old World monkeys.12

The loss of GULO function in bats provides a further evolutionary parallel. Jie Cui and colleagues examined GULO genes from 16 bat species and found that vitamin C synthesis ability had been lost multiple times independently within the order Chiroptera, with different bat lineages carrying different patterns of GULO pseudogenization. Some bat species retain fully functional GULO genes, while closely related species carry pseudogenes with lineage-specific mutations—a pattern of progressive pseudogenization that mirrors the primate story on a smaller timescale.8

Timing of the loss

The phylogenetic distribution of the broken GULO gene allows researchers to estimate when the loss occurred. All haplorrhine primates examined to date—including humans, great apes, Old World monkeys (such as macaques), and New World monkeys (such as marmosets)—share the same nonfunctional GULO pseudogene with the same missing exons.5, 10 By contrast, strepsirrhine primates (lemurs and lorises) retain a functional GULO gene and can synthesize vitamin C.10

This distribution places the inactivating event after the divergence of the haplorrhine and strepsirrhine primate lineages, which molecular clock estimates place at approximately 65–75 million years ago, and no later than the base of the haplorrhine radiation roughly 55–65 million years ago.6, 10 The loss likely became fixed in the ancestral haplorrhine population because these primates consumed a fruit-rich diet that provided abundant dietary vitamin C, relaxing the selective pressure to maintain endogenous synthesis. Once the gene became nonfunctional, it was free to accumulate further mutations without consequence, producing the degraded pseudogene observed today.6, 7
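The inference that the gene was inactivated once among primates can be illustrated with a small parsimony calculation. The sketch below applies Fitch's algorithm to a simplified tree. The topology, the choice of species, and all names are assumptions made for illustration (tarsiers and many other taxa are omitted); the functional or broken state of each species is the one given in the text.

```python
# State of GULO in each species, as described in this article.
states = {
    "rat": "functional", "guinea pig": "broken",
    "lemur": "functional", "loris": "functional",
    "marmoset": "broken", "macaque": "broken", "orangutan": "broken",
    "gorilla": "broken", "chimpanzee": "broken", "human": "broken",
}

# Simplified binary trees written as nested pairs (assumed topology).
haplorrhines = ("marmoset", ("macaque", ("orangutan", ("gorilla", ("chimpanzee", "human")))))
primates = (("lemur", "loris"), haplorrhines)
mammals = (("rat", "guinea pig"), primates)


def fitch(tree, states):
    """Fitch parsimony on a binary tree.

    Returns (set of most-parsimonious states at this node,
             minimum number of state changes in the subtree).
    """
    if isinstance(tree, str):  # leaf: its observed state, no changes needed
        return {states[tree]}, 0
    left, right = tree
    left_states, left_cost = fitch(left, states)
    right_states, right_cost = fitch(right, states)
    common = left_states & right_states
    if common:  # children agree: no extra change at this node
        return common, left_cost + right_cost
    # children disagree: one change is required on a branch below this node
    return left_states | right_states, left_cost + right_cost + 1


print(fitch(primates, states)[1])  # 1: a single change explains all the primates
print(fitch(mammals, states)[1])   # 2: the guinea pig requires a second, separate change
```

Fitch parsimony counts state changes without assigning a direction. Reading them as losses relies on the observation, made earlier in the article, that a functional GULO gene is the condition found in most mammals.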

The neutral rate of nucleotide substitution can be used as a rough molecular clock to confirm this timing. Ohta and Nishikimi calculated that the rate of substitution in the primate GULO pseudogene was consistent with approximately 40 million years of neutral drift, which aligns well with the divergence dates of the major haplorrhine lineages.5 More recent analyses incorporating broader genomic data have refined the estimate but have not fundamentally changed the conclusion: the GULO gene was lost once, early in haplorrhine primate evolution, and all living haplorrhines inherited the same broken copy.10
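The molecular-clock reasoning can be sketched with the standard relationship between pairwise divergence and time, t = d / (2μ), where d is the number of substitutions per site separating two lineages and μ is the neutral substitution rate per site per year. Both input values below are hypothetical round numbers chosen for illustration; they are not taken from Ohta and Nishikimi's data, and the function name is illustrative.

```python
def divergence_time_years(subs_per_site: float, rate_per_site_per_year: float) -> float:
    """Estimate the time since two lineages split from their pairwise divergence.

    Both lineages accumulate substitutions independently, so the divergence d
    between them grows at twice the per-lineage rate: d = 2 * mu * t.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return subs_per_site / (2.0 * rate_per_site_per_year)


# Hypothetical inputs (assumptions, not measured values):
d = 0.08     # substitutions per site between two pseudogene copies
mu = 1e-9    # neutral substitutions per site per year, a round illustrative rate

t = divergence_time_years(d, mu)
print(f"Estimated divergence: {t / 1e6:.0f} million years")
```

This simple form ignores multiple substitutions at the same site and variation in rate among lineages, both of which a published estimate would need to address.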

Evolutionary significance

The broken vitamin C gene in humans and other primates carries a significance that extends well beyond the biology of a single metabolic pathway. It represents one of the clearest and most intuitive demonstrations of how shared genomic errors reveal shared ancestry. Unlike similarities in anatomy or physiology, which could in principle arise through convergent evolution driven by similar environmental pressures, shared pseudogene mutations are selectively neutral and therefore carry no adaptive explanation for their similarity. The only plausible explanation for why humans, chimpanzees, gorillas, and orangutans all carry the same set of broken exons, at the same chromosomal location, with the same accumulated point mutations, is that they inherited this damaged sequence from a common ancestor.5, 6, 10

The guinea pig comparison makes the argument even stronger. If the shared primate mutations were somehow the result of a mutational "hotspot" that made certain mutations especially likely to occur, we would expect guinea pigs—which also lost GULO function—to show the same pattern of damage. They do not. The guinea pig pseudogene is broken in an entirely different way, exactly as predicted by the hypothesis that each lineage independently lost the gene and subsequently accumulated its own unique set of random mutations.4, 6

The GULO pseudogene thus serves as a particularly elegant piece of evidence in the broader mosaic of molecular data supporting human evolution and common ancestry. It is a molecular fossil—a genomic scar left by an ancient mutational event—that has been faithfully copied and passed down through tens of millions of years of primate evolution, recording in its sequence the branching pattern of the primate family tree.5, 9, 10

References

1. Molecular basis for the deficiency in humans of gulonolactone oxidase, a key enzyme for ascorbic acid biosynthesis
   Nishikimi, M. & Yagi, K. · American Journal of Clinical Nutrition 54(6): 1203S–1208S, 1991

2. A human enzyme that loses its activity during the evolutionary process: L-gulono-γ-lactone oxidase
   Nishikimi, M., Fukuyama, R., Minoshima, S., Shimizu, N. & Yagi, K. · Vitamins (Japan) 67: 155–160, 1993

3. Cloning and chromosomal mapping of the human nonfunctional gene for L-gulono-γ-lactone oxidase, the enzyme for L-ascorbic acid biosynthesis missing in man
   Nishikimi, M., Fukuyama, R., Minoshima, S., Shimizu, N. & Yagi, K. · Journal of Biological Chemistry 269: 13685–13688, 1994

4. Guinea pigs possess a highly mutated gene for L-gulono-γ-lactone oxidase, the key enzyme for L-ascorbic acid biosynthesis missing in this species
   Nishikimi, M., Kawai, T. & Yagi, K. · Journal of Biological Chemistry 267: 21967–21972, 1992

5. Random nucleotide substitutions in primate nonfunctional gene for L-gulono-γ-lactone oxidase, the missing enzyme in L-ascorbic acid biosynthesis
   Ohta, Y. & Nishikimi, M. · Biochimica et Biophysica Acta 1472: 408–411, 1999

6. The genetics of vitamin C loss in vertebrates
   Drouin, G., Godin, J.-R. & Pagé, B. · Current Genomics 12(5): 371–378, 2011

7. Conserved or lost: molecular evolution of the key gene GULO in vertebrate vitamin C biosynthesis
   Yang, H. · Biochemical Genetics 51: 413–425, 2013

8. Progressive pseudogenization: vitamin C synthesis and its loss in bats
   Cui, J., Pan, Y.-H., Zhang, Y., Jones, G. & Zhang, S. · Molecular Biology and Evolution 28(2): 1025–1031, 2011

9. The whole structure of the human nonfunctional L-gulono-γ-lactone oxidase gene--the gene responsible for scurvy--and the evolution of repetitive sequences thereon
   Inai, Y., Ohta, Y. & Nishikimi, M. · Journal of Nutritional Science and Vitaminology 49(5): 315–319, 2003

10. Conservation of a chromosome 8 inversion and exon mutations confirm common gulonolactone oxidase gene evolution among primates, including H. neanderthalensis
    Lachapelle, M. Y. & Bhatt, P. R. · Journal of Molecular Evolution 92: 338–350, 2024

11. Identification and analysis of unitary pseudogenes: historic and contemporary gene losses in humans and other primates
    Zhang, Z. D. et al. · Genome Biology 11: R26, 2010

12. Human specific loss of olfactory receptor genes
    Gilad, Y., Man, O., Pääbo, S. & Lancet, D. · PNAS 100(6): 3324–3327, 2003

13. A comparison of the human and chimpanzee olfactory receptor gene repertoires
    Go, Y. & Niimura, Y. · Genome Research 15(2): 224–230, 2005
