Phylogenetic analysis of the Friedreich ataxia GAA trinucleotide repeat

Cristina M. Justice, LSU Health Sciences Center - New Orleans
Zhining Den, LSU Health Sciences Center - New Orleans
Son V. Nguyen, LSU Health Sciences Center - New Orleans
Mark Stoneking, Max Planck Institute for Evolutionary Anthropology
Prescott L. Deininger, Tulane University School of Public Health and Tropical Medicine
Mark A. Batzer, LSUHSC School of Medicine
Bronya J.B. Keats, LSUHSC Neuroscience Center

Abstract

Friedreich ataxia is an autosomal recessive neurodegenerative disorder associated with a GAA repeat expansion in the first intron of the gene (FRDA) encoding a novel, highly conserved, 210 amino acid protein known as frataxin. Normal variation in repeat size was determined by analysis of more than 600 DNA samples from seven human populations. This analysis showed that the most frequent allele had nine GAA repeats, and no alleles with fewer than five GAA repeats were found. The European and Syrian populations had the highest percentage of alleles with 10 or more GAA repeats, while the Papua New Guinea population did not have any alleles carrying more than 10 GAA repeats. The distributions of repeat sizes in the European, Syrian, and African American populations were significantly different from those in the Asian and Papua New Guinea populations (p < 0.001). The GAA repeat size was also determined in six nonhuman primate species. Samples from 10 chimpanzees, 3 orangutans, 1 gorilla, 1 rhesus macaque, 1 mangabey, and 1 tamarin were analyzed. Among those primates belonging to the Pongidae family, the chimpanzees were found to carry three or four GAA repeats, the orangutans had four or five GAA repeats, and the gorilla carried three GAA repeats. In primates belonging to the Cercopithecidae family, three GAA repeats were found in the mangabey and two in the rhesus macaque. In the rhesus macaque, however, an AluY subfamily member had inserted into the poly(A) tract preceding the GAA repeat region, making the amplified sequence approximately 300 bp longer. The GAA repeat was also found in the tamarin, suggesting that it arose at least 40 million years ago and remained relatively small throughout the majority of primate evolution, with a punctuated expansion in the human genome.