LINE-1 preTa elements in the human genome

Abdel Halim Salem, Louisiana State University
Jeremy S. Myers, Louisiana State University
Anthony C. Otieno, Louisiana State University
W. Scott Watkins, University of Utah Health Sciences
Lynn B. Jorde, University of Utah Health Sciences
Mark A. Batzer, Louisiana State University

Abstract

The preTa subfamily of long interspersed elements (LINEs) is characterized by a three-base-pair "ACG" sequence in the 3′ untranslated region and contains approximately 400 members in the human genome. The subfamily has a low level of nucleotide divergence, with an estimated average age of 2.34 million years, suggesting that expansion of the L1 preTa subfamily occurred just after the divergence of humans and African apes. We have identified 362 preTa L1 elements from the draft human genomic sequence, investigated the genomic characteristics of preTa L1 insertions, and screened individual elements across diverse human populations and various non-human primate species using polymerase chain reaction (PCR) assays to determine the phylogenetic origin and levels of human genomic diversity associated with the L1 elements. All of the preTa L1 elements analyzed by PCR were absent from the orthologous positions in non-human primate genomes, and 33 (14%) of the L1 elements were polymorphic with respect to insertion presence or absence in the human genome. The newly identified L1 insertion polymorphisms will prove useful as identical-by-descent genetic markers for the study of human population genetics. We provide evidence that preTa L1 elements show an integration site preference for genomic regions with low GC content. Computational analysis of the preTa L1 elements revealed that 29% of the elements amenable to complete sequence analysis have apparently escaped 5′ truncation and are essentially full-length (approximately 6 kb). In all, 29 elements have two intact open reading frames and may be capable of retrotransposition. © 2003 Elsevier Science Ltd. All rights reserved.
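
The integration site preference described in the abstract depends on measuring the GC content of the sequence flanking each insertion. The abstract does not describe how this was computed, so the following is only a minimal sketch of one plausible calculation, not the authors' method. The function names `gc_content` and `flanking_gc`, the choice to skip ambiguous bases such as N, and the example sequences are all assumptions made for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored; this is an illustrative
    assumption, not a rule taken from the paper.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


def flanking_gc(upstream: str, downstream: str) -> float:
    """GC fraction of the two flanks of an insertion, pooled together.

    The caller chooses how much flanking sequence to pass in; no window
    size is implied here.
    """
    return gc_content(upstream + downstream)


if __name__ == "__main__":
    # Hypothetical flanking sequences, made up for this example.
    print(flanking_gc("ATTTAGCATA", "TTAACGTATT"))
```

Comparing such values for insertion flanks against a genome-wide baseline would be one way to test for a low-GC preference; the baseline and any statistical test are left open here.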