What Powers the Body’s Powerhouses (DNA)?

The Weekly Gene: POLG

Who would have thought that the remnants of an ancient bacterial infection would still be affecting us today? Nearly two billion years ago, our very distant ancestors (then just tiny unicellular organisms) became infected by a bacterium. After eons of coevolution, the descendants of that bacterium have morphed into what we now know as mitochondria. They’ve long since lost their independence, but they still retain vestiges of their individuality. They even have their own DNA that’s separate from the rest of your genome! To maintain this independent DNA, mitochondria have to be able to replicate it and pass it on when new mitochondria form. To do this, they call upon the POLG gene.

Mitochondria are in nearly every cell of your body. Functionally, you can think of them like little organs inside your cells, which is why scientists classify them as organelles. They’re sometimes called the “powerhouses” of the cells, owing to their role in producing energy. Sugars and other nutrients have a lot of energy stored in their chemical bonds. Mitochondria strip metabolites of their energy and repackage it in the form of ATP and other similarly energetic molecules. ATP is then sent throughout the cell where it helps drive various important processes.

The fact that mitochondria can pump out large amounts of energy makes them a hot commodity for cell types that need a lot of it. Cells like neurons or the muscle cells controlling your eyelids require lots of energy because they’re constantly active and burning through ATP. Aside from differences among cell types, the number of mitochondria in a cell can vary over time based on environmental conditions [1,2]. A good example is exercise: we know that prolonged exercise can stimulate muscle cells to produce more mitochondria that will then help them meet their future energy needs. Like your cells, mitochondria can duplicate themselves to make more in a process called mitochondrial biogenesis [1,2]. For the new mitochondria to function, however, they need instructions for how to build the energy-producing machinery [1].

The mitochondrial DNA in each cell contains approximately 16,500 base pairs made from the nucleotides A, C, G, and T. Within this code are 37 genes that collectively help the mitochondria build their energy-producing pipeline. But these 37 genes don’t directly help mitochondria replicate themselves; they need almost 1,500 different genes for that. These genes are found in the rest of your DNA, and their protein products are exported to the mitochondria.

One of these genes is polymerase gamma (POLG). POLG produces a protein of the same name, which belongs to a class of proteins known as polymerases. These proteins are essential for replicating DNA sequences, because they bind to the existing DNA and construct a new DNA strand based on the template strand they’re bound to. There are multiple different polymerases, but only one of them can go into mitochondria: POLG. Aside from replicating the DNA, POLG has also been shown to have error-correcting functions so that it can proofread the new strand it’s making [2].
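To make the idea of template-directed copying with proofreading concrete, here is a toy sketch. It is only an illustration: the function name `synthesize` and the `misincorporations` argument are made up for this example, strand orientation is ignored, and nothing here models POLG's real chemistry.

```python
# Toy model of template-directed DNA synthesis with a proofreading step.
# Illustration only: strand direction (5'/3') is ignored for simplicity.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize(template, misincorporations=None):
    """Build the strand complementary to `template`.

    `misincorporations` is a hypothetical dict {position: wrong_base} that
    simulates copying errors; the proofreading step detects and fixes them.
    Returns (new_strand, number_of_corrections).
    """
    misincorporations = misincorporations or {}
    new_strand = []
    corrections = 0
    for position, base in enumerate(template):
        expected = COMPLEMENT[base]
        added = misincorporations.get(position, expected)
        if added != expected:   # proofreading: the new base does not pair
            added = expected    # remove it and insert the correct base
            corrections += 1
        new_strand.append(added)
    return "".join(new_strand), corrections


strand, fixed = synthesize("ATGCCGTA", misincorporations={2: "A"})
print(strand, fixed)
```

If the proofreading branch were removed, the wrong base would stay in the new strand, which is the kind of mutation build-up that a weakened proofreading function would be expected to cause.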

POLG is critical to the process of making new mitochondria and, in turn, to human health. Highlighting this point are the nearly 300 different variants in this gene that are known to cause disease. Many of these variants alter POLG’s protein structure in a way that reduces its ability to either interact with the DNA or proofread the new duplicate DNA strand, which can result in an increased mutation rate within the mitochondrial DNA and reduced mitochondrial biogenesis. On a physiological scale, this affects organs and tissues that rely on mitochondrial energy production, such as the nervous system, eyes, muscles, kidneys, pancreas, and male reproductive organs. The penetrance and expressivity of these symptoms are wide-ranging and depend on many factors that aren’t entirely understood, but ongoing research aims to characterize how changes in our DNA, including the mitochondrial DNA, can lead to disease development [1,2].

The mitochondrion is no longer a bacterium, but it has evolved to be an integral part of human physiology. Thanks to genes like POLG, mitochondria are able to hold onto their DNA and a small portion of their independence.


1. Jornayvaz, François R., and Gerald I. Shulman. “Regulation of Mitochondrial Biogenesis.” Essays in Biochemistry 47 (2010): 10.1042/bse0470069. PMC. Web. 7 May 2018.
2. Young, Matthew J., and William C. Copeland. “Human Mitochondrial DNA Replication Machinery and Disease.” Current Opinion in Genetics & Development 38 (2016): 52–62. PMC. Web. 8 May 2018.

Characterizing the Admixed African Ancestry of African Americans

This article is relevant today as African Americans seek out their ancestry and their connections with the people of Africa. It calls into question the use of, and reliance on, DNA testing offered by companies in the US.

2009; 10(12): R141.
Published online 2009 Dec 22. doi: 10.1186/gb-2009-10-12-r141
PMCID: PMC2812948
PMID: 20025784


Accurate, high-throughput genotyping allows the fine characterization of genetic ancestry. Here we applied recently developed statistical and computational techniques to the question of African ancestry in African Americans by using data on more than 450,000 single-nucleotide polymorphisms (SNPs) genotyped in 94 Africans of diverse geographic origins included in the HGDP, as well as 136 African Americans and 38 European Americans participating in the Atherosclerotic Disease Vascular Function and Genetic Epidemiology (ADVANCE) study. To focus on African ancestry, we reduced the data to include only those genotypes in each African American determined statistically to be African in origin.


From cluster analysis, we found that all the African Americans are admixed in their African components of ancestry, with the majority contributions being from West and West-Central Africa, and only modest variation in these African-ancestry proportions among individuals. Furthermore, by principal components analysis, we found little evidence of genetic structure within the African component of ancestry in African Americans.


These results are consistent with historic mating patterns among African Americans that are largely uncorrelated to African ancestral origins, and they cast doubt on the general utility of mtDNA or Y-chromosome markers alone to delineate the full African ancestry of African Americans. Our results also indicate that the genetic architecture of African Americans is distinct from that of Africans, and that the greatest source of potential genetic stratification bias in case-control studies of African Americans derives from the proportion of European ancestry.


Numerous studies have estimated the rate of European admixture in African Americans; these studies have documented average admixture rates in the range of 10% to 20%, with some regional variation, but also with substantial variation among individuals []. For example, the largest study of African Americans to date, based on autosomal short tandem repeat (STR) markers, found an average of 14% European ancestry with a standard deviation of approximately 10%, and a range of near 0 to 65% [], whereas another study based on ancestry informative markers (AIMs) found an average of 17.7% European ancestry with a standard deviation of 15.0% []. By using nine AIMs, Parra and colleagues [] found substantial variation of European ancestry proportions in African-American populations across the United States, ranging from just over 10% in a Philadelphia group to more than 20% in a New Orleans population. Similar levels (11% to 15%) of European ancestry also were reported by Tishkoff and co-workers [], based on more than 1,000 nuclear microsatellite and insertion/deletion markers.

Although much attention has been paid in the genetics literature to the continental admixture underlying the genetic makeup of African Americans, less attention has been paid to the within-continental contribution to African Americans, in particular from the continent of Africa. Studies have focused primarily on the matrilineally inherited mitochondrial DNA (mtDNA) and patrilineally inherited Y chromosome []. These two DNA sources have gained wide prominence owing, in part, to their use by ancestry-testing companies to identify the regional and ethnic origins of their subscribers. Yet these two sources provide a very narrow perspective in delineating only two of possibly thousands of ancestral lineages in an individual.

The majority of African Americans derive their African ancestry from the approximately 500,000 to 650,000 Africans that were forcibly brought to British North America as slaves during the Middle Passage [,]. These individuals were deported primarily from various geographic regions of Western Africa, ranging from Senegal to Nigeria to Angola. Thus, it has been estimated that the majority of African Americans derive ancestry from these geographic regions, although more central and eastern locations also have contributed []. Recent studies of African and African-American mtDNA haplotypes and autosomal microsatellite markers also confirmed a broad range of Western Africa as the likely roots of most African Americans [,].

The recent development of high-density single-nucleotide polymorphism (SNP) genotyping assays, used primarily in genome-wide association (GWA) studies, has also provided unprecedented opportunities to address questions related to the evolution and migration patterns of humans. Most of the GWA studies to date have focused on European or European-derived populations of U.S. Caucasians, but a few have included minorities. The latter studies provide unique opportunities to address questions of ancestral origins in admixed populations, such as African Americans and Latinos [].

Although the application of high-density genotyping to a broad range of worldwide indigenous populations has not yet been accomplished, an important first step has been achieved through the recent genotyping of the Human Genome Diversity Panel (HGDP). This effort resulted in nearly 1,000 subjects from 51 populations being genotyped at more than 500,000 polymorphic sites [,]. These data now provide a basis for finer-scale analysis of the ancestral origins of admixed groups, such as African Americans and Latinos, in addition to enabling the accurate characterization of genetic and evolutionary relationships among these populations.

In this study, we characterize the African origins of African Americans by making use of the high-density genotype data generated for 94 HGDP indigenous Africans from differing geographic and linguistic groups, including 21 Mandenka from West Africa, 21 Yoruba from West Central Africa, 15 Bantu speakers from Southwestern and Eastern Africa, 20 Biaka Pygmy and 12 Mbuti Pygmy from Central Africa, and five San from Southern Africa []. These subjects are used to represent the potential African ancestors of 136 African Americans recently genotyped in a GWA study of early-onset coronary artery disease (ADVANCE) []. In addition, we include 38 U.S. Caucasian subjects from ADVANCE to represent the European ancestors of the African Americans.

The use of high-density SNP data for ancestral reconstruction presents some unique statistical and computational challenges. To this end, we previously developed analytic techniques for estimating individual ancestry (IA) from multiple populations (frappe), as well as for the reconstruction of ancestry blocks in admixed individuals (saber) by using data from more than 450,000 SNP genotypes [,]. Here, we provide a unique application of saber to identify the ancestral origins of each of the more than 450,000 genotypes in African-American individuals, to reduce the analysis to those genotypes that are exclusively of African origin. We note that 58 of the ADVANCE African Americans were also participants of the CARDIA study and had previously been analyzed with 42 Ancestry Informative Markers []. We also used principal components analysis (PCA) to define the genetic structure, and in particular the African genetic structure, underlying African Americans. Another recent study used principal components analysis for the African populations of HGDP, but did not relate those results to African Americans []. To our knowledge, the analyses reported here represent the first effort to characterize the African origin of African Americans by isolating the African-derived genome in each African American individual.


African and European ancestry in African Americans

Principal components analysis of more than 450,000 SNPs, including all populations (Africans, African Americans, and U.S. Caucasians), revealed, as expected, a major separation between the African and U.S. Caucasian populations along the first principal component (PC1), whereas the second principal component (PC2) led to the separation of the various African groups (Figure 1). The two Pygmy populations (Biaka, Mbuti) and the San of South Africa are well separated from the other African groups, whereas a greater genetic affinity appears to exist between the Mandenka of West Africa, the Yoruba of Central West Africa, and the Bantu speakers, who derive from Kenya and Southwestern Africa. It is also clear in Figure 1 that the African Americans lie on a direct line between the European Americans and the West Africans, reflecting their varying levels of admixture between these two ancestral groups.

Figure 1. Principal components analysis of Africans, U.S. Caucasians, and African Americans. Inset bar plot displays individual ancestry estimates for African Americans from a supervised structure analysis by using frappe with K = 7, fixing six African and one U.S. Caucasian populations. The color scheme of the bar plot matches that in the PCA plot.
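As a rough illustration of how a principal components projection places an admixed sample between its source populations, the following sketch computes PC1 of a tiny genotype matrix by power iteration. It uses only the Python standard library; the function name, the toy genotypes, and the choice of power iteration are assumptions made for this example and are not the software or data used in the study.

```python
import math


def first_principal_component(genotypes, iterations=200):
    """Return (loadings, scores) for PC1 of a samples-by-SNPs genotype matrix.

    Genotypes are coded 0, 1 or 2 (copies of one allele). Each SNP column is
    mean-centred, then power iteration finds the leading eigenvector of X^T X.
    """
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n_samples for j in range(n_snps)]
    centred = [[row[j] - means[j] for j in range(n_snps)] for row in genotypes]

    # Deterministic, non-constant starting vector.
    v = [1.0 + j for j in range(n_snps)]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centred]  # X v
        v = [sum(centred[i][j] * scores[i] for i in range(n_samples))     # X^T (X v)
             for j in range(n_snps)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    scores = [sum(x * w for x, w in zip(row, v)) for row in centred]
    return v, scores


# Hypothetical toy data: three samples from each of two source populations
# plus one admixed sample that is heterozygous at every SNP.
genotypes = [
    [2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 0, 1],   # source population A
    [0, 0, 2, 2], [0, 1, 2, 2], [0, 0, 2, 1],   # source population B
    [1, 1, 1, 1],                               # admixed sample
]
loadings, scores = first_principal_component(genotypes)
for label, score in zip(["A", "A", "A", "B", "B", "B", "admixed"], scores):
    print(label, round(score, 3))
```

In this toy case the two source groups fall on opposite sides of PC1 and the admixed sample sits between them. The sign of a principal component is arbitrary, so only the relative ordering is meaningful.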

These results were confirmed in the estimation of IA by using the program frappe (also in Figure 1). The amount of European ancestry shows considerable variation, with an average (± SD) of 21.9% ± 12.2%, and a range of 0 to 72% (Table 1). The largest African ancestral contribution comes from the Yoruba, with an average of 47.1% ± 8.7% (range, 18% to 64%), followed by the Bantu at 14.8% ± 5.0% (range, 3% to 28%) and Mandenka at 13.8% ± 4.5% (range, 3% to 29%). The contributions from the other three African groups were quite modest, with an average of 1.7% from the Biaka, 0.5% from the Mbuti, and 0.3% from the San. In the bar plot of frappe estimates, individuals (vertical bars) are arranged in order (left to right) corresponding to their value on the first PC coordinate. Clearly, this order correlates nearly perfectly with a decreasing proportion of European ancestry (Figure S1 in Additional file 1). Thus, the most important source of genetic structure in African Americans is based on the degree of European admixture.
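The idea of estimating an individual's ancestry proportion from genotypes can be sketched with a much simpler stand-in than frappe, which is a maximum-likelihood method. The example below assumes only two source populations with known allele frequencies and uses a closed-form least-squares estimate; the function name, the frequencies, and the dosages are all hypothetical.

```python
def estimate_european_fraction(genotypes, freq_european, freq_african):
    """Least-squares estimate of one individual's European ancestry fraction.

    Under a simple two-source admixture model the expected allele dosage at
    SNP j is 2 * (a * pE_j + (1 - a) * pA_j), where a is the ancestry
    fraction. Minimising the squared error over a gives the closed form
    below. The result is clipped to the interval [0, 1].
    """
    numerator = 0.0
    denominator = 0.0
    for g, p_e, p_a in zip(genotypes, freq_european, freq_african):
        diff = p_e - p_a
        numerator += (g / 2.0 - p_a) * diff
        denominator += diff * diff
    if denominator == 0.0:
        raise ValueError("source populations have identical allele frequencies")
    return min(1.0, max(0.0, numerator / denominator))


# Hypothetical allele frequencies at four SNPs in the two source populations.
freq_e = [0.9, 0.1, 0.8, 0.2]
freq_a = [0.1, 0.9, 0.2, 0.8]

# Expected dosages for an individual with 25% European ancestry. Real
# genotypes are 0, 1 or 2; expected values are used here to keep the
# example deterministic.
dosages = [2 * (0.25 * pe + 0.75 * pa) for pe, pa in zip(freq_e, freq_a)]
print(round(estimate_european_fraction(dosages, freq_e, freq_a), 3))
```

With real, integer-valued genotypes the estimate is noisy at any single SNP, which is one reason dense marker panels are useful for this kind of analysis.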

Table 1. Estimates of European ancestry and proportional African ancestries in African Americans by U.S. region of birth

U.S. region of birth   Number (a)   European ancestry (%)   Total African ancestry (%) (b)
                                                            Mandenka     Yoruba       Bantu        Biaka        Mbuti        San
West                   58 (58)      19.9 ± 8.5              18.9 ± 4.1   64.0 ± 5.3   13.7 ± 4.3   1.1 ± 0.8    0.2 ± 0.2    2.0 ± 0.5
South                  12 (10)      24.0 ± 15.6             22.6 ± 5.7   60.0 ± 9.5   14.2 ± 5.4   1.1 ± 0.7    0.2 ± 0.4    1.9 ± 1.0
Midwest                4 (4)        19.4 ± 10.2             19.4 ± 2.0   64.0 ± 5.5   13.1 ± 5.5   0.9 ± 0.9    0.3 ± 0.3    2.2 ± 0.7
Southwest              2 (2)        17.0 ± 6.5              21.4 ± 0.7   65.1 ± 1.0   10.5 ± 0.3   1.1 ± 0.4    0.1 ± 0.0    1.7 ± 1.0
All                    136 (128)    21.9 ± 12.2             19.2 ± 4.0   63.7 ± 4.9   13.8 ± 3.8   1.0 ± 0.8    0.2 ± 0.3    2.0 ± 0.6

(a) Numbers in parentheses are those used for estimation of African ancestries after removal of eight individuals with high values of European ancestry; birth-location information was missing for 60 individuals.

(b) Based on frappe analysis of African genotypes only (n = 128).

African components of ancestry in African Americans

We estimate that, on average, nearly 80% of the ancestry in our samples of African Americans is of African origin. A careful examination of the African component of ancestry in the African Americans is facilitated by restricting the analysis to those portions of their genomes that are exclusively of African origin. To do so, we used the program saber to infer European- versus African-derived alleles for each individual, and retained for analysis only those loci that had a high probability of harboring two African-derived alleles, while denoting the other genotypes as missing. For these and all subsequent analyses, we included the 128 African Americans whose estimated African ancestry exceeded 55%, based on the initial frappe analysis (see Methods).
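The masking step described above can be sketched as follows. This assumes that a local-ancestry tool has already produced, for each locus, a probability that both alleles are African-derived; the input format, the function names, and the 0.9 cut-off are assumptions made for illustration and are not taken from saber or from the study.

```python
MISSING = None  # placeholder for a masked genotype


def keep_african_genotypes(genotypes, prob_both_african, threshold=0.9):
    """Mask every genotype whose probability of carrying two African-derived
    alleles falls below `threshold`.

    `prob_both_african` is a hypothetical per-locus probability list; the
    0.9 cut-off is an arbitrary illustration, not the value used in the study.
    """
    if len(genotypes) != len(prob_both_african):
        raise ValueError("inputs must have one entry per locus")
    return [g if p >= threshold else MISSING
            for g, p in zip(genotypes, prob_both_african)]


def missing_rate(masked):
    """Fraction of loci that were masked."""
    return sum(g is MISSING for g in masked) / len(masked)


genotypes = [0, 1, 2, 2, 1]
probabilities = [0.99, 0.40, 0.95, 0.10, 0.92]
masked = keep_african_genotypes(genotypes, probabilities)
print(masked, missing_rate(masked))
```

Individuals with more European ancestry would have more loci masked under this scheme, so their African component is estimated from fewer genotypes.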

As a validation of the accuracy of this partitioning procedure, we performed PCA on the combined set of U.S. Caucasians, Africans, and the African Americans with putative non-African-derived genotypes removed (that is, coded as missing). For comparison, we also examined the results of the same analysis, but including all of the genotype data of the African Americans. For these analyses, we included only the three African population groups that, based on the first analysis, contributed significantly to the African Americans (the Mandenka, Yoruba, and Bantu). As shown previously, when all genotypes are included, the African Americans lie intermediate between the Africans and European Americans, at varying distances based on their degree of admixture (Figure 2a). By contrast, when only the putative African-derived genotypes in the African Americans are included, the African Americans now cluster tightly with the Africans (Figure 2b). This result provides confidence that the application of saber has enabled us to partition accurately the genomes of the African Americans with regard to European versus African ancestry.

Figure 2. Principal components analysis of Africans, U.S. Caucasians, and African Americans including (a) all genotypes, and (b) only the genotypes of African origin in the African Americans. Comparison of (a) and (b) demonstrates the effective elimination of the European ancestry component from African Americans by using saber.

We then characterized the African ancestry in African Americans by performing PCA and estimating IA with frappe by using the U.S. Caucasians, Africans, and African Americans, with non-African genotypes removed. To determine whether we could distinguish the African populations from one another, we first ran frappe including all the 94 African individuals (setting K = 6). This unsupervised analysis unambiguously separated the San and Pygmy populations from the West Africans and, to a lesser degree, the three West African populations (Yoruba, Mandenka, and Bantu) from one another. To be confident in the groupings of the West African populations, we performed a series of leave-one-out frappe analyses that included 57 individuals from the three West African populations: in each frappe run, we fixed all individuals within their respective populations except for one, whose ancestry was allowed to be admixed and estimated (see Methods). Results are given in Figure S2 in Additional file 1. The close genetic relationship of these three groups is evidenced by the imperfect ancestry allocation to an individual’s own population. However, in every case, frappe assigns the majority ancestry to an individual’s own population, and in most cases, the large majority. The Bantu appear to have closest ancestry to the Yoruba. This is consistent with the Nigerian origins of the Yoruba and the presumed origins of the Bantu from the southwestern modern boundary of Nigeria and Cameroon [], and the subsequent migration of the Bantu east and south [,].

Figure 3 displays the PCA results of the African Americans and the three closely related African populations (Yoruba, Mandenka, and Bantu). Several features are worth comment. First, despite their genetic similarity, PCA shows clear separation among the Yoruba, Mandenka, and Bantu populations, based on the first two PCs. Second, Figure 3 reveals that the African Americans are placed as a single cluster in the convex hull defined by the three African groups.

Figure 3. Principal components analysis of three West and Central West African populations (Mandenka, Yoruba, and Bantu) and African Americans by using only African-origin genotypes in the African Americans.

Figure 4 presents the results of the frappe analysis of the 128 African Americans, in which the six HGDP African populations and Caucasians from ADVANCE were included in the analysis as fixed groups, and proportional ancestry estimated for the African Americans. Consistent with Figure 1, Figure 4 shows that all African Americans are estimated to have significant ancestry from each of the three West and Central West African groups (Mandenka, Yoruba, and Bantu), with only modest variation among individuals in their ancestral proportions from these three groups. As expected, little to no European ancestry is estimated in this frappe analysis.

Figure 4. Individual ancestry estimates in African Americans by using only their African genotypes, from a supervised structure analysis with frappe, including all six African populations and U.S. Caucasians as fixed (K = 7). Color coding of populations is the same as that in Figure 1.

Table 1 provides the averages and standard deviations of IA derived from the frappe analysis described earlier (Figure 4) for the African components of African ancestry for the 128 African Americans. Overall, we estimate within-Africa contributions of 64%, 19%, and 14% from Yoruba, Mandenka, and Bantu, respectively. The variances for the various African IA components are much smaller than those for the European IA and are roughly similar across groups (SD ranging from 0.038 to 0.049). These observations are consistent with visual inspection of the bar chart in Figure 4, which shows that African Americans generally derive substantial ancestry from all three West and Central West African population groups. We also note from Table 1 that no significant differences exist among African-American subgroups defined by U.S. region of birth, in terms of IA estimates for any African ancestral component, nor are any significant differences in IA found, based on gender (data not shown).

Thus, the PC and frappe analyses of the 128 African Americans based only on their African-derived genotypes suggest a lack of genetic structure within the African component of their ancestry. To assess this question further, we performed an additional PC analysis on only the African Americans, including only the African-derived genotypes for each individual.

Figure 5 shows the PCA restricted to African-derived genotypes within the African Americans. In this case, each PC accounts for a very modest amount of variance, and no clear pattern is evident. The distribution of the proportion of variance explained by each PC revealed a continuous distribution with no outliers (data not shown).

Figure 5. Principal components analysis of African Americans based on African-derived genotypes only. Little evidence for structure exists in the African component of ancestry.

To confirm that this lack of structure was not an artifact of removing genotype data, we performed a similar PC analysis on the original 94 Africans, but randomly deleting genotypes from these subjects at a rate comparable to the genotype removal rate in the African Americans (see Methods). Results are shown in Figure S3a (full genotype data) and Figure S3b (genotype data removed) in Additional file 1. As can be seen, the two figures appear nearly identical, each demonstrating the structure that exists among these African populations. Thus, the deletion of genotypes did little to diminish the display of population structure, and so the lack of structure that we observed within the African Americans (after removing non-African genotypes) is unlikely due to missing genotype data.
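The random-deletion control described above can be sketched in a few lines. The function name, the seed handling, and the 20% removal rate are assumptions chosen for illustration; the study matched the rate to the genotype removal actually observed in the African Americans.

```python
import random


def delete_genotypes_at_random(genotypes, removal_rate, seed=0):
    """Return a copy of a samples-by-SNPs matrix in which each entry is set
    to None independently with probability `removal_rate`."""
    if not 0.0 <= removal_rate <= 1.0:
        raise ValueError("removal_rate must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the control is reproducible
    return [[None if rng.random() < removal_rate else g for g in row]
            for row in genotypes]


# Hypothetical matrix of 50 samples by 200 SNPs, all heterozygous.
full = [[1] * 200 for _ in range(50)]
thinned = delete_genotypes_at_random(full, removal_rate=0.2, seed=1)
observed = sum(g is None for row in thinned for g in row) / (50 * 200)
print(round(observed, 3))
```

Running the same structure analysis on `full` and on `thinned` and comparing the results is the logic of the control: if the structure survives the thinning, missing data alone is unlikely to explain a lack of structure elsewhere.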

Another question relates to the potential impact of missing genotypes on the frappe analysis of the African Americans. Individuals with high levels of European ancestry (who have more genotype data removed) provide less information regarding their African ancestral components, and thus the variance of the African components of IA increases with the amount of European ancestry, but not in a directional way.


As expected, PCA on our entire sample revealed the greatest genetic differentiation between the U.S. Caucasians and the Africans, with the African Americans intermediate between them, reflecting their recent admixture between ancestors from Europe and Africa. Our estimate of European individual admixture (IA) in the African Americans was also roughly consistent with prior studies [], with an average of 21.9%. We found considerable variation among individuals in terms of European IA, and a number of individuals with particularly high European IA values (eight of 136 individuals, or 6%, with values greater than 45%).

Prior studies focusing on mtDNA and Y chromosomes have found a greater African and lesser European representation of mtDNA haplotypes compared with Y chromosome haplotypes in African Americans, suggesting a greater contribution of African matrilineal descent compared with patrilineal descent [,]. For example, Kayser and colleagues [] estimated that 27.5% to 33.6% of Y chromosomes in African Americans are of European origin, compared with 9.0% to 15.4% of mtDNA haplotypes.

One study of nine short tandem repeat (STR) loci compared the Y chromosomes of African Americans with those of various African populations, including West Africans, West Central Africans (Cameroon), South Africans, Mbuti Pygmies, Mali, San, and Ethiopians []. In a multiple dimensional scaling analysis, these authors placed the African Americans in the middle of these African groups, suggesting origins from multiple African populations. However, they also found that they could not differentiate the Y-chromosome distributions of West African and West Central African groups, presumably a major source of ancestry for African Americans.

Another study of mtDNA haplotypes in African Americans and different African populations found that more than 50% of the African-American mtDNAs exactly matched common haplotypes shared among multiple African ethnic groups, whereas 40% matched no sequences in the African database they referenced []. Fewer than 10% of African-American mtDNA haplotypes matched exactly to a single African ethnic group. The haplotypes that did match were more often found in ethnic groups of West African or Central West African than of East or South African origin.

The most extensive examination of mtDNA haplotypes in Africans and African Americans [] used mtDNA data from a large number of African ethnic groups spread around the continent. These authors observed large similarities in mtDNA profiles among ethnic groups from West, Central West, and South West Africa, with a continuous geographic gradient. As observed previously [], these authors also found that many mtDNA haplotypes were widely distributed across Africa, making it impossible to trace African ancestry to a particular region or group, based on mtDNA data alone. These authors also estimated the proportionate ancestry within Africa based on African American mtDNA haplotypes as 60% from West Africa, 9% from Central West Africa, 30% from South West Africa, and minimal ancestry from North, East, Southeast, or South Africa.

These studies all suggest close genetic kinship among various West African, Central West African, and South West African ethnic groups. A prior analysis of genetic structure among the African populations included in the HGDP based on 377 autosomal STR loci was able to define distinct genetic clusters for the Biaka, Mbuti, and San; however, the study lacked the power to differentiate the Mandenka, Yoruba, and Bantu groups []. Similarly, another study examining two ethnic groups from Ghana (Akan and Gaa-Adangbe) and two from Nigeria (Yoruba, Igbo), based on 372 autosomal microsatellite markers in 493 individuals, did not differentiate these groups by genetic cluster analysis and found only modest genetic differences between them []. In contrast, greater resolution of African ethnic groups, particularly for the Mandenka and Yoruba, was possible in our analysis, based on more than 450,000 SNPs. We note that, in a recent study of malaria, PCA distinguished the HapMap YRI individuals from the Mandenka individuals in the Gambian sample on the basis of 100,715 SNPs; however, admixture analysis with a few selected markers did not reveal clear clusters that correspond to self-reported ancestry [].

It is of interest to compare our African admixture estimates to descriptions of proportional representation of various African groups to the Middle Passage and slave trade occurring in post-Columbian America. A highly detailed census based on historic records has been documented by several authors []. Africans were deported from numerous locations along the broad western coast of Africa, ranging from Senegal in the far west all the way down to Angola in the southwest. In addition, a smaller number of slaves were taken from the southeast of Africa. In terms of numbers, the largest group, approximately 50% to 60%, derived from Central and Southern West Africa and the Bight of Biafra; approximately 10% from Western Africa; 25% to 35% from the West Coast in between (Windward Coast, Gold Coast, and Bight of Benin), and the remaining 5% from Southeast Africa []. These estimates show considerable consistency with our results, which also indicated the largest ancestral component of African Americans to be from Central West Africa, followed by West Africa and Southwest Africa. However, because we did not have groups representative of Southeastern and other parts of Southern Africa, we may have underestimated their ancestral representation among African Americans.

It is important to note that considerable migration has occurred among African ethnic groups over the past three millennia or more. For example, the two Bantu groups included in our analysis originated from a more-central African location (Nigeria-Cameroon) several millennia ago, making precise geographic localization of African ancestry difficult []. This difficulty is also reflected in the close genetic relationships among the various West, West Central, and South West African groups, who also show considerable overlap in terms of mtDNA haplotypes.

Our results are based on examination of the entire autosomal genome and, therefore, provide a more-robust picture of the admixed African ancestry of individual African Americans compared with prior analyses, which focused on only a single locus (mtDNA or Y chromosome). We found all African Americans in our sample to be admixed, with representation from various geographic regions of Western Africa. The amount of variation in the African components of ancestry among the African Americans was quite modest, suggesting considerable similarity in African genetic profiles among African Americans. Thus, African ancestry testing based on a single locus, such as the mtDNA or Y chromosome, as is commonly done by ancestry-testing companies, provides only a very limited, and in many cases, misleading picture of an individual’s African ancestry [].

An important limitation in our analysis is the modest number of African subjects and groups represented. However, we were clearly able to exclude certain African ethnic groups as contributing substantially to African Americans, such as the two Pygmy and San groups. Furthermore, the close genetic similarity observed among West, Central West, and Southwest African ethnic groups (such as the Mandenka, Yoruba, and Bantu), found by us and others [], suggests that precise identification of ancestry for African Americans may be difficult, even with the inclusion of additional ethnic groups.

Very recently, the limited range of African groups included in population genetic studies of Africans was addressed in a landmark study of 113 geographically diverse African ethnic groups by Tishkoff and co-workers []. These authors used 848 microsatellite, 476 indel, and four SNP markers to examine genetic structure among these groups, as well as among 98 African Americans from four U.S. recruitment sites. In a genetic cluster analysis, they found only modest differentiation among West Africans, similar to the findings from other studies of a subset of these groups, based on a comparable number of markers. They also estimated proportionate African ancestry among their African Americans in a structured analysis including African ethnic subgroups, allowing the African Americans to be admixed. Comparable to our results, within the African Americans, they also found the majority African ancestry to be West, Central West, and Southwest African, including Bantu and non-Bantu speakers, with somewhat greater representation of the Bantu speakers (about 50% of the African total component) than the Western non-Bantu speakers (for example, Mandenka, about 30% of the African total component). Larger collections of indigenous African populations, such as those described earlier [], when assayed with dense genotyping arrays, as done in this study (to allow finer genetic differentiation), will likely add further clarification of the African ancestral origins of African Americans.

The results of our analysis also strongly point to random mating among African Americans with respect to the African components of their ancestry. This is reflected both by the modest variances we observed in the African IA components, and also by the lack of structure in the PC analysis of African Americans with non-African genotypes removed. This conclusion is consistent with the idea that, for most African Americans, specific African origins are mixed or unknown or both and do not affect social characteristics that influence the choice of mate. It is also consistent with the notion that the African slaves brought to North America were mixed with regard to their geographic and ethnic ancestry and language []. By contrast, considerably greater variation in the proportion of European ancestry was found within the African Americans in our study. This high level of variation in European ancestry may reflect recent admixture or nonrandom mating (for example, as seen in Latino populations []), or both; these questions require additional study.


African Americans typically have African and European genetic ancestry. We sought to characterize the African ancestry of African Americans by using data on more than 450,000 SNPs genotyped in 94 Africans of diverse geographic origins, as well as 136 African Americans and 38 U.S. Caucasians. To focus on African ancestry, we reduced the data to include only those genotypes in each African American that are African in origin. We found that all the African Americans are admixed in the African component of their ancestry, with estimated contributions of 19% West (for example, Mandenka), 63% West Central (for example, Yoruba), and 14% South West Central or Eastern (for example, Bantu speakers), with little variation among individuals. Furthermore, we found little evidence of genetic structure within the African component of ancestry in African Americans, but significant structure related to the proportion of European ancestry. These results are consistent with mating patterns among African Americans that are unrelated to African ancestral origins, cast doubt on the general utility of mtDNA or Y-chromosome markers alone to delineate the full African ancestry of African Americans, and show that the proportion of European ancestry is the leading source of stratification bias in genetic case-control studies of African Americans.

Materials and methods

Selection of populations and individuals

Individuals included in analyses presented here come from two studies. A total of 102 indigenous African individuals and their genotype data were obtained from the Human Genome Diversity Project (HGDP) and comprised five San, 22 Biaka Pygmy, 13 Mbuti Pygmy, 22 Mandenka, 21 Yoruba, 11 Kenyan Bantu, and eight Southwest African Bantu (one Pedi, one Southern Sotho, two Tswana, one Zulu, two Herero, and one Ovambo). In total, eight individuals were removed from analyses for the following reasons: three Kenyan Bantu had significant Middle Eastern ancestry, based on previous analysis []; and three additional Kenyan Bantu and two Mandenka were removed because they were first cousins to other included subjects. This left a total of 94 indigenous Africans for analysis. The 136 self-described African-American individuals studied represent a subset of participants of the Atherosclerotic Disease, Vascular Function, and Genetic Epidemiology (ADVANCE) study [] selected for genotyping in the context of a GWA case-control study of early-onset coronary artery disease (CAD). From the ADVANCE study, we also randomly sampled 38 of 590 US Caucasians to anchor the European component of African-American ancestry. Thus, in total, 268 individuals are included in this study.

All ADVANCE subjects were recruited from the membership of Kaiser Permanente of Northern California. Among the 136 African Americans, 49 (36%) were affected with CAD (with first presentation at younger than 45 years for male subjects and 55 years for female subjects), and 36 (26.4%) were male subjects. Of the 87 controls, frequency matched by age to the cases, 58 represented participants in the Coronary Artery Risk Development in Young Adults (CARDIA) study originally recruited at the Kaiser Oakland field center who attended the study’s Year 15 examination in 2000 to 2001 [,]. For 76 (55.9%) of these African-American individuals, we had information on state of birth, with 58 stating they were born in the West (California), 12 in the South (Alabama, Louisiana, Mississippi, Virginia), four in the Midwest (Indiana, Michigan, Missouri, Ohio), and two in the Southwest (Texas). The description of recruitment of these subjects can be found elsewhere [].

Genotyping and marker selection

Genotype data were derived from two different research projects. The HGDP individuals were genotyped on the Illumina 650 K Beadarray; experimental protocol and SNP quality-control analysis for the HGDP project and genotyping results were described previously [,]. In total, 938 individuals and 642,690 autosomal SNPs passed all quality-control criteria. Genotype data for U.S. African American and Caucasian individuals were obtained from the ADVANCE study, in which genotyping was performed on the Illumina 550 K Beadarray by the same group of investigators, followed by identical quality-control analysis. After removing markers that were absent from either the HGDP dataset or the ADVANCE dataset, the final combined genotype dataset for all analyses in this study consisted of 454,132 autosomal SNPs.

Population structure and ancestry estimation

We performed PCAs according to the algorithm described by []. Genome-wide European admixture proportions in African-American individuals were estimated by using the program frappe, which implements an Expectation-Maximization (EM) algorithm for simultaneously inferring each individual’s ancestry proportion and allele frequencies in the ancestral populations []. In this analysis, ancestry of the African Americans is allowed to have come from any of the K = 7 ancestral populations: San, Biaka Pygmy, Mbuti Pygmy, Mandenka, Yoruba, Bantu, or European. Ancestries of the indigenous African individuals and U.S. Caucasians were assumed to be homogeneous and fixed. However, to determine the robustness of these assignments for the closely related West and Central West African populations, we performed an additional frappe analysis on just these groups (Mandenka, Yoruba, Bantu; n = 57). We fixed all individuals in their respective population groups (Mandenka, Yoruba, or Bantu), except for one, who was allowed to be admixed, and the admixture was estimated. This procedure was repeated 57 times, once for each individual, so that each person’s potential admixture was estimated. In this way, we tested the robustness of the population definitions. If the populations are not distinct, then the individual admixture estimates should appear random; by contrast, if an individual’s ancestry is assigned primarily to his or her population of origin, population distinctiveness can be assumed. Furthermore, this analysis provides a closely matched contrast to the African Americans, whose proportionate individual ancestry is estimated in a similar fashion.
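The one-individual-at-a-time admixture estimate described above can be illustrated with a small EM routine. This is a minimal sketch and not frappe itself: it assumes the ancestral allele frequencies are already known and held fixed (frappe infers them jointly), it treats SNPs as independent, and the names `estimate_admixture`, `genotypes`, and `freqs` are hypothetical.

```python
def estimate_admixture(genotypes, freqs, n_iter=200, eps=1e-6):
    """EM estimate of one individual's ancestry proportions.

    genotypes: per-SNP count (0, 1, or 2) of a reference allele; None = missing.
    freqs:     freqs[k][j] is the reference-allele frequency of ancestral
               population k at SNP j (assumed known and held fixed here).
    Returns a list q with q[k] = estimated proportion from population k.
    """
    n_pops = len(freqs)
    # Clamp frequencies away from 0 and 1 so no allele has zero probability.
    p = [[min(max(f, eps), 1.0 - eps) for f in row] for row in freqs]
    q = [1.0 / n_pops] * n_pops  # start from equal proportions

    for _ in range(n_iter):
        totals = [0.0] * n_pops
        n_alleles = 0
        for j, g in enumerate(genotypes):
            if g is None:
                continue
            # E-step: posterior weight that an allele copy came from
            # population k, for reference and alternate alleles separately.
            w_ref = [q[k] * p[k][j] for k in range(n_pops)]
            w_alt = [q[k] * (1.0 - p[k][j]) for k in range(n_pops)]
            s_ref, s_alt = sum(w_ref), sum(w_alt)
            for k in range(n_pops):
                totals[k] += g * w_ref[k] / s_ref + (2 - g) * w_alt[k] / s_alt
            n_alleles += 2
        if n_alleles == 0:
            return q  # no data: keep the starting proportions
        # M-step: average the posterior weights over all allele copies.
        q = [t / n_alleles for t in totals]
    return q


# Toy example with made-up frequencies for three populations at four SNPs.
toy_freqs = [
    [0.9, 0.8, 0.1, 0.2],
    [0.5, 0.5, 0.5, 0.5],
    [0.1, 0.2, 0.9, 0.8],
]
print(estimate_admixture([2, 2, 0, 0], toy_freqs))
```

In the robustness check, a routine of this kind would be applied to each of the 57 individuals in turn, with the population frequencies computed from the remaining individuals.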

Defining African SNP genotypes

To focus exclusively on the African ancestral component, we removed genotypes containing European-derived alleles from the African-American individuals by using the program saber. This program allowed us to infer European versus African ancestry for each SNP genotype in an individual []. Saber implements a Markov-Hidden Markov Model, which infers locus-specific ancestry based on ancestral allele frequencies at each marker, as well as the ancestral haplotype frequencies between pairs of neighboring markers and assumes a block structure for ancestry along a chromosome. For this analysis, saber required the genome-wide average European ancestry for each admixed individual, which was estimated by using frappe, as described earlier (K = 7). We also supplied the estimated African and European ancestral allele frequencies for all SNPs to saber, which improved the estimation of the ancestral haplotype frequencies. Saber produces a posterior estimate of European ancestry at each SNP, which concentrates near 0, 0.5 and 1, corresponding to 0, 1, or 2 European-derived alleles. Although it is feasible to infer phase and ancestry jointly by using saber, we chose to remove SNP genotypes (as opposed to single alleles) in which at least one allele is European derived. Thus, for a given individual, we were left only with SNP genotypes that were highly likely to be homozygous in African origin. The proportion of genotypes removed for an individual is approximately 1 − α², where α represents the genome-wide estimate of African ancestry for that individual. As a result, the amount of genotype data varied among individuals based on the degree of European versus African ancestry. To allow adequate information about the African component of their genome, we excluded eight individuals with estimated European ancestry of 45% or greater, leaving a total sample of 128 individuals with at least 30% of their genotype data retained. The proportion of genotypes retained ranged from 31% to 99%, with a median of 67% and mean of 66%. In terms of proportion of genotypes retained at individual loci, the mean is the same as stated earlier (66%), with a standard deviation of 0.05. Thus, assuming a normal distribution, 95% of the proportions of genotypes retained across loci lie between 56% and 77%. We note that even after removing genotypes, a large number of marker genotypes are retained for each individual, with a minimum of 143,025.
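The retained fraction can be worked through numerically. If the ancestry of the two alleles at a locus is treated as independent, a genotype is kept only when both alleles are African, which happens with probability α × α. The sketch below is illustrative only: the function names and the `max_posterior` cutoff are assumptions, not saber parameters.

```python
def expected_fraction_retained(african_ancestry):
    """Expected share of genotypes kept when both alleles must be African.

    Assumes the ancestry of the two alleles at a locus is independent, so the
    chance that both are African is alpha * alpha.
    """
    return african_ancestry ** 2


def keep_african_genotypes(genotypes, european_posterior, max_posterior=0.25):
    """Mask genotypes whose posterior European ancestry is not near 0.

    european_posterior[j] is assumed to lie near 0, 0.5, or 1 (0, 1, or 2
    European-derived alleles). max_posterior is a hypothetical cutoff.
    Masked genotypes are returned as None.
    """
    return [g if post <= max_posterior else None
            for g, post in zip(genotypes, european_posterior)]


# An individual with 45% European ancestry has alpha = 0.55.
print(round(expected_fraction_retained(0.55), 4))  # 0.3025
print(keep_african_genotypes([0, 1, 2, 1], [0.02, 0.51, 0.97, 0.0]))
# [0, None, None, 1]
```

The first result matches the exclusion rule in the text: at the 45% European-ancestry cutoff, roughly 30% of genotypes are expected to remain.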

Genetic structure of the African-derived genome

This analysis focused on IA estimation and PCA based on African-origin SNP genotypes. For IA estimation, we used the program frappe with K = 7 (Yoruba, Mandenka, Bantu, Biaka Pygmy, Mbuti Pygmy, San, and U.S. Caucasians as ancestral individuals). U.S. Caucasians were included in the model to ensure that the European ancestral component had been properly removed from all individuals.

In performing PCA of the Africans and African Americans together, our goal was to understand the relationship between African Americans and Africans. We focused on the 57 West and Central West Africans in this analysis (Yoruba, Mandenka, and Bantu) because these were the only African populations contributing to African-American ancestry. In this case, a standard PCA would be influenced by the much larger sample size of African Americans compared with any of the African groups. Because we were interested in the projection of the African component of ancestry of the African Americans onto the African structure, we instead performed the PCA 128 times, each time including a different single African American whose non-African genotypes had been removed.
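The repeated single-individual PCA can be sketched as a loop. The authors ran their PCAs in R; the Python below is only an illustration under stated assumptions: it computes just the first component by power iteration on the sample-by-sample Gram matrix, restricts each run to the SNPs the admixed individual retains, and uses hypothetical names (`pc1_scores`, `project_one_at_a_time`). For real data a standard eigendecomposition routine would be the usual choice.

```python
import math


def pc1_scores(rows, n_iter=200):
    """First principal-component score (up to sign and scale) for each row.

    rows: list of samples, each a list of genotype counts with no missing
    values. Uses power iteration on the Gram matrix of column-centered data.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    x = [[r[j] - means[j] for j in range(m)] for r in rows]
    gram = [[sum(a * b for a, b in zip(x[i], x[k])) for k in range(n)]
            for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # arbitrary deterministic start
    for _ in range(n_iter):
        w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            return [0.0] * n  # no variation among samples
        v = [c / norm for c in w]
    return v


def project_one_at_a_time(reference, targets):
    """Run one PCA per target: all reference samples plus that single target.

    reference: complete genotype rows for the reference (African) samples.
    targets:   genotype rows for admixed samples, with None where a genotype
               was removed. Each PCA uses only SNPs the target retains.
    Returns, per target, the PC1 scores with the target as the last entry.
    """
    results = []
    for target in targets:
        kept = [j for j, g in enumerate(target) if g is not None]
        rows = [[ref[j] for j in kept] for ref in reference]
        rows.append([target[j] for j in kept])
        results.append(pc1_scores(rows))
    return results
```

Because each run contains only one admixed individual, the components are driven by the structure among the reference samples rather than by the larger admixed sample.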

In PCAs involving U.S. Caucasian subjects, the same 38 ADVANCE Caucasians were used. All PCAs were performed by using the statistical package R.

To address the question of whether removal of a varying amount of genotype data among individuals would bias the PC analysis, we performed a genotype-reduction procedure on the 94 indigenous Africans, to mimic the reduction of genotype data among the African Americans. We then performed two PCAs, the first based on complete genotype information, and then another based on the reduced genotype data. Significant differences between the results of these analyses would indicate that some bias occurs simply because of the uneven data reduction; lack of differences would indicate the opposite.
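One way to mimic the uneven data reduction is to mask genotypes at random, giving each indigenous African a retained fraction drawn from those observed among the African Americans. The text does not specify how the reduction was carried out, so this is a sketch under assumptions: SNPs are masked independently (real removals follow ancestry blocks along chromosomes), and `mimic_reduction` and its parameters are hypothetical.

```python
import random


def mimic_reduction(genotypes, retained_fractions, seed=0):
    """Randomly mask genotypes so each row keeps roughly a chosen fraction.

    genotypes:          complete genotype rows (one per indigenous African).
    retained_fractions: fractions observed among the admixed individuals;
                        one is drawn at random for each row.
    Masked genotypes become None. Independent masking per SNP is an
    assumption made for illustration.
    """
    rng = random.Random(seed)
    reduced = []
    for row in genotypes:
        frac = rng.choice(retained_fractions)
        reduced.append([g if rng.random() < frac else None for g in row])
    return reduced
```

Running the same PCA on the complete and the reduced rows, and comparing the resulting structure, corresponds to the bias check described above.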


ADVANCE: Atherosclerotic Disease, Vascular Function, and Genetic Epidemiology; AIM: ancestry informative marker; CAD: coronary artery disease; CARDIA: Coronary Artery Risk Development in Young Adults; EM: expectation-maximization; GWA: genome-wide association; HGDP: Human Genome Diversity Project; IA: individual ancestry; PC: principal component; PCA: principal component analysis; SNP: single nucleotide polymorphism; STR: short tandem repeat.

Authors’ contributions

FZ, HT, and NR conceived of the study, performed the statistical analyses, and drafted the manuscript. AB, DA, and BN contributed to the data analyses. TQ, TLA, JWK, CI, ASG, MAH, and SS are ADVANCE investigators and had the overall responsibility for study design and implementation, including subject recruitment and assessment. RRM, DA, JL, and AS generated high-density SNP genotype data on ADVANCE. All authors contributed to and approved of the manuscript.

Additional files

The following additional files for this article are available online:

Additional file 1 contains three supplementary figures. Figure S1 shows PC1 from PCA of African Americans based on all genotype data versus African IA from frappe analysis. The figure shows near-perfect correlation between PC1 and African IA. Figure S2 shows a frappe analysis of 57 Yoruba, Mandenka, and Bantu speakers, based on estimating admixed ancestry one individual at a time, fixing all others in their defined population. Results show majority assignment to an individual’s own population group. Figure S3a shows a PCA of indigenous Africans (n = 94) based on all genotype data. Figure S3b shows a PCA of indigenous Africans (n = 94) based on variable removal of genotype data. Note that the figure shows nearly identical genetic structure to that in Figure 3a, including the separation of Yoruba, Mandenka, and Bantu.



We thank Dr. Sandra Beleza for helpful comments on the manuscript. This research was supported by the National Institutes of Health, including NIGMS grant GM073059 (to HT), and NHLBI grant HL087647 (to TQ). FZ was supported by a Stanford Graduate Fellowship. HT is supported by a Sloan Foundation Research Fellowship. The ADVANCE investigators thank the study participants and the staff who contributed to the ADVANCE study.


  • Tang H, Jorgenson E, Gadde M, Kardia SL, Rao DC, Zhu X, Schork NJ, Hanis CL, Risch N. Racial admixture and its impact on BMI and blood pressure in African and Mexican Americans. Hum Genet. 2006;119:624–633. doi: 10.1007/s00439-006-0175-4.
  • Fernandez JR, Shriver MD, Beasley TM, Rafla-Demetrious N, Parra E, Albu J, Nicklas B, Ryan AS, McKeigue PM, Hoggart CL, Weinsier RL, Allison DB. Association of African genetic admixture with resting metabolic rate and obesity among women. Obes Res. 2003;11:904–911. doi: 10.1038/oby.2003.124.
  • Parra EJ, Marcini A, Akey J, Martinson J, Batzer MA, Cooper R, Forrester T, Allison DB, Deka R, Ferrell RE, Shriver MD. Estimating African American admixture proportions by use of population-specific alleles. Am J Hum Genet. 1998;63:1839–1851. doi: 10.1086/302148.
  • Tishkoff SA, Reed FA, Friedlaender FR, Ehret C, Ranciaro A, Froment A, Hirbo JB, Awomoyi AA, Bodo JM, Doumbo O, Ibrahim M, Juma AT, Kotze MJ, Lema G, Moore JH, Mortensen H, Nyambo TB, Omar SA, Powell K, Pretorius GS, Smith MW, Thera MA, Wambebe C, Weber JL, Williams SM. The genetic structure and history of Africans and African Americans. Science. 2009;324:1035–1044. doi: 10.1126/science.1172257.
  • Coelho M, Sequeira F, Luiselli D, Beleza S, Rocha J. On the edge of Bantu expansions: mtDNA, Y chromosome and lactase persistence genetic variation in southwestern Angola. BMC Evol Biol. 2009;9:80. doi: 10.1186/1471-2148-9-80.
  • Kayser M, Brauer S, Schadlich H, Prinz M, Batzer MA, Zimmerman PA, Boatin BA, Stoneking M. Y chromosome STR haplotypes and the genetic structure of U.S. populations of African, European, and Hispanic ancestry. Genome Res. 2003;13:624–634. doi: 10.1101/gr.463003.
  • Lind JM, Hutcheson-Dilks HB, Williams SM, Moore JH, Essex M, Ruiz-Pesini E, Wallace DC, Tishkoff SA, O’Brien SJ, Smith MW. Elevated male European and female African contributions to the genomes of African American individuals. Hum Genet. 2007;120:713–722. doi: 10.1007/s00439-006-0261-7.
  • Segal R. The Black Diaspora: Five Centuries of Black Experience Outside Africa. New York: Farrar, Straus and Giroux; 1995.
  • Thomas H. The Slave Trade: The Story of the Atlantic Slave Trade: 1440-1870. Simon & Schuster; 1999.
  • Curtin PD. The Atlantic Slave Trade. Madison, Wisconsin: University of Wisconsin Press; 1969.
  • Lovejoy PE. Transformations in Slavery: A History of Slavery in Africa. 2nd ed. Cambridge: Cambridge University Press; 2000.
  • Eltis D. The volume and structure of the transatlantic slave trade: a reassessment. William Mary Q. 2001;58:17–46.
  • Salas A, Carracedo A, Richards M, Macaulay V. Charting the ancestry of African Americans. Am J Hum Genet. 2005;77:676–680. doi: 10.1086/491675.
  • Tang H, Choudhry S, Mei R, Morgan M, Rodriguez-Cintron W, Burchard EG, Risch NJ. Recent genetic selection in the ancestral admixture of Puerto Ricans. Am J Hum Genet. 2007;81:626–633. doi: 10.1086/520769.
  • Nalls MA, Wilson JG, Patterson NJ, Tandon A, Zmuda JM, Huntsman S, Garcia M, Hu D, Li R, Beamer BA, Patel KV, Akylbekova EL, Files JC, Hardy CL, Buxbaum SG, Taylor HA, Reich D, Harris TB, Ziv E. Admixture mapping of white cell count: genetic locus responsible for lower white blood cell count in the Health ABC and Jackson Heart studies. Am J Hum Genet. 2008;82:81–87. doi: 10.1016/j.ajhg.2007.09.003.
  • Jimenez-Sanchez G, Silva-Zolezzi I, Hidalgo A, March S. Genomic medicine in Mexico: initial steps and the road ahead. Genome Res. 2008;18:1191–1198. doi: 10.1101/gr.065359.107.
  • Jakobsson M, Scholz SW, Scheet P, Gibbs JR, VanLiere JM, Fung HC, Szpiech ZA, Degnan JH, Wang K, Guerreiro R, Bras JM, Schymick JC, Hernandez DB, Traynor BJ, Simon-Sanchez J, Matarin M, Britton A, van de Leemput J, Rafferty I, Bucan M, Cann HM, Hardy JA, Rosenberg NA, Singleton AB. Genotype, haplotype and copy-number variation in worldwide human populations. Nature. 2008;451:998–1003. doi: 10.1038/nature06742.
  • Li JZ, Absher DM, Tang H, Southwick AM, Casto AM, Ramachandran S, Cann HM, Barsh GS, Feldman M, Cavalli-Sforza LL, Myers RM. Worldwide human relationships inferred from genome-wide patterns of variation. Science. 2008;319:1100–1104.
  • Assimes TL, Knowles JW, Basu A, Iribarren C, Southwick A, Tang H, Absher D, Li J, Fair JM, Rubin GD, Sidney S, Fortmann SP, Go AS, Hlatky MA, Myers RM, Risch N, Quertermous T. Susceptibility locus for clinical and subclinical coronary artery disease at chromosome 9p21 in the multi-ethnic ADVANCE study. Hum Mol Genet. 2008;17:2320–2328. doi: 10.1093/hmg/ddn132.
  • Tang H, Coram M, Wang P, Zhu X, Risch N. Reconstructing genetic ancestry blocks in admixed individuals. Am J Hum Genet. 2006;79:1–12. doi: 10.1086/504302.
  • Tang H, Peng J, Wang P, Risch NJ. Estimation of individual admixture: analytical and study design considerations. Genet Epidemiol. 2005;28:289–301. doi: 10.1002/gepi.20064.
  • Reiner AP, Carlson CS, Ziv E, Iribarren C, Jaquish CE, Nickerson DA. Genetic ancestry, population sub-structure, and cardiovascular disease-related traits among African-American participants in the CARDIA study. Hum Genet. 2007;121:565–575. doi: 10.1007/s00439-007-0350-2.
  • Reich D, Price AL, Patterson N. Principal component analysis of genetic data. Nat Genet. 2008;40:491–492. doi: 10.1038/ng0508-491.
  • Ehret C, Posnansky M, eds. The Archaeological and Linguistic Reconstruction of African History. Berkeley: University of California Press; 1982.
  • Salas A, Richards M, De la Fe T, Lareu MV, Sobrino B, Sanchez-Diz P, Macaulay V, Carracedo A. The making of the African mtDNA landscape. Am J Hum Genet. 2002;71:1082–1111. doi: 10.1086/344348.
  • Ely B, Wilson JL, Jackson F, Jackson BA. African-American mitochondrial DNAs often match mtDNAs found in multiple African ethnic groups. BMC Biol. 2006;4:34. doi: 10.1186/1741-7007-4-34.
  • Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, Zhivotovsky LA, Feldman MW. Genetic structure of human populations. Science. 2002;298:2381–2385. doi: 10.1126/science.1078311.
  • Adeyemo AA, Chen G, Chen Y, Rotimi C. Genetic structure in four West African population groups. BMC Genet. 2005;6:38. doi: 10.1186/1471-2156-6-38.
  • Jallow M, Teo YY, Small KS, Rockett KA, Deloukas P, Clark TG, Kivinen K, Bojang KA, Conway DJ, Pinder M, Sirugo G, Sisay-Joof F, Usen S, Auburn S, Bumpstead SJ, Campino S, Coffey A, Dunham A, Fry AE, Green A, Gwilliam R, Hunt SE, Inouye M, Jeffreys AE, Mendy A, Palotie A, Potter S, Ragoussis J, Rogers J, Rowlands K. Genome-wide and fine-resolution association analysis of malaria in West Africa. Nat Genet. In press.
  • Vansina J. New linguistic evidence and “the Bantu expansion”. J African Hist. 1995;36:173–195. doi: 10.1017/S0021853700034101.
  • Shriver MD, Kittles RA. Genetic ancestry and the search for personalized genetic histories. Nat Rev Genet. 2004;5:611–618. doi: 10.1038/nrg1405.
  • Morgan PD. The cultural implications of the Atlantic slave trade: African regional origins, American destinations and new world developments. In: Eltis D, Richardson D, editors. Routes to Slavery: Direction, Ethnicity and Mortality in the Atlantic Slave Trade. London: Routledge; 1997. pp. 122–145.
  • Risch N, Choudhry S, Via M, Basu A, Sebro R, Eng C, Beckman K, Thyne S, Chapela R, Rodriguez-Santana JR, Rodriguez-Cintron W, Avila PC, Ziv E, Burchard EG. Ancestry-related assortative mating in Latino populations. Genome Biol. 2009;10:R132. doi: 10.1186/gb-2009-10-11-r132.
  • Lee CD, Jacobs DR Jr, Hankinson A, Iribarren C, Sidney S. Cardiorespiratory fitness and coronary artery calcification in young adults: the CARDIA study. Atherosclerosis. 2008;203:263–268. doi: 10.1016/j.atherosclerosis.2008.06.012.
  • Iribarren C, Go AS, Husson G, Sidney S, Fair JM, Quertermous T, Hlatky MA, Fortmann SP. Metabolic syndrome and early-onset coronary artery disease: is the whole greater than its parts? J Am Coll Cardiol. 2006;48:1800–1807. doi: 10.1016/j.jacc.2006.03.070.
  • Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38:904–909. doi: 10.1038/ng1847.

Articles from Genome Biology are provided here courtesy of BioMed Central



The Ancient Origins of New Zealanders

Biological anthropologist Professor Lisa Matisoo-Smith is researching the genetic make-up of Kiwis.

Aotearoa was the final destination of a very long journey that began in Africa over 65,000 years ago.  Whether you’re a red-headed country music singer in Gore or a Filipino dairy worker in Dannevirke, your ancestral homeland is Africa.

When a small band of modern humans filtered out of Africa into Europe and Asia, they encountered other human types who had arrived there hundreds of thousands of years before. Our new breed of taller, seemingly more savvy and better equipped men and women co-existed with Neanderthals for at least 10,000 years before the Neanderthals died out, whether through force or happenstance.

Our common ancestor was Homo erectus.  We were not yet so different from Neanderthals that we couldn’t interbreed.  The encounters were rare and rarely productive but nevertheless, everyone today who is NOT of pure African descent carries a small percentage of Neanderthal DNA, about 2 percent – slightly more in Asian populations who seem to have had additional, later encounters. Those Neanderthal jokes about our colleagues and former boyfriends have rebounded on us.

Skeleton of the Neanderthal boy recovered from the El Sidron cave, Spain.


This genetic legacy has given us some good and bad traits, such as stronger hair and skin, a predisposition to type 2 diabetes and Crohn’s disease, and increased risk of nicotine addiction. Apparently, Neanderthals shared our on/off faculty for appreciating the defining note of pinot noir and violets, a compound called beta-ionone. A single nucleotide difference (a basic component of DNA) distinguishes the active and inactive version of the gene.

The first scientist to think of using differences in our DNA to trace our origins and relatedness grew up on a farm in Pukekohe.

Martin de Ruyter

Professor Lisa Matisoo-Smith hands out DNA test kits to 50 people in Nelson after introducing the audience to the Allan Wilson Centre project The Longest Journey from Africa to Aotearoa.

The late, great New Zealand scientist, Allan Wilson, who should be a household name here, spent his adult life in America, based at the University of California, Berkeley. He died in 1991 from leukaemia, aged 56. Wilson deduced that chimpanzees and the first human species diverged from a common ancestor only 5-7 million years ago, not about 30 million as previously thought – a bit too close for comfort for some.

It caused a bitter controversy at the time, and not just among evolution deniers. Scientists are human too, and not always objectively ‘sapiens’. Reputations become nailed to old masts.

Wilson led a group of evolutionary biologists who realised that we could reconstruct human history by studying markers in our mitochondrial DNA (mtDNA), which is inherited lock, stock and barrel from mother, and not mixed up with father’s DNA when sperm meets egg.  Every so often, a spelling mistake, known as a mutation, is made when the DNA is being copied. Once a mutation occurs, it is then passed on to all future generations.

These mtDNA mutations rarely have any effect on the person.  Wilson and his team realised that if they looked at mtDNA from people around the world, they could compare the DNA and draw a family tree, identifying when and where these mutations occurred. The different mtDNA lineages could be used to trace the movement of populations across the globe.

They calculated that all humans alive today trace their origin back to one woman – so-called Mitochondrial Eve – who lived in Africa a mere 150,000 years ago.  This doesn’t mean that she was the only woman on Earth at the time, but that all other lines have since become dead ends, literally.

The different branches of the mitochondrial family tree are labelled by letters, with each branch defined by a particular mutation or combination of mutations.

The oldest lineages are the L branches, which are found only in African populations. About 65,000 years ago, a small group of humans carrying the L3 lineage left Africa, probably through what is now Egypt. This group soon split and the mutations occurred that define the two main non-African lineages, the M and N branches. Women carrying the N lineages gave rise to all European lineages, with the most common branches found in Western Europeans today being H, U, J, T, K, V, and X. These seven Western European maternal ancestors inspired the book The Seven Daughters of Eve by Bryan Sykes.  He named these clan mothers Helena, Ursula, Jasmine, Tara, Katrine, Velda and Xenia.

While Helena, Ursula, Jasmine and the girls went north, some of our ancestors headed east and moved very quickly through southern Asia, towards the Pacific. They could walk through what is now Island Southeast Asia when ice ages locked up massive volumes of water and sea levels fell.  Recent research suggests that they arrived in Australia and New Guinea, which were joined in a super-continent called Sahul, as early as 60-65,000 years ago.  Aboriginal Australians and Papuans have been geographically and genetically isolated for a very long time.

It was a one-way journey for them. These people carried mtDNA lineages belonging to the M branch, as well as some N lineages.

On those early forays into Asia, it seems we also interbred with another group of long-separate Homo erectus descendants called Denisovans, after the cave in Siberia where the relics of these people were miraculously discovered – part of the finger-bone of a small girl and a few teeth – amidst tonnes of rock and dirt.  These treasured remains were so well preserved that scientists were able to sequence the entire genome (the complete set of an organism’s DNA).  Those first modern humans who travelled through Asia clearly ran into Denisovans on the way. Their descendants today, including Aboriginal Australians and many Pacific people, carry up to 5 per cent Denisovan DNA.  Interestingly, this inheritance confers an ability to thrive at high altitudes and is present in the Sherpa people.

Allan Wilson’s work has inspired a generation of evolutionary biologists, including a group of outstanding researchers at the University of Otago. Leader of the allanwilson@otago research group is Professor Lisa Matisoo-Smith, a biological anthropologist who also uses DNA as her archaeological pick-axe. She is fine-tuning what we know about the populations of the Pacific, and Aotearoa in particular. She recently randomly sampled the DNA of over 2000 New Zealanders to analyse our ancient maternal and paternal lines.

Lisa is currently writing up the results and the stories of some of her New Zealand subjects in a book she plans to publish in 2019, when we will be commemorating the first Maori and European landings here.  But she can tell you the punch line now. We are as diverse a population as you’ll find anywhere. Kiwis carry all of the major mitochondrial DNA diversity seen in the world – lineages A to Z.

The history of human evolution and migration is one of the fastest moving areas of science. New findings, such as fossils of the diminutive Homo floresiensis (the hobbit people), are coming thick and fast and adding intriguing sub-plots to the main storyline.

We have an insatiable desire to know about our past.  Genealogy is big business. But while DNA is hard evidence of our origins, relatedness, and some of the routes taken by our ancestors, it is only part of the story and actually reveals very little about who we are. New Zealanders are not defined by their DNA or bound in spirit by genetic similarity.

What we do share in common are the long journeys we and our forebears risked to come here, whether by waka, sailing ship or 777, to escape depression and social immobility in Britain, Pol Pot’s genocide, wars in Europe and the Middle East, or in search of adventure and a better life.

Our ancestors, all six thousand generations since Mitochondrial Eve, were survivors and we are their testament.

Next week:  Who were the first New Zealanders?  How many were there, and where did they come from?

Information and research provided by Professor Lisa Matisoo-Smith FRSNZ, University of Otago


YFull Tree DNA SNP Search for your Haplogroup

YFull Tree – Y-SNP Search for Your Haplogroup

The various DNA testing companies often use different versions of the Y-chromosome tree. Even though you have tested onto the same branch at multiple companies, that branch may be named differently at each. This can make it hard to Google for resources about your haplogroup.

YFull maintains one of the three versions of the paternal (Y-chromosome) tree of humankind. The names used for haplogroups (tree branches) on their tree are usually in common use in the genetic genealogy community. Therefore, when looking for resources for your haplogroup, it is useful to be able to change to the haplogroup name used by YFull. This tutorial shows how to find your current haplogroup on the YFull tree.


Binary Polymorphism – A genetic change with two possible states. That is positive or negative — derived or ancestral. Most binary polymorphisms on the 2017 tree are Y-SNPs. For simplicity, I usually refer to all types of binary polymorphisms as variants.
Haplogroup – A branch on the Y-chromosome Tree defined by one or more binary polymorphisms.
Y-chromosome – The human male sex chromosome. It is passed from a father to his sons each generation with only small random changes.
Y-DNA – The DNA contained on the Y-chromosome.
Y-SNP – This is a genetic change of exactly one base pair to another value, for example A changes to C. This is a type of binary polymorphism.
YFull – A 3rd party site for Y-DNA results.

How To

Before you start, you should have your haplogroup from one of the Y-DNA testing companies.

Step 1

Go to the YFull tree page, https://yfull.com/tree/.

Step 2

On the top right of the page, click on the Search button.

Step 3

Put the Y-SNP from your haplogroup in the SNP name field. Then click the Search button. In the example, I am searching for the I-P109 haplogroup. The name of the Y-SNP is the information to the right of the dash, so in this case it is P109.

Step 4

In the search results, look for the name of the haplogroup in green on the right. That is the name for your haplogroup on the YFull tree. In the example, the YFull tree haplogroup is I-P109.

Step 5

Click on the haplogroup name to open the YFull tree to it.
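
The rule from Step 3 (the Y-SNP is the part of the haplogroup name to the right of the dash) can be sketched in a few lines of Python. This is only an illustration for preparing the search term by hand. The function name snp_from_haplogroup is made up for this example and is not part of any YFull tool.

```python
def snp_from_haplogroup(haplogroup: str) -> str:
    """Return the Y-SNP part of a haplogroup label such as 'I-P109'.

    Hypothetical helper for illustration only; it just applies the
    "everything to the right of the dash" rule from Step 3.
    """
    # partition() splits on the first dash and keeps the rest intact.
    _prefix, dash, snp = haplogroup.strip().partition("-")
    if not dash or not snp:
        raise ValueError(f"no Y-SNP found in {haplogroup!r}")
    return snp


print(snp_from_haplogroup("I-P109"))  # P109
```

The value it returns is what goes into the SNP name field on the YFull search page.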

Y-DNA – Applied Genealogy & Paternal Origins

Centimorgans in Genetic Genealogy

Reprinted from the International Society of Genetic Genealogy, August 2, 2017. No changes were made to this article, which reflects the ISOGG position.


In genetic genealogy, a centiMorgan (cM) or map unit (m.u.) is a unit of recombinant frequency which is used to measure genetic distance. It is often used to imply distance along a chromosome, and takes into account how often recombination occurs in a region. A region with few cMs undergoes relatively less recombination. The number of base pairs to which it corresponds varies widely across the genome (different regions of a chromosome have different propensities towards crossover). One centiMorgan corresponds to about 1 million base pairs in humans on average. The centiMorgan is equal to a 1% chance that a marker at one genetic locus on a chromosome will be separated from a marker at a second locus due to crossing over in a single generation.

The genetic genealogy testing companies 23andMe, AncestryDNA, Family Tree DNA and MyHeritage DNA use centiMorgans to denote the size of matching DNA segments in autosomal DNA tests. Segments which share a large number of centiMorgans in common are more likely to be of significance and to indicate a common ancestor within a genealogical timeframe.

The centiMorgan was named in honor of geneticist Thomas Hunt Morgan by his student Alfred Henry Sturtevant. Note that the parent unit of the centiMorgan, the Morgan, is rarely used today.

23andMe and Family Tree DNA both use HapMap to infer their centiMorgans.

centiMorgans vs megabases

CentiMorgans are interpolated numbers that take into consideration each area of a chromosome and its propensity to recombine. This means if two cousins share 40 cM on chromosome 1, and two different cousins share 40 cM on chromosome 5, they both can be predicted to share a certain degree of relationship statistically. Megabases vary slightly in different locations so that in the same scenario, if both sets shared 40 Mb pairs, it would be more difficult to ensure they are of a similar degree of relation without further accounting for location, chromosome and other factors.[1]

Ann Turner provides a useful explanation: “I think of the cM as being a unit of ‘effective’ distance. As an analogy, a mile is a fixed quantity (5280 feet), and so are megabases. But the probability that a person can walk a mile in 20 minutes is more fluid. If the terrain is very rough, the “effective” distance of a literal mile might be more like two miles if you’re trying to arrive at a certain time. We’re more interested in the probability that a segment will be passed on intact than the size of the segment in Mb”.[2]

As the cM is an empirical measure, based on recombination events in a particular dataset of parents and offspring, it can vary somewhat from study to study. This set of maps for each chromosome shows that the general shape of the centiMorgan vs megabase curve is similar for two datasets, but the absolute values are not quite the same:


cm values per chromosome

The following table compares cM values per chromosome at Family Tree DNA, GEDmatch, and 23andMe. AncestryDNA uses 3475 as the total cM according to the help screen for confidence level in a DNA match. This presumably excludes the X chromosome.

CM chromosome FTDNA&GEDMatch&23andMe.jpg

Probability of crossover

The following chart shows the estimated probability that a segment will be affected by a crossover. The chart does not take into account some variables such as inversions and different recombination rates for males and females.

Crossover probability centiMorgans.png

Converting centiMorgans into percentages

In order to get an approximate percentage of shared DNA from a Family Tree DNA Family Finder test, take all of the segments above 5 cM, add them together and then divide by 68.

The way the calculation works is that your total genome in cMs with the Family Finder test is 6770 cM. A half-identical match (such as a parent/child) is 3385 cM. This number has to be doubled to represent both the maternal and paternal sides giving a total of 6770 cM. Matt Dexter explains: “The reason the number is not 6770 or 6800, but rather 68, is that it saves an additional step doing the math to convert an answer to percent. For example, 3385 / 6770 = .5 then as a second step, .5 times 100 = 50%. Using 68 to start with saves the added math step. So (3385 / 6800) * 100 is the same thing as 3385 / 68, which results in = 50%.”[3]
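
The divide-by-68 shortcut above can be written as a small Python sketch. It assumes the segment sizes have already been copied out of the Family Finder results as plain numbers. The function name percent_shared is invented for this example, and treating "above 5 cM" as strictly greater than 5 is an assumption.

```python
# Constants taken from the article: 6800 cM / 100 = 68, and a 5 cM cutoff.
TOTAL_CM_DIVISOR = 68
MIN_SEGMENT_CM = 5


def percent_shared(segments_cm):
    """Approximate percentage of shared DNA from segment sizes in cM.

    Hypothetical helper: sums the segments above the cutoff and divides
    by 68, as described in the text. "Above 5 cM" is read as strictly
    greater than 5, which is an assumption.
    """
    counted = [cm for cm in segments_cm if cm > MIN_SEGMENT_CM]
    return sum(counted) / TOTAL_CM_DIVISOR


# A parent/child match of 3385 cM comes out just under 50%.
print(round(percent_shared([3385]), 1))  # 49.8
```

The result is slightly under 50% because the shortcut rounds 6770 up to 6800, so treat the output as an approximation.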

Human reference genome

The centiMorgan totals per chromosome are based on the Human Reference Genome. 23andMe and Ancestry DNA use Build 37. Family Tree DNA use Build 37 for matching but Build 36 for segment boundaries in the Chromosome Browser. Raw data files are provided in both formats. Build 37 filled in quite a few gaps, and the number of base pairs in each of the chromosomes was longer in Build 37 as compared to Build 36. Consequently the cM totals per chromosome are lower for Family Finder than they are for 23andMe. GEDmatch use Build 36, and convert AncestryDNA and 23andMe data from Build 37 to Build 36 for backward compatibility.

The latest version of the Human Reference Genome, Build 38, was released in December 2013. However, none of the companies have as yet adopted Build 38 and there is a “gentleman’s agreement” in place to stick with Build 37 for the present time.



How 23andMe identifies your DNA Relatives

Scientific Details

Learn about your DNA Relatives, the diverse group of 23andMe customers who have DNA in common with you. Only individuals who have chosen to participate in DNA Relatives are the subject of this report.

How 23andMe identifies your DNA Relatives

To identify your DNA Relatives, we use an algorithm that finds segments of your DNA that are identical to DNA segments of other 23andMe customers. When these segments are sufficiently long, we infer that they were inherited from a recent common ancestor. These segments are known as “identical by descent,” or IBD. Our algorithm searches for these matches across virtually your entire genome, so we can identify DNA Relatives on any branch of your family tree.

Note: IBD/Half IBD:

The comparison results in this feature display shared segments of DNA on separate lines representing each chromosome pair, and label the shared segments as Half IBD, where IBD stands for identical by descent. Because you inherit one half of your DNA from your mother and the other half from your father, IBD segments typically occur on only a single chromosome. Half IBD refers to the amount of the genome in centiMorgans (cM) that contains an IBD segment on either chromosome. The percent DNA shared in DNA Relatives is based on this number.

Your half IBD and shared segments vary based on the closeness of your relationship with the matches with whom you are comparing. Closer relatives will share thousands of cM and many segments in common; more distant relatives may share only one. For some of your shares, if you connected outside of DNA Relatives, you may not share any segments at all.

Every time DNA is passed from one generation to the next, the two chromosomes in each pair are randomly shuffled with each other in a process called recombination. Then, only half of this new DNA — one set of chromosomes — is passed down to each child. The total amount of DNA passed down from an ancestor is cut approximately in half each generation. Through this process, long inherited segments are broken up generation by generation into multiple shorter ones and sometimes lost altogether.
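
The halving described above is easy to tabulate. The following Python sketch gives only the expected (average) share from one ancestor a given number of generations back. Actual amounts vary because recombination is random, and the function name expected_fraction is made up for this illustration; it is not part of 23andMe's algorithm.

```python
def expected_fraction(generations_back: int) -> float:
    """Expected share of your DNA from one ancestor N generations back.

    Illustrative only: applies the "cut approximately in half each
    generation" rule, so a parent is 1 generation back, a grandparent 2.
    """
    if generations_back < 0:
        raise ValueError("generations_back must be zero or positive")
    return 0.5 ** generations_back


for generations in (1, 2, 3):
    print(generations, f"{expected_fraction(generations):.1%}")
# 1 50.0%
# 2 25.0%
# 3 12.5%
```

Because the expected share shrinks this quickly, segments from distant ancestors are often short or missing altogether, which matches the description above.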

Despite all of this generational shuffling, DNA Relatives is highly sensitive and can pick up matches ranging from siblings and uncles to distant eighth cousins — individuals that share great-great-great-great-great-great-great grandparents with you. It may not always be obvious how you share a connection with someone, but that’s where our DNA Relatives tool comes in. Visit the tool to find out more about your matches and get in touch to learn about your family history.


Shared segments between cousins

Inheritance family tree graphic.

A closer look at the matching segment

An example graphic showing a matching segment between you and your cousin.

Sources of information used in this report

The Your DNA Family report provides aggregated summaries of several attributes of your DNA Relatives. The following information sources are used in the report:

Report Section – Information source
Close to distant DNA Relatives – Computed IBD results from the DNA Relatives tool.
Locations of your DNA Relatives – Answers to survey questions by your DNA Relatives.
Ancestries of your DNA Relatives – Computed results from the Ancestry Composition report.
Traits and behaviors in your 23andMe DNA Family – Answers to survey questions by your DNA Relatives.

Traits in Your DNA Family

Mitochondrial Eve (mtDNA)

Mitochondrial Eve

From Wikipedia, the free encyclopedia
Haplogroup: Modern humans
Possible time of origin: c. 100–230 kya[1]
Possible place of origin: East Africa
Ancestor: n/a
Descendants: Mitochondrial macro-haplogroups L0, L1, and L5
Defining mutations: None

In human genetics, the Mitochondrial Eve (also mt-Eve, mt-MRCA) is the matrilineal most recent common ancestor (MRCA) of all currently living humans, i.e., the most recent woman from whom all living humans descend in an unbroken line purely through their mothers, and through the mothers of those mothers, back until all lines converge on one woman. Mitochondrial Eve lived later than Homo heidelbergensis and the emergence of Homo neanderthalensis, but earlier than the out of Africa migration,[2] but her age is not known with certainty; a 2009 estimate cites an age between c. 152 and 234 thousand years ago (95% CI);[3] a 2013 study cites a range of 99–148 thousand years ago.[4]

Because mitochondrial DNA (mtDNA) is almost exclusively passed from mother to offspring without recombination (see the exception at paternal mtDNA transmission), most mtDNA in every living person differs only by the mutations that have occurred over generations in the germ cell mtDNA since the conception of the original “Mitochondrial Eve”.

The male analog to the Mitochondrial Eve is the Y-chromosomal Adam, the member of Homo sapiens sapiens from whom all living humans are patrilineally descended. Rather than mtDNA, the inherited DNA in the male case is the nuclear Y chromosome. Mitochondrial Eve and Y-chromosomal Adam need not have lived at the same time.[5]

As of 2013, estimates for mt-MRCA and Y-MRCA alike are still subject to substantial uncertainty; thus, Y-MRCA has been estimated to have lived during a wide range of times from 180,000 to 581,000 years ago[6][7][8] (with a most likely age of between 120,000 and 156,000 years ago, roughly consistent with the estimate for mt-MRCA[4][9]).

The name “Mitochondrial Eve” alludes to biblical Eve.[10] This has led to repeated misrepresentations or misconceptions in journalistic accounts on the topic. The title of “Mitochondrial Eve” is not permanently fixed to a single individual, but rather shifts forward in time over the course of human history as maternal lineages become extinct. Unlike her biblical namesake, she was not the only living human female of her time. However, by the definition of Mitochondrial Eve, her female contemporaries, though they may have descendants alive today, do not have any descendants today who descend in an unbroken female line of descent.

Ancestry.com Genetic Communities




Genetic Communities™ White Paper: Predicting fine-scale ancestral origins from the genetic sharing patterns among millions of individuals

Catherine A. Ball, Erin Battat, Jake K. Byrnes, Peter Carbonetto, Kenneth G. Chahine, Ross E. Curtis, Eyal Elyashiv, Ahna Girshick, Julie M. Granka, Harendra Guturu, Eunjung Han, Ariel Hippen Anderson, Eurie Hong, Amir Kermany, Natalie M. Myres, Keith Noto, Kristin A. Rand, Shiya Song, Yong Wang (in alphabetical order).

  1. Introduction

Section 1.0

AncestryDNA™ offers several genetic analyses to help customers discover, preserve, and share their family history. Some of the features offered to date are based exclusively on genetic information. These include a genetic ethnicity or ancestry inference (described in Ethnicity Estimate White Paper) and an identity-by-descent (IBD) or DNA matching analysis (Matching White Paper). Other features, like DNA Circles, rely on the integration of pedigree and IBD data across the entire AncestryDNA database (DNA Circles White Paper). Each of these features provides complementary information to an AncestryDNA member: (1) the ethnicity estimate provides a distant picture of a customer’s genetic origins, perhaps hundreds or thousands of years ago; (2) DNA matches provide a customer with a list of fellow AncestryDNA test-takers who are relatives and with whom she or he shares a common ancestor within the last 10 generations; (3) DNA Circles integrate IBD and pedigree data to provide a customer with groups of relatives that appear to share DNA with one another due to a specific shared ancestor, to potentially reinforce their connection to this ancestor. In combination, these features provide a detailed portrait of an individual’s genetic ancestry.

Here, we augment these DNA and pedigree-based insights even further with our new genetic communities feature (Figure 1.1). Instead of considering the IBD connection between each pair of customers in isolation, we simultaneously analyze more than 20 billion connections identified among over 2 million AncestryDNA customers as a large genetic network (described in Section 3). Intuitively, because the estimated IBD connections between individuals are likely due to recent shared ancestry (within the past 10 generations), broader patterns in this large network likely represent recent shared history. The result is that we can identify clusters of living individuals that share large amounts of DNA due to specific, recent shared history. For example, we identify groups of customers that likely descend from immigrants participating in a particular wave of migration (e.g. Irish fleeing the Great Famine), (Insert: Duke B. Montgomery, Genetic Genealogist, forced migration of African and South American Indians and enslavement of Native Americans), or customers that descend from ancestral populations that have remained in the same geographic location for many generations (e.g. the early settlers of the Appalachian Mountains and Blue Ridge Mountains). Following the identification of these clusters of individuals in the entire network, we can then assign any AncestryDNA customer to one or more of these clusters based on their IBD with other AncestryDNA members. Such assignment can provide a customer with insight into their recent ancestral history, in some cases traceable back to a historical event.

Figure 1.1

In the coming section (Section 2), I will turn away for a moment to describe haplogroups specific to Africans and African-Americans, their geographical locations and populations. I use the example of two families, one in the Blue Ridge Mountains of Virginia and the other living in two places, Henderson and Sayersville, Kentucky. You will be able to use this data to look at your own haplogroup. Genetic Communities is fine as long as you can apply it to yourself and to your family research. After that, we will turn back to the scientific principles behind the genetic network (Sections 3 and 4), how Ancestry identifies clusters within it (Sections 5 and 6), their use of DNA and pedigree data to annotate these clusters (Section 7), and finally their method for assigning customer samples to these clusters (Section 8).


Discover Your Roots with an African DNA Test, Why Test and What test does what for you?

Discover Your Roots


After many hours and days exploring the literature on DNA testing (the pros and cons of testing or not testing, what to test for, which companies best meet your needs, the testing possibilities and procedures, the algorithms used by each company, how to read results in simple terms, where to place your raw data after the test, and which companies will secure your information), it seemed appropriate to blog this to you.

Many African-Americans and others are using an African DNA test to get answers about their ethnic ancestry.

Typical questions include the following:

  • How much of my genetic heritage is African?
  • What regions of Africa do my ancestors come from?
  • Where does the remainder of my heritage come from?
  • Is my African ancestry from my father’s lineage or my mother’s?
  • Do my physical features reflect African ancestry or something else?

Fortunately, there are several reasonably priced African DNA tests that answer these and other questions about one’s ethnic ancestry.

The tests all use home test kits and sample collection is easy and painless. Depending on which company you use, you might wipe some cells from inside your cheek with a little swab or spit some saliva into a tube. No blood is required.

Here are my top seven recommendations for anyone interested in an African DNA test.

1. Ancestry DNA

AncestryDNA (http://www.ancestry.com) recently rose to the top of this list. Both men and women can take the test and it will identify other people in the database who share common ancestors with you. It is an autosomal test similar in technology to Family Finder (http://www.ftdna.com) and 23andMe (http://www.23andMe.com), discussed below.

The test includes an Ethnicity Estimate that summarizes the percentage contributions of different regions of the world to your overall ancestry. That estimate now breaks African Ancestry into nine regions:

  • Africa North
  • Senegal
  • Ivory Coast / Ghana
  • Benin / Togo
  • Cameroon / Congo
  • Mali
  • Nigeria
  • Africa Southeast Bantu
  • Africa South-Central Hunter-Gatherers

This is the first widely recognized, legitimate DNA test to provide this detailed a breakdown of African ancestry.

2. Family Finder, which includes Population Finder

Family Finder is an autosomal DNA test from Family Tree DNA. It’s widely used by genealogists, including those interested in African American genealogy.

The company will compare your DNA against a database of other users to find genetic matches. Most often these genetic matches will be cousins, having a common ancestor with you somewhere in the last five or so generations.

By emailing your matches you can connect with previously unknown relatives and learn much more about your family tree.

As part of the Family Finder test, you receive a myOrigins report, formerly called Population Finder, where the company compares your DNA with over 60 reference populations from around the world. This is a biogeographical analysis of the DNA you received from ALL of your ancestors.

The African part of your DNA may place you in any of four subcontinental groups based on similarities to certain scientifically studied populations. The groups and populations are as follows:

  • Central African: Biaka Pygmy, Mbuti Pygmy
  • East African: Bantu (Kenya)
  • Southern African: Bantu (South Africa), San
  • West African: Mandenka, Yoruba

Very few people outside Africa are 100% African. Population Finder will classify the remaining portion of your ancestry using other populations.

3. Y-DNA Test at Family Tree DNA

Family Tree DNA also offers a Y-DNA test, which tracks your paternal line. Since only men have a Y-chromosome, only men can take this test. But women can still test a man from their paternal line, e.g. a brother, their father, their father’s brother, or a son of their father’s brother.

Like Family Finder, this test finds genetic matches who share a common ancestor. But with the Y-DNA test you know the common ancestor has to be a male in the direct paternal line like your father’s father’s father etc.

The Y-DNA test will also predict a man’s Y-DNA haplogroup. And many haplogroups are clearly tied to origins in sub-Saharan Africa. This is the real indicator of your paternal line’s ethnic ancestry.

TIP: If you’re interested in finding genetic matches, you should order the Y-DNA 37 test, which checks 37 markers. But if you’re only interested in determining your haplogroup, you only need 12 markers. I suggest you go to Family Tree DNA and look for the combination package of Family Finder plus Y-DNA 12. The combo price is an excellent buy.

If you later decide that you want to discover your precise position in the Y-DNA tree of life, you can upgrade to more markers or even order a Deep Clade test. That will tell you exactly which subclade of your haplogroup you’re in. In many cases this can tighten the geographic origins of your paternal line.

4. mtDNA Test at Family Tree DNA

Both men and women have mitochondrial DNA (mtDNA) to test. But only women pass it on to their children. So mtDNA is the test to track your maternal line. That’s your mother’s mother’s mother etc.

As with the test described previously, you will probably see matches with other users. But mtDNA mutates so slowly that your common ancestors may have lived thousands of years ago. That makes mtDNA less useful than Y-DNA as a genealogy tool.

Still, mtDNA also has a haplogroup that relates directly to the origins of your maternal line. And some of those are clear indicators of African origin.

5. 23andMe Which Includes Ancestry Composition

23andMe is another autosomal DNA test like Family Finder. This test can also serve as an African DNA test, because it has an Ancestry Composition feature that tells you what parts of the world your ancestors lived in a few hundred years ago.

This admixture report is similar to the Population Finder feature of the Family Finder test. It reports on African Ancestry from these three regions:

  • West African
  • East African
  • Central and South African

However, if you also test at least one of your parents on 23andMe, this test can split your ancestral percentages into your paternal and maternal sides.

23andMe also has a DNA Relatives feature that’s similar to Family Finder, and it will estimate your Y-DNA and mtDNA haplogroups. So if you want to cover all your bases, the 23andMe test can be a great value as an African DNA test.

6. Y-DNA and mtDNA Testing at African DNA

Harvard University Professor Henry Louis Gates, Jr. was a pioneer in African DNA testing. He founded African DNA to encourage more African-Americans to get their DNA tested.

The company offers a Y-DNA test of 25 markers and an mtDNA test like the mtDNA Plus test at Family Tree DNA. In fact, Family Tree DNA is affiliated with the company and does their DNA testing.

Now they can also offer the Family Finder test that they renamed Ancestry Finder.

Note that African DNA only offers one paternal line Y-DNA test and one maternal line mtDNA test. They do not offer additional Y-DNA markers, the Full Mitochondrial Sequence (FMS) test, or Deep Clade testing. You need to order those tests directly from Family Tree DNA.

The African DNA web site does have more content specific to African DNA testing than any of the more general DNA testing companies. So I encourage anyone looking for an African DNA test to visit the site and learn all you can.

Uniquely, African DNA does offer some higher priced packages that combine DNA testing with genealogy research to build your family tree.

For most African-Americans there are no genealogical records prior to the 1870 census, when last names of former slaves began to be recorded. If you want someone to build a few generations of your family tree, however, this is an option to consider.

MONEY-SAVING TIP: If you’re not ordering a package with genealogy research, be sure to recheck Family Tree DNA to compare prices before placing an order with African DNA. At the time of this writing, you can order the same Y-DNA and mtDNA tests directly through Family Tree DNA for significantly less money.

7. Y-DNA and mtDNA Testing at African Ancestry

African Ancestry is another company that specifically features African DNA tests. Like the companies above, they check your Y-DNA and mtDNA to determine your paternal and maternal lineages. Since their web site does not provide details of either test, I cannot compare them.

Unlike Family Tree DNA, they do not keep a database of customer results, so you will not receive any matches to people with similar DNA. Since the company does not have an autosomal test like Family Finder and 23andMe, it cannot provide any admixture percentages. You won’t learn anything about ancestors outside your narrow paternal and maternal lines.

I found some interesting data on the web site. Even though this site specifically attracts people of African descent, 35% of the paternal line tests show European ancestry. Much of that non-African DNA was introduced into the family tree during the era of slavery. In addition, 8% of their maternal line samples show non-African haplogroups.

An article in the Wall Street Journal was critical of the African DNA test reports provided by this company. Independent experts say that mitochondrial DNA is not sufficient to nail down an ancestor’s origin to a specific country.

Furthermore, the large migrations of Africans over the last 3,000 years mean that the typical black American’s DNA will match Africans living today in several countries. Even the founder of African DNA was quoted in the article as saying that the country-specific reports his company provides are largely a “best guess.”

The testing prices at African Ancestry are higher than those of the companies listed above. Even if you have your African DNA test done elsewhere, the African Ancestry web site includes some interesting information on African heritage and a list of country-by-country resources in Africa for genealogists.

Other African DNA Tests of Uncertain Quality

DNA Tribes uses autosomal markers representing all your ancestors. But unlike AncestryDNA, Family Finder and 23andMe, which check nearly a million autosomal SNPs, DNA Tribes checks a maximum of 27 STRs.

I won’t try to explain the difference between an STR and a SNP here. But autosomal STRs are what police forces around the world have been collecting from criminals for decades.

The company examined 383,000 STR records and claims to have identified major genetic regions around the world. They compare your DNA with their proprietary database and issue reports on your most closely matched regions.

The company does not share its database or reveal its methods. And independent experts are skeptical when such detailed reports arise from so few markers.

Roots for Real offers Y-DNA, mtDNA, and an autosomal test based on 16 STR markers. They position their autosomal admixture test as an African DNA test. But their database is only about one third the size of the already questionable DNA Tribes test. And all of their tests are overpriced compared to market leaders Family Tree DNA, 23andMe, and Ancestry.com.

Warning: These tests are based on sound science. But if you don’t know exactly what you’re doing, you can take the wrong test for your situation. It’s also easy to pay too much…settle for incomplete data…or misinterpret the results.

My test:

Y-DNA 111 markers, mtDNA + Plus, Full Sequence DNA and Family Finder DNA

My Haplogroup: Y-DNA (E-M2) and mtDNA (L2a1a2)




http://www.23andme.com/DNA accessed 2/10/2017

http://www.ancestry.com/DNA accessed 2/10/2017

http://www.myheritage.com/DNA accessed 2/12/2017

http://www.africanancestry.com/home/ accessed 2/04/2017

https://www.africandna.com/ accessed 2/04/2017


Megan Smolenyak Smolenyak and Ann Turner, “Trace Your Roots with DNA” published 2004

Blaine Bettinger, “The Family Tree Guide to DNA Testing and Genetic Genealogy” published 2016
