Centimorgans in Genetic Genealogy

Reprinted from the International Society of Genetic Genealogy (ISOGG), August 2, 2017. No adjustments were made to this article, which represents the ISOGG position.

 

In genetic genealogy, a centiMorgan (cM) or map unit (m.u.) is a unit of recombinant frequency which is used to measure genetic distance. It is often used to imply distance along a chromosome, and takes into account how often recombination occurs in a region. A region with few cMs undergoes relatively less recombination. The number of base pairs to which it corresponds varies widely across the genome (different regions of a chromosome have different propensities towards crossover). One centiMorgan corresponds to about 1 million base pairs in humans on average. The centiMorgan is equal to a 1% chance that a marker at one genetic locus on a chromosome will be separated from a marker at a second locus due to crossing over in a single generation.

The genetic genealogy testing companies 23andMe, AncestryDNA, Family Tree DNA and MyHeritage DNA use centiMorgans to denote the size of matching DNA segments in autosomal DNA tests. Segments which share a large number of centiMorgans in common are more likely to be of significance and to indicate a common ancestor within a genealogical timeframe.

The centiMorgan was named in honor of geneticist Thomas Hunt Morgan by his student Alfred Henry Sturtevant. Note that the parent unit of the centiMorgan, the Morgan, is rarely used today.

23andMe and Family Tree DNA both use HapMap to infer their centiMorgans.

centiMorgans vs megabases

CentiMorgans are interpolated numbers that take into consideration each area of a chromosome and its propensity to recombine. This means that if two cousins share 40 cM on chromosome 1, and two different cousins share 40 cM on chromosome 5, both pairs can be predicted to share a similar degree of relationship statistically. The amount of recombination that a megabase represents varies by location, so in the same scenario, if both pairs shared 40 Mb, it would be more difficult to ensure they are of a similar degree of relation without further accounting for location, chromosome and other factors.[1]

Ann Turner provides a useful explanation: “I think of the cM as being a unit of ‘effective’ distance. As an analogy, a mile is a fixed quantity (5280 feet), and so are megabases. But the probability that a person can walk a mile in 20 minutes is more fluid. If the terrain is very rough, the “effective” distance of a literal mile might be more like two miles if you’re trying to arrive at a certain time. We’re more interested in the probability that a segment will be passed on intact than the size of the segment in Mb”.[2]

As the cM is an empirical measure, based on recombination events in a particular dataset of parents and offspring, it can vary somewhat from study to study. This set of maps for each chromosome shows that the general shape of the centiMorgan vs megabase curve is similar for two datasets, but the absolute values are not quite the same:

http://web.archive.org/web/20070113005025/http://compgen.rutgers.edu/maps/compare.pdf

cM values per chromosome

The following table compares cM values per chromosome at Family Tree DNA, GEDmatch, and 23andMe. AncestryDNA uses 3475 as the total cM according to the help screen for confidence level in a DNA match. This presumably excludes the X chromosome.

[Image: table of cM values per chromosome at FTDNA, GEDmatch and 23andMe]

Probability of crossover

The following chart shows the estimated probability that a segment will be affected by a crossover. The chart does not take into account some variables such as inversions and different recombination rates for males and females.

[Image: chart of crossover probability in centiMorgans]
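As a rough illustration of how such a probability can be estimated, the sketch below assumes a simple Poisson model of crossovers with no interference (often called Haldane's model), where a segment of d cM expects d/100 crossovers per generation. This is an assumption made for illustration, not necessarily the method behind the chart, and like the chart it ignores inversions and sex-specific recombination rates. The function name is hypothetical.

```python
import math


def crossover_probability(length_cm):
    """Chance that a segment is hit by at least one crossover in one generation.

    Assumes crossovers follow a Poisson process with an expected
    length_cm / 100 events per meiosis (no interference).
    """
    return 1.0 - math.exp(-length_cm / 100.0)


# Print the estimate for a few segment sizes.
for cm in (1, 7, 10, 50, 100):
    print(f"{cm:>3} cM: {crossover_probability(cm):.3f}")
```

For very short segments the result is close to the 1% per centiMorgan rule of thumb; for long segments it flattens out because multiple crossovers can fall in the same segment.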

Converting centiMorgans into percentages

In order to get an approximate percentage of shared DNA from a Family Tree DNA Family Finder test, take all of the segments above 5 cM, add them together and then divide by 68.

The way the calculation works is that your total genome in cMs with the Family Finder test is 6770 cM. A half-identical match (such as a parent/child) is 3385 cM. This number has to be doubled to represent both the maternal and paternal sides giving a total of 6770 cM. Matt Dexter explains: “The reason the number is not 6770 or 6800, but rather 68, is that it saves an additional step doing the math to convert an answer to percent. For example, 3385 / 6770 = .5 then as a second step, .5 times 100 = 50%. Using 68 to start with saves the added math step. So (3385 / 6800) * 100 is the same thing as 3385 / 68, which results in = 50%.”[3]
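The rule of thumb above can be sketched in a few lines. The function name and the example segment sizes are hypothetical; the 5 cM cutoff and the divisor of 68 come from the text.

```python
def percent_shared(segments_cm, min_cm=5.0, divisor=68.0):
    """Approximate percentage of shared DNA from Family Finder segment sizes.

    Adds up every segment larger than min_cm and divides by 68,
    which is the 6800 cM total with the conversion to percent folded in.
    """
    counted = [cm for cm in segments_cm if cm > min_cm]
    return sum(counted) / divisor


# A parent/child match reported as a single 3385 cM total.
print(round(percent_shared([3385.0]), 1))

# A hypothetical cousin match; the 4.1 cM segment is ignored.
print(round(percent_shared([40.2, 22.5, 4.1, 12.0]), 2))
```

Because 68 is a rounded divisor, the parent/child case comes out at about 49.8% rather than exactly 50%.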

Human reference genome

The centiMorgan totals per chromosome are based on the Human Reference Genome. 23andMe and Ancestry DNA use Build 37. Family Tree DNA use Build 37 for matching but Build 36 for segment boundaries in the Chromosome Browser. Raw data files are provided in both formats. Build 37 filled in quite a few gaps, and the number of base pairs in each of the chromosomes was longer in Build 37 as compared to Build 36. Consequently the cM totals per chromosome are lower for Family Finder than they are for 23andMe. GedMatch use Build 36, and convert AncestryDNA and 23andMe data from Build 37 to Build 36 for backward compatibility.

The latest version of the Human Reference Genome, Build 38, was released in December 2013. However, none of the companies have as yet adopted Build 38 and there is a “gentleman’s agreement” in place to stick with Build 37 for the present time.

DNA Triangulation, What?

Triangulation is a term derived from surveying to describe a method of determining the Y-STR or mitochondrial DNA ancestral haplotype using two or more known data points. The term “Genetic Triangulation” was coined by genetic genealogist Bill Hurst in 2004.

Here is a 3-step process for Triangulation: Collect, Arrange, Compare/Group.

  1. Collect all the Match-segments you can. I recommend testing at all three companies (23andMe, FTDNA, and AncestryDNA), and using GEDmatch. But, wherever you test, get all of your segments into a spreadsheet. If you are using more than one company, you need to download, and then arrange, the data in the same format as your spreadsheet. Downloading/arranging is best when starting a new spreadsheet. Downloading avoids typing errors, but direct typing is sometimes easier for updates. I recommend deleting all segments under 7cM – most of them will be IBC/IBS (false segments) anyway, and even the ones which may be IBD are very difficult to confirm as such. You are much better off doing as much Triangulation as you can with segments over 7cM (or use a 10cM threshold if you wish), and then adding smaller segments back in later, if you want to analyze them. NB: Some of your closer Matches will share multiple segments with you – each segment must be entered as a separate row in your spreadsheet. The minimum requirement for a Triangulation with a spreadsheet includes columns for MatchName, Chromosome, SegmentStartLocation, SegmentEndLocation, cMs and TG. Most of us also have columns for SNPs, company, testee, and any other information of interest. Perhaps I need a separate blog post about spreadsheets… ;>j
  2. Arrange the segments by sorting the entire spreadsheet (Ctrl-A) by Chromosome and Segment StartLocation. This is one sort with two levels – the Chromosome column is the first level. This puts all of your segments in order – from the first one on Chromosome 1 to the last one on Chromosome 23 (for sorting purposes I recommend changing Chromosome X to 23 or 23X so it will sort after 22). This serves the purpose of putting overlapping segments close to each other in the spreadsheet where they are easy to compare.
  3. Compare/Group overlapping segments. All of these segments are shared segments with you. So with segments that overlap each other, you want to know if they match each other at this location. If so this is Triangulation. This comparison is done a little differently at each company, but the goal is the same: two segments either match each other, or they don’t (or there isn’t enough overlapping segment information to determine a match). All the Matches who match each other will form a Triangulated Group, on one chromosome – call this TG A (or any other name you want). Go through the same process with the segments that didn’t match TG A. They will often match each other and will form a second, overlapping TG, on the other chromosome – call this TG B. [Remember you have two of each numbered chromosome.] So to review, and put it all a different way: All of your segments (every row of your spreadsheet) will go into one of 4 categories:
  • TG A [the first one with segments which match each other]
  • TG B [the other, overlapping, one with segments which match each other]
  • IBC/IBS [the segments don’t match either TG A or TG B]
  • Undetermined [there are not enough segments to form both TG A and TG B and/or there isn’t enough overlapping data to determine a match.]
  • NB: None of the segments in TG A should match any of the segments in TG B.
  1. At GEDmatch – the comparisons are easy. Just compare two kit numbers using the one-to-one utility to see if they match each other on the appropriate segment. The ones that do are Triangulated. You may also use the Tier1 Triangulation utility or the Segment utility. I prefer using the one-to-one utility and Chrome.
  1. At 23andMe you have several different utilities:
  • Family Inheritance: Advanced lets you compare up to 5 Matches at a time. You may also request a spreadsheet of all your shared segments; sort that by chromosome and SegmentStart, and check to see if two of your Matches match each other. The ones that do are Triangulated.
  • Countries of Ancestry: Sort a Match’s spreadsheet by chromosome and SegmentStart, search for your own name, and highlight the overlapping segments. The Matches on this highlighted list who are also on overlapping segments in your spreadsheet are Triangulated (the CoA spreadsheet confirms the match between two of your Matches).
  1. At FTDNA it’s a little trickier, because they don’t have a utility to compare two of your Matches. So the most positive method is to contact the Matches and ask them to confirm if they match your overlapping Matches, or not. The ones that do are Triangulated. An almost-as-good alternative is to use the InCommonWith utility. Look for the 2-squiggly-arrows icon next to a Match’s name, click that, and select In Common With to get a list of your Matches who also match the Match you started with. Compare that list of Matches with the list of Matches with overlapping segments in your spreadsheet. Matches on both lists are considered to be Triangulated. Although this is not a foolproof method, it works most of the time. And if you find three or four ICW Matches in the same TG, the odds are much closer to 100%. Remember, every segment in your spreadsheet must go in one TG or the other, or be IBC/IBS, or be undetermined. If a particular Match, in one TG, is critical to your analysis, then try hard to confirm the Triangulation by contacting the Matches.
  1. AncestryDNA has no DNA analysis utilities. You need to convince your Matches to upload their raw data to GEDmatch (for free) or FTDNA (for a fee), and see the paragraphs above.
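The Collect and Arrange steps, plus the first half of Compare/Group, can be sketched in code instead of a spreadsheet. This is a minimal illustration with hypothetical names and made-up segment data; it only lists candidate pairs whose segments overlap. Whether two Matches actually match each other on that segment still has to be checked with the company tools described above.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Segment:
    match_name: str
    chromosome: str  # "1".."22", or "X" / "23" / "23X"
    start: int       # segment start location
    end: int         # segment end location
    cm: float


def chromosome_key(chrom):
    """Sort key that places the X chromosome after chromosome 22."""
    return 23 if chrom.upper() in ("X", "23", "23X") else int(chrom)


def prepare(segments, min_cm=7.0):
    """Drop segments under min_cm, then sort by chromosome and start."""
    kept = [s for s in segments if s.cm >= min_cm]
    return sorted(kept, key=lambda s: (chromosome_key(s.chromosome), s.start))


def overlapping_pairs(rows):
    """List pairs of different Matches whose segments overlap."""
    pairs = []
    for a, b in combinations(rows, 2):
        same_chrom = chromosome_key(a.chromosome) == chromosome_key(b.chromosome)
        overlap = a.start < b.end and b.start < a.end
        if same_chrom and overlap and a.match_name != b.match_name:
            pairs.append((a.match_name, b.match_name, a.chromosome))
    return pairs


# Made-up example rows, one per shared segment.
segments = [
    Segment("Alice", "5", 10_000_000, 30_000_000, 22.0),
    Segment("Bob", "5", 20_000_000, 45_000_000, 25.5),
    Segment("Carol", "X", 5_000_000, 15_000_000, 9.0),
    Segment("Dave", "5", 50_000_000, 52_000_000, 3.1),
    Segment("Erin", "1", 1_000_000, 12_000_000, 14.0),
]

rows = prepare(segments)
for pair in overlapping_pairs(rows):
    print(pair)
```

Each pair printed is a candidate for comparison, not a confirmed Triangulation; the pairs that do match each other go into the same TG.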

Comments to improve this blog post are welcomed.

How 23andMe identifies your DNA Relatives

Scientific Details

Learn about your DNA Relatives, the diverse group of 23andMe customers who have DNA in common with you. Only individuals who have chosen to participate in DNA Relatives are the subject of this report.

To identify your DNA Relatives, we use an algorithm that finds segments of your DNA that are identical to DNA segments of other 23andMe customers. When these segments are sufficiently long, we infer that they were inherited from a recent common ancestor. These segments are known as “identical by descent,” or IBD. Our algorithm searches for these matches across virtually your entire genome, so we can identify DNA Relatives on any branch of your family tree.

Note: IBD/Half IBD:

The comparison results in this feature display shared segments of DNA on separate lines representing each chromosome pair, and label the shared segments as Half IBD, or identical by descent. Because you inherit one half of your DNA from your mother and the other half from your father, IBD segments typically occur on only a single chromosome. Half IBD refers to the amount of the genome in centiMorgans (cM) that contains an IBD segment on either chromosome. The percent DNA shared in DNA Relatives is based on this number.

Your half IBD and shared segments vary based on the closeness of your relationship with the matches with whom you are comparing. Closer relatives will share thousands of cM and many segments in common; more distant relatives may share only one. For some of your shares, if you connected outside of DNA Relatives, you may not share any segments at all.

Every time DNA is passed from one generation to the next, the two chromosomes in each pair are randomly shuffled with each other in a process called recombination. Then, only half of this new DNA — one set of chromosomes — is passed down to each child. The total amount of DNA passed down from an ancestor is cut approximately in half each generation. Through this process, long inherited segments are broken up generation by generation into multiple shorter ones and sometimes lost altogether.

Despite all of this generational shuffling, DNA Relatives is highly sensitive and can pick up matches ranging from siblings and uncles to distant eighth cousins — individuals that share great-great-great-great-great-great-great grandparents with you. It may not always be obvious how you share a connection with someone, but that’s where our DNA Relatives tool comes in. Visit the tool to find out more about your matches and get in touch to learn about your family history.
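The halving described above can be turned into a small back-of-the-envelope calculation. This is only a sketch of the expected average; the function name is hypothetical, and the real amount inherited from any one ancestor varies because recombination is random and segments can be lost altogether.

```python
def expected_fraction(generations_back):
    """Expected share of your DNA from one ancestor this many generations back."""
    return 0.5 ** generations_back


# Parents are 1 generation back, grandparents 2, and so on.
# The 7x-great-grandparents shared by eighth cousins are 9 generations back.
for generations in range(1, 10):
    share = expected_fraction(generations) * 100
    print(f"{generations} generation(s) back: {share:.3f}%")
```

At nine generations the expected share from a single ancestor is about 0.2%, which is why matches at that distance usually rest on a single short segment, if any.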

Shared segments between cousins

Inheritance family tree graphic.

A closer look at the matching segment

An example graphic showing a matching segment between you and your cousin.

Sources of information used in this report

The Your DNA Family report provides aggregated summaries of several attributes of your DNA Relatives. The following information sources are used in the report:

Report Section | Source
Close to distant DNA Relatives | Computed IBD results from the DNA Relatives tool.
Locations of your DNA Relatives | Answers to survey questions by your DNA Relatives.
Ancestries of your DNA Relatives | Computed results from the Ancestry Composition report.
Traits and behaviors in your 23andMe DNA Family | Answers to survey questions by your DNA Relatives.

Traits in Your DNA Family

Discover Your Roots with DNA

Source: DNA Testing Adviser (www.dna-testing-adviser.com/african-dna-test). Accessed on May 18, 2017.

Discover Your Roots with an African DNA Test

Many African Americans and others are using an African DNA test to get answers about their ethnic ancestry.

Typical questions include the following:

  • How much of my genetic heritage is African?
  • What regions of Africa do my ancestors come from?
  • Where does the remainder of my heritage come from?
  • Is my African ancestry from my father’s lineage or my mother’s?
  • Do my physical features reflect African ancestry or something else?

Fortunately, there are several reasonably-priced African DNA tests that answer these and other questions about one’s ethnic ancestry.

The tests all use home test kits and sample collection is easy and painless. Depending on which company you use, you might wipe some cells from inside your cheek with a little swab or spit some saliva into a tube. No blood is required.

Here are my top seven recommendations for anyone interested in an African DNA test.

1. Ancestry DNA

AncestryDNA recently rose to the top of this list. Both men and women can take the test and it will identify other people in the database who share common ancestors with you. It is an autosomal test similar in technology to Family Finder and 23andMe, discussed below.

The test includes an Ethnicity Estimate that summarizes the percentage contributions of different regions of the world to your overall ancestry. That estimate now breaks African Ancestry into nine regions:

  • Africa North
  • Senegal
  • Ivory Coast / Ghana
  • Benin / Togo
  • Cameroon / Congo
  • Mali
  • Nigeria
  • Africa Southeast Bantu
  • Africa South-Central Hunter-Gatherers

This is the first widely recognized, legitimate DNA test to provide this detailed a breakdown of African ancestry.

2. Family Finder, which includes Population Finder

Family Finder is an autosomal DNA test from Family Tree DNA. It’s widely used by genealogists, including those interested in African American genealogy.

The company will compare your DNA against a database of other users to find genetic matches. Most often these genetic matches will be cousins, having a common ancestor with you somewhere in the last five or so generations.

By emailing your matches you can connect with previously unknown relatives and learn much more about your family tree.

As part of the Family Finder test, you receive a myOrigins report, formerly called Population Finder, where the company compares your DNA with over 60 reference populations from around the world. This is a biogeographical analysis of the DNA you received from ALL of your ancestors.

The African part of your DNA may place you in any of four subcontinental groups based on similarities to certain scientifically studied populations. The groups and populations are as follows:

  • Central African: Biaka Pygmy, Mbuti Pygmy
  • East African: Bantu (Kenya)
  • Southern African: Bantu (South Africa), San
  • West African: Mandenka, Yoruba

Very few people outside Africa are 100% African. Population Finder will classify the remaining portion of your ancestry using other populations.

3. Y-DNA Test at Family Tree DNA

Family Tree DNA also offers a Y-DNA test, which tracks your paternal line. Since only men have a Y-chromosome, only men can take this test. But women can still test a man from their paternal line, e.g. a brother, their father, their father’s brother, or a son of their father’s brother.

Like Family Finder, this test finds genetic matches who share a common ancestor. But with the Y-DNA test you know the common ancestor has to be a male in the direct paternal line like your father’s father’s father etc.

The Y-DNA test will also predict a man’s Y-DNA haplogroup. And many haplogroups are clearly tied to origins in sub-Saharan Africa. This is the real indicator of your paternal line’s ethnic ancestry.

TIP: If you’re interested in finding genetic matches, you should order the Y-DNA 37 test, which checks 37 markers. But if you’re only interested in determining your haplogroup, you only need 12 markers. I suggest you go to Family Tree DNA and look for the combination package of Family Finder plus Y-DNA 12. The combo price is an excellent buy.

If you later decide that you want to discover your precise position in the Y-DNA tree of life, you can upgrade to more markers or even order a Deep Clade test. That will tell you exactly which subclade of your haplogroup you’re in. In many cases this can tighten the geographic origins of your paternal line.

4. mtDNA Test at Family Tree DNA

Both men and women have mitochondrial DNA (mtDNA) to test. But only women pass it on to their children. So mtDNA is the test to track your maternal line. That’s your mother’s mother’s mother etc.

As with the test described previously, you will probably see matches with other users. But mtDNA mutates so slowly that your common ancestors may have lived thousands of years ago. That makes mtDNA less useful than Y-DNA as a genealogy tool.

Still, mtDNA also has a haplogroup that relates directly to the origins of your maternal line. And some of those are clear indicators of African origin.

5. 23andMe Which Includes Ancestry Composition

23andMe is another autosomal DNA test like Family Finder. This test can also serve as an African DNA test, because it has an Ancestry Composition feature that tells you in what parts of the world your ancestors lived a few hundred years ago.

This admixture report is similar to the Population Finder feature of the Family Finder test. It reports on African Ancestry from these three regions:

  • West African
  • East African
  • Central and South African

However, if you also test at least one of your parents on 23andMe, this test can split your ancestral percentages into your paternal and maternal sides.

23andMe also has a DNA Relatives feature that’s similar to Family Finder and it will estimate your Y-DNA and mtDNA haplogroups. So if you want to cover all your bases—then the 23andMe test can be a great value as an African DNA test.

6. Y-DNA and mtDNA Testing at African DNA

Harvard University Professor Henry Louis Gates, Jr. was a pioneer in African DNA testing. He founded African DNA to encourage more African Americans to get their DNA tested.

The company offers a Y-DNA test of 25 markers and an mtDNA test like the mtDNA Plus test at Family Tree DNA. In fact, Family Tree DNA is affiliated with the company and does their DNA testing.

Now they can also offer the Family Finder test that they renamed Ancestry Finder.

Note that African DNA only offers one paternal line Y-DNA test and one maternal line mtDNA test. They do not offer additional Y-DNA markers, the Full Mitochondrial Sequence (FMS) test, or Deep Clade testing. You need to order those tests directly from Family Tree DNA.

The African DNA web site does have more content specific to African DNA testing than any of the more general DNA testing companies. So I encourage anyone looking for an African DNA test to visit the site and learn all you can.

Uniquely, African DNA does offer some higher priced packages that combine DNA testing with genealogy research to build your family tree.

For most African-Americans there are no genealogical records prior to the 1870 census, when last names of former slaves began to be recorded. If you want someone to build a few generations of your family tree, however, this is an option to consider.

MONEY-SAVING TIP: If you’re not ordering a package with genealogy research, be sure to recheck Family Tree DNA to compare prices before placing an order with African DNA. At the time of this writing, you can order the same Y-DNA and mtDNA tests directly through Family Tree DNA for significantly less money.

7. Y-DNA and mtDNA Testing at African Ancestry

African Ancestry is another company that specifically features African DNA tests. Like the companies above, they check your Y-DNA and mtDNA to determine your paternal and maternal lineages. Since their web site does not provide details of either test, I cannot compare them.

Unlike Family Tree DNA, they do not keep a database of customer results, so you will not receive any matches to people with similar DNA. Since the company does not have an autosomal test like Family Finder and 23andMe, it cannot provide any admixture percentages. You won’t learn anything about ancestors outside your narrow paternal and maternal lines.

I found some interesting data on the web site. Even though this site specifically attracts people of African descent, 35% of the paternal line tests show European ancestry. Much of that non-African DNA was introduced into the family tree during the era of slavery. In addition, 8% of their maternal line samples show non-African haplogroups.

An article in the Wall Street Journal was critical of the African DNA test reports provided by this company. Independent experts say that mitochondrial DNA is not sufficient to nail down an ancestor’s origin to a specific country.

Furthermore, the large migrations of Africans over the last 3,000 years mean that the typical black American’s DNA will match Africans living today in several countries. Even the founder of African DNA was quoted in the article as saying that the country-specific reports his company provides are largely a “best guess.”

The testing prices at African DNA are higher than those of the companies listed above. Even if you have your African DNA test done elsewhere, the African Ancestry web site includes some interesting information on African heritage and a list of country-by-country resources in Africa for genealogists.

Other African DNA Tests of Uncertain Quality

DNA Tribes uses autosomal markers representing all your ancestors. But unlike AncestryDNA, Family Finder and 23andMe, which check nearly a million autosomal SNPs, DNA Tribes checks a maximum of 27 STRs.

I won’t try to explain the difference between an STR and a SNP here. But autosomal STRs are what police forces around the world have been collecting from criminals for decades.

The company examined 383,000 STR records and claims to have identified major genetic regions around the world. They compare your DNA with their proprietary database and issue reports on your most closely matched regions.

The company does not share its database or reveal its methods. And independent experts are skeptical when such detailed reports arise from so few markers.

Roots for Real offers Y-DNA, mtDNA, and an autosomal test based on 16 STR markers. They position their autosomal admixture test as an African DNA test. But their database is only about one third the size of the already questionable DNA Tribes test. And all of their tests are overpriced compared to market leaders Family Tree DNA, 23andMe, and Ancestry.com.

 

Mitochondrial Eve (mtDNA)

Mitochondrial Eve

From Wikipedia, the free encyclopedia
Haplogroup: Modern humans
Possible time of origin: c. 100–230 kya[1]
Possible place of origin: East Africa
Ancestor: n/a
Descendants: Mitochondrial macro-haplogroups L0, L1, and L5
Defining mutations: None

In human genetics, the Mitochondrial Eve (also mt-Eve, mt-MRCA) is the matrilineal most recent common ancestor (MRCA) of all currently living humans, i.e., the most recent woman from whom all living humans descend in an unbroken line purely through their mothers, and through the mothers of those mothers, back until all lines converge on one woman. Mitochondrial Eve lived later than Homo heidelbergensis and the emergence of Homo neanderthalensis, but earlier than the out of Africa migration,[2] but her age is not known with certainty; a 2009 estimate cites an age between c. 152 and 234 thousand years ago (95% CI);[3] a 2013 study cites a range of 99–148 thousand years ago.[4]

Because mitochondrial DNA (mtDNA) is almost exclusively passed from mother to offspring without recombination (see the exception at paternal mtDNA transmission), most mtDNA in every living person differs only by the mutations that have occurred over generations in the germ cell mtDNA since the conception of the original “Mitochondrial Eve”.

The male analog to the Mitochondrial Eve is the Y-chromosomal Adam, the member of Homo sapiens sapiens from whom all living humans are patrilineally descended. Rather than mtDNA, the inherited DNA in the male case is the nuclear Y chromosome. Mitochondrial Eve and Y-chromosomal Adam need not have lived at the same time.[5]

As of 2013, estimates for mt-MRCA and Y-MRCA alike are still subject to substantial uncertainty; thus, Y-MRCA has been estimated to have lived during a wide range of times from 180,000 to 581,000 years ago[6][7][8] (with a most likely age of between 120,000 and 156,000 years ago, roughly consistent with the estimate for mt-MRCA[4][9]).

The name “Mitochondrial Eve” alludes to biblical Eve.[10] This has led to repeated misrepresentations or misconceptions in journalistic accounts on the topic. The title of “Mitochondrial Eve” is not permanently fixed to a single individual, but rather shifts forward in time over the course of human history as maternal lineages become extinct. Unlike her biblical namesake, she was not the only living human female of her time. However, by the definition of Mitochondrial Eve, her female contemporaries, though they may have descendants alive today, do not have any descendants today who descend in an unbroken female line of descent.
