For several years, GEDmatch has given genetic genealogists, beginners and experts alike, the ability to search for matches among the kits in its database regardless of vendor. GEDmatch has also provided a rich suite of analysis programs that let users dig deeply into the genetic details of their matches, enhance the reports from their vendors, and even pursue their own original research ideas. Our algorithms are evolving to extract the most trustworthy and meaningful matching information possible from the markers common to a pair of kits, even when those markers are few.
Unfortunately, all too often, kits appear to share a DNA segment purely by chance. To combat this confusing phenomenon, we have recently developed a reliability measure that lets users assess the quality of a matching segment in an intuitive way. We also use the measure to guide our matching algorithms as they wring the greatest possible amount of useful information from the markers common to a pair of kits.
If we could assume that marker characteristics were uniform across all regions of every chromosome, we could use a “one size fits all” requirement for matching segments, as is sometimes done. Unfortunately, the relevant characteristics vary widely. A long segment containing few markers may be an accidental match. Conversely, short, marker-rich segments are often discarded even though they are profoundly non-random.
Using the characteristics of every marker in a segment, we compute the number of purely chance matches to that segment that we would expect to find in the database. That number is then used to classify the segment into one of several levels reflecting the likelihood that random matches will overwhelm the real ones. When a user runs a one-to-many search or a one-to-one comparison with a specified minimum segment length, the display can then include an estimate of validity for each segment found.
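The calculation described above can be sketched roughly as follows. This is only an illustration, not GEDmatch's actual algorithm, which this text does not spell out. It assumes that each marker is a biallelic SNP in Hardy-Weinberg equilibrium, that markers are statistically independent (real markers are not, because of linkage), and that a "match" means two kits share at least one allele at every marker in the segment. The function names, the level names, and the thresholds are all hypothetical.

```python
import math
from typing import Sequence


def chance_match_probability(minor_allele_freq: float) -> float:
    """Probability that two unrelated kits share at least one allele
    at a biallelic marker (assumes Hardy-Weinberg equilibrium)."""
    p = minor_allele_freq
    q = 1.0 - p
    # Only opposite homozygotes (AA in one kit, BB in the other) fail
    # to share an allele; that happens with probability 2 * p^2 * q^2.
    return 1.0 - 2.0 * p * p * q * q


def expected_chance_matches(minor_allele_freqs: Sequence[float],
                            database_size: int) -> float:
    """Expected number of kits in the database that match the whole
    segment by chance, treating markers as independent (an assumption)."""
    # Sum logarithms rather than multiplying, to avoid underflow on
    # segments with thousands of markers.
    log_prob = sum(math.log(chance_match_probability(f))
                   for f in minor_allele_freqs)
    return database_size * math.exp(log_prob)


def classify(expected: float,
             thresholds: tuple = (0.01, 1.0)) -> str:
    """Map the expected chance-match count to a level.
    Level names and thresholds are hypothetical placeholders."""
    low, high = thresholds
    if expected < low:
        return "likely valid"
    if expected < high:
        return "uncertain"
    return "likely chance"
```

Under these assumptions, a segment with only a handful of informative markers yields a large expected count of chance matches regardless of its physical length, while a segment dense with informative markers yields a count near zero. For example, `classify(expected_chance_matches([0.5] * 500, 1_000_000))` falls in the first level, whereas the same call with only 10 markers falls in the last.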
Segments designated as valid can be assumed to result from DNA inheritance rather than mere chance. Questions may still remain about how far back the shared DNA originates, but one confounding factor has been removed.