Tuesday, March 8, 2016

Do We Need To Be Scientists To Use Our DNA Results?

No, I don't think we need to be scientists to work with our segment data. We don't have access to the actual data we would need to scientifically analyze our own results and come up with IBD statistics anyway. We can look at population genetics, but these scientists haven't evaluated the results from the new large genetic genealogy databases. They can't tell us how likely it is that the 15 cM segment we share with a match comes from our common 4 x great-grandparents.

As genetic genealogy testing consumers, we must rely on the data analysis the companies provide us with. Companies like AncestryDNA are putting out statistics like these:

To me, a 99% probability is good. This would imply that my aunt's 5th cousin match, who shares a 20 cM and a 15.7 cM segment with her, more than likely shares IBD segments. The 99% probability means they likely share an ancestor within the past 6 generations. Even if we take away 10 cM for phasing, Timber, etc., we are still in the 99% probability range.

Here I've crossed out the lines in my Aunt Loretta's tree that are not Colonial American. Most of these ancestors came to the US at the turn of the 20th century. The only line that would correspond with this match would be through Mary Owens, born 1852. Her ancestors were all Colonial Americans. Aunt Loretta's 5th cousin match has mostly Colonial Virginian ancestry. I haven't traced any lines except one back to Colonial Virginia. I have my aunt's lines back 6 generations and more, except in Ireland. After examining the tree of our match and my aunt's tree, I feel confident that any matching segments more likely than not come from our Colonial American ancestors.

If we find a triangulation on one of these segments, with good overlap, and it happens to involve a 5th cousin match, I don't understand why that would be suspect either, as some would say. They would agree with this statement by Ancestry: "In populations that have grown rapidly in the past 200 to 300 years, individuals are more likely to be related to each other through two or more ancestral couples. Such population growth may also lead to marriages between related individuals, such as second cousins. As in the founder effect, this also leads to an excess of DNA sharing, but due to multiple common ancestors living one to two hundred years ago, rather than thousands of years ago." The triangulation naysayers would ask how you can tell where such a segment comes from if you might share several ancestral lines. Maybe you can use segment mapping, or build out your tree as far as possible. Some are also saying that segments shared between those with many Colonial American lines are the result of endogamy. Read Ancestry's statement again. It sounds like they are saying they aren't seeing the founder effect.
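For readers who like to see the idea spelled out, here is a rough sketch of what "triangulation with good overlap" can mean in practice. It is only an illustration under my own assumptions: the `Segment` class, the function names, the positions in the usage example, and the 5,000,000 base-pair minimum are all made up for the example and do not come from Ancestry or any other company's method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    """A shared segment between two testers (hypothetical structure)."""
    chromosome: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region two segments have in common, or None if there is none."""
    if a.chromosome != b.chromosome:
        return None
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    if start >= end:
        return None
    return Segment(a.chromosome, start, end)


def triangulated_region(ab: Segment, ac: Segment, bc: Segment,
                        min_bp: int = 5_000_000) -> Optional[Segment]:
    """Return the region shared by all three pairs (A-B, A-C, B-C).

    min_bp is an arbitrary illustrative cutoff for "good overlap",
    not a recommended threshold.
    """
    common = overlap(ab, ac)
    if common is None:
        return None
    common = overlap(common, bc)
    if common is None or common.end - common.start < min_bp:
        return None
    return common


# Made-up positions for three testers A, B and C on chromosome 7.
ab = Segment(7, 10_000_000, 30_000_000)
ac = Segment(7, 15_000_000, 40_000_000)
bc = Segment(7, 12_000_000, 28_000_000)
region = triangulated_region(ab, ac, bc)
```

A region that passes this check only says the three pairwise matches line up on the same stretch of the chromosome. Deciding which ancestral couple it came from still depends on the trees.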

At AncestryDNA the lack of segment data makes it impossible for us to even try to decide which lines, out of maybe a couple possibilities, we might match on. You don't have to be a scientist to map segments.
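To show how little machinery segment mapping needs, here is a minimal sketch. It assumes you already have a list of segments that you have attributed to particular ancestral lines from earlier confirmed matches. The labels, positions, and the `candidate_lines` function are hypothetical and made up for illustration.

```python
def candidate_lines(mapped, chromosome, start, end):
    """Return the ancestral-line labels whose mapped segments overlap a new match.

    mapped is a list of (label, chromosome, start, end) tuples,
    with positions in base pairs.
    """
    hits = []
    for label, chrom, m_start, m_end in mapped:
        # Two ranges overlap when the later start falls before the earlier end.
        if chrom == chromosome and max(start, m_start) < min(end, m_end):
            hits.append(label)
    return hits


# Hypothetical map built from earlier, confirmed matches.
mapped = [
    ("Line A", 3, 1_000_000, 20_000_000),
    ("Line B", 3, 25_000_000, 60_000_000),
    ("Line C", 5, 1_000_000, 20_000_000),
]

# A new match on chromosome 3 that spans parts of two mapped segments.
lines = candidate_lines(mapped, 3, 15_000_000, 30_000_000)
```

Because each of us carries a maternal and a paternal copy of every chromosome, an overlap like this only narrows down the candidates. It does not settle the line by itself.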

I think most of the experts in the genetic genealogy field would agree with this statement from Ancestry: "The longer the stretches of evidence for identical haplotypes the more evidence there is that the identity is due to a recent common ancestor."

From Ancestry: "Our test set contains over 150 genotyped samples from a large family with a well-researched pedigree containing about 2,500 relationships that vary from 1 meiosis to 15 meioses. In order to estimate recall, we must know whether a given pair of individuals has IBD." Ancestry compared over 150 samples to find IBD, so we can find IBD by comparison too. I wonder if they had Colonial American ancestors. I realize they only used this test set for finding IBD in close relatives, but this would seem to suggest that mapping would help in establishing a relationship with more distant relatives.

We need the studies carried out by Ancestry and the other companies to guide us as to whether our matches have a good chance of sharing ancestors within the past 6 generations. We need good statistics regarding the likelihood that a segment is IBD. Personally, if a segment has a good chance of being IBD and I share a set of common ancestors with a match, I have little doubt where the segment came from.

Reading Ancestry's white paper on matching did give me some pause for thought...
"third cousins, in fact, are only about 98% likely to have any IBD." Only? That's a pretty high percentage to me. That statement is a little troubling. I hope the rest of the reasoning behind their analysis is better.

Given the chance, I think non-scientists, provided with segment IBD probabilities and good trees, can make valid connections with matches.


Yaniv Erlich said...

Hi Annette,

The math for deciding whether a segment is recent (less than ~7 generations) or ancient is quite complex and requires advanced stats. It is not something that can be done with pen and paper.

DNA.Land's (https://dna.land) relative matching algorithm can calculate these stats and classify segments as recent or ancient (see: https://dna.land/relative-finder-info). The website is free, not-for-profit, and accepts DNA from all three major companies. You can upload your data.

Kalani said...

Although I have the genealogies of the people of Pitcairn beginning from the 12 founders, after comparing a couple of descendants to an actual Pitcairn resident, it was just impossible to map out the segments. More testers would be needed, not to mention relatives of the founding (European) people. It is impossible to get relatives of the women, since they were Tahitian women who may or may not have been closely related. And all 3 of the people I compared, as admixed as they may be, especially the two non-Pitcairn residents, are ALL coming up as matches to other Polynesians.


Annette said...

I've uploaded raw data to DNA.Land. It would be wonderful if more people would do the same.
It sounds like autosomal DNA testing isn't very useful for endogamous island populations.