Saturday, May 24, 2014

DNA News of the Week: Are those Cousin Predictions Accurate?

I registered for this.

23andMe held its first G+ hangout this week; see the video here. It was an overview of the types of tests they perform on your sample and how to navigate their site. I manage my cousin's account, and I now have a better understanding of how to navigate the website. CeCe Moore is a genetic genealogist and is very knowledgeable on the subject; most of the hangout was devoted to her site overview. The Q & A at the end raised some interesting questions, and her answers were very enlightening. The answer to the question about segment size and cM totals, for use in deciding which matches deserve further examination, sparked my interest. I have thousands of matches to review, and that list needs pruning if I want to work through it in my remaining lifetime. I tested with FTDNA in the summer of 2012. Before I read all of the instructions on interpreting results, I took the predictions literally. After identifying ancestors out past the five-generation cutoff, I actually read the instructions and found out that remote 5th cousins can be related to you through an ancestor who lived somewhere around 500 years ago.

According to CeCe's answer about how to pick out your most promising matches, you should look for matches who share multiple segments. She also said that large single segments can be very old; she called them sticky segments. I have a number of matches who share a single large segment with me, and we can't find any connection. I assume these are sticky segments from hundreds of years ago. I went into my results and looked for someone who shared multiple segments. I found one with 6 shared segments. I took CeCe's advice on tailoring an email and sent that person a query. I noticed that the match had Tennessee ancestors, so I brought that up in the email. We'll see if I get an answer.
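For anyone who exports their match list, the "multiple segments first" idea could be sketched roughly like this. This is only my own illustration, not anything CeCe or the testing companies provide: the `Match` class, the `prioritize` function, the sample numbers, and the 7 cM minimum segment size are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Match:
    name: str                 # hypothetical label, not a real person
    segments_cm: List[float]  # length in cM of each shared segment


def prioritize(matches, min_segment_cm=7.0):
    """Sort matches so those sharing more segments come first.

    Segments shorter than min_segment_cm (an assumed threshold) are ignored.
    Ties on segment count are broken by the larger total cM.
    """
    def key(match):
        kept = [cm for cm in match.segments_cm if cm >= min_segment_cm]
        return (len(kept), sum(kept))

    return sorted(matches, key=key, reverse=True)


# Made-up sample data for illustration only.
matches = [
    Match("A", [28.0]),                               # one large single segment
    Match("B", [9.5, 8.0, 12.1, 7.4, 10.0, 7.9]),     # six smaller segments
    Match("C", [15.2, 11.0]),                         # two segments
]

ranked = prioritize(matches)
for m in ranked:
    print(m.name, len(m.segments_cm), "segments")
```

With this ordering, a match like "B" with six segments is reviewed before a single large segment like "A", which may be one of the old sticky segments.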

I'm new to AncestryDNA and their match confidence predictions. Their predictions are also sometimes quite far off the actual relationship. They try to limit the number of IBS (identical by state, meaning matching by chance rather than by descent) segments by phasing results. This process attempts to identify which segments come from your mother and which from your father. Long compound segments may actually be made up of smaller segments from your mother and father. Phasing is supposed to identify these segments and break them down to their true size. This phasing process doesn't always produce accurate predictions. I believe they should do the phasing but also let us see where the segments are, just in case the phasing didn't produce an accurate result. I found a very low confidence match with whom I share 28 cM. I believe she may be related on my mother's line, so she may not be as low confidence as it appeared after phasing. I've noticed that very low confidence matches can share long segments of DNA, or small segments down to around 10 cM. Phasing isn't perfect, so some good matches can be rated very low.

28 cMs very low confidence match AncestryDNA
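To picture how a long compound segment can shrink once it is phased, here is a toy sketch. It is not AncestryDNA's actual algorithm, which they don't show us; the `split_by_parent` function and the maternal/paternal labels and lengths are hypothetical, and I simply assume each piece of the apparent segment has already been assigned to a parent.

```python
from itertools import groupby


def split_by_parent(pieces):
    """Merge adjacent pieces that come from the same parent.

    pieces: list of (parent, length_cm) tuples in chromosome order.
    Returns one (parent, total_cm) tuple per run of same-parent pieces.
    """
    return [
        (parent, sum(cm for _, cm in run))
        for parent, run in groupby(pieces, key=lambda piece: piece[0])
    ]


# Made-up example: an apparent unphased segment of 28 cM.
apparent = [
    ("maternal", 6.0),
    ("maternal", 5.0),
    ("paternal", 9.0),
    ("maternal", 8.0),
]

phased = split_by_parent(apparent)
apparent_total = sum(cm for _, cm in apparent)
longest_true = max(cm for _, cm in phased)

print("apparent:", apparent_total, "cM")
print("phased:", phased)
print("longest true segment:", longest_true, "cM")
```

In this made-up case the 28 cM stretch breaks into pieces no longer than 11 cM. If the parent labels are wrong, a real segment gets broken up too, which is why seeing the segment locations would help.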

Another issue brought up at the 23andMe G+ hangout was the low response rate of 23andMe matches. Less than half will respond to queries. That is a difficult problem to resolve. I was thinking maybe they need to exclude people who don't share any information from viewing the trees. If you don't share any information at all, maybe you shouldn't be able to see other people's information? Or maybe they could offer an incentive to customers: if you share your surname or family tree, you get to use some cool feature, perhaps some sort of chart function or Gedmatch-type utility. I am not going to test with 23andMe until more information is given about matches.

I'm looking forward to the Southern California Genealogy Jamboree live stream this year, which takes place June 5. Glad to see a full day is being devoted to DNA. I got my all-day viewing pass for the DNA live stream from Jamboree; you can register here.


Jennifer Zinck said...

Many DNA users are adopted or have an unknown father. They do not have the ability to share a tree like those of us who are privileged to know a great deal about our ancestry. Most of them do not want to say outright that they are adopted because, honestly, most people who can't get something for themselves don't want to give anything. This post helps support their view. Additionally, some people test large family groups and do not know the family history of individuals who have tested on behalf of a specific project or to answer a very specific research question. For many who I have tested, I am interested in just one line of their heritage. While I try to be polite and provide what I can, I have many test kits for which I couldn't even provide so much as the other parent's name because I do not collect all of that information for every single study participant.

Annette Kapple said...

Thanks for bringing that up Jennifer! I didn't consider the fact that some adoptees wouldn't want to share that fact. Maybe insisting everyone post a user name would be enough?

Albert Colbert said...

Annette, in regard to AncestryDNA, I have 10 matches confirmed: 1 is listed by Ancestry as "Moderate", 2 as "Low" and 7 as "Very Low". Ironically, of the 95-100% confidence matches, I have never found connections. That tells me that there is some bias in the system that needs to be calibrated, because it is apparently flagging most as "Very Low", when those appear to be the most promising.

As for 23andMe, I think that since they created the Public Match, they need to, at a minimum, make the shared segment data available without a share request. There's no information in that data that would be compromising, especially if limited to just the shared segment. Aggregating that could help active users triangulate on matches that actually do share data.

Annette said...

Thanks Albert! Yes, 23andMe can provide more information without compromising customer privacy. I think they need to do it if they want to attract more customers without the health features.

I'm also finding more connections at the lower confidence level at Ancestry. Hopefully they will recalibrate their system at some point.