Tuesday, May 10, 2016

To Phase Or Not To Phase? Plus, She's Back?

My "To Quack or Not to Quack" souvenir from my recent vacation


That is the question...

Is the phasing and filtering AncestryDNA does worth the extra processing? I've wavered about this for years. It sounds like a great idea. On a theoretical basis it is. In practice not so much. The phasing AncestryDNA does attempts to use haplotypes to separate the DNA we inherit from each parent. The results are also filtered in order to remove matches who share population segments. When I listened to an Ancestry representative explain the haplotype method she did say there was an error rate with the phasing. Some haplotypes haven't been encountered before. Removing population segments, with filtering, isn't helpful for me since these segments at least tell me which ethnic group a segment comes from.

A recent update to the AncestryDNA product has many discussing the merits of this company's approach to matching. An issue I hadn't noticed was brought up at the Facebook ISOGG group. Before the recent AncestryDNA update, parents and children were said to share up to 90 segments of DNA. According to the other companies, around 30 segments are shared. This vast discrepancy is due to the fact that Ancestry's phasing and filtering chops up segments. The recent update has brought the number of shared segments down to the 50s. That is still many more than the other companies report.

I've posted about the fact that during AncestryDNA's previous update, when Timber was introduced, a 3rd cousin went missing. I checked to see if she was returned immediately after the update finished. I didn't see her listed as a 3rd or 4th cousin so I assumed she wasn't returned with this update. Going through hints yesterday I found her. She has returned. She is listed as a 5th to 8th cousin now. What does this tell me? First of all the prediction is off. Secondly it tells me the AncestryDNA product is still in a state of flux and who knows what will happen with future updates? Apparently someone can match you today, and may be removed in the future, just to possibly be returned somewhere down the line?

I didn't find that any close matches were removed this time. I did find several distant cousin matches had been removed. It's possible these matches do match DNA wise? AncestryDNA states it is possible good matches were removed.

I would rather see AncestryDNA do away with the phasing and filtering. Shuffling matches in and out of our lists doesn't make any sense. It's just confusing. Does the phasing and filtering improve match results? Not in my case. Predictions at the 3rd cousin range and beyond are impossible to get exactly right. I'm not even sure if phasing and filtering helps improve predictions for closer cousins? What it can do is remove good matches.

One thing is certain, every time AncestryDNA updates results they get publicity. As someone once said "there's no such thing as bad publicity."

Wednesday, April 6, 2016

Recap Of Sunday's SL Discussion: Working With Segments

We've been having DNA discussions in the virtual world of Second Life (just consult the "Genealogists In Second Life" page at Facebook for more details). Our next discussion will be on Sunday May 1, 2016.
Here is a recap of our last meeting Sunday April 3, 2016.
This slide shows a view from the 23andme chromosome browser

  • The browser shows our 23 pairs of chromosomes. One row from our Mother the other from our Father
  • It shows where my Mom, a match, and I share DNA.
  • Mom shares from side to side on each chromosome (the purple lines).
  • Match is represented by the red segments
  • Does she match my Mom?
  • We would check either my Mom's or my match's match list and see if they match
  • We would then check the chromosome browser to verify they match on the same chromosomes in the same place
  • Another view (above) of a chromosome browser shows the 22 numbered chromosome pairs, plus the X
  • I've listed some alleles (i.e., A's, G's, C's, T's) which are used to determine who matches 
  • One line across each chromosome always represents a parent. The companies can't tell which alleles come from which parent. Only testing parents can work this out, in order to list the alleles separately and correctly for each parent. Without that it's up to software to figure this out
  • AncestryDNA phases results without parents. This often produces good results, according to them. They acknowledge a small error rate
  • The same positions are tested for each parent
  • Because of that, segments from two or more matches may look like they match each other in the chromosome browser. You must know whether each match is on the paternal or maternal side to know whether overlapping segments represent a real match
  • I tested my mother so I can narrow down the possibilities
  • I borrowed someone else's grandparents to demonstrate how having grandparents' segments is useful in establishing IBD segments and finding a more precise relationship with matches.
  • Here you can see the possible 3rd cousin match also has the same matching segments as she had with my mother. She matches my maternal grandmother on the same segments
  • As it turns out she is also Nicaraguan. Just like my grandmother
  • These segments are certainly IBD

  • You can see (in the slide above) that the other set of grandparents don't share the same red segments. This confirms there is no match on the paternal side
  • This view shows the segments that my grandparents gave me through my father. The side to side shares from parents are divided up by segments they got from their parents. Siblings receive different assortments of DNA from grandparents.
  • Grandparents DNA is further segmented with DNA from their parents
  • This chain of inherited DNA continues back in time, until we no longer share DNA with certain ancestors
  • Notice here how the paternal grandparents' segments fit together like puzzle pieces (above)

  • If you don't have living grandparents you can recreate the segments your grandparents gave to you by testing and comparing segments with 2nd cousins
  • There is also a way to test siblings to find start and end points for grandparents' shares

  • Downloading segment data from Family Tree DNA, 23andMe, and GEDmatch allows you to compare segments from matches from all three places
  • You can upload and compare the segments at third party sites such as Kitty's Chromosome Mapper. Or you can download the free app Genome Mate Pro (good idea to donate too). This software will allow you to store and compare your segment match data.

  • Triangulation is useful for all testers. The definition of triangulation is having a matching or overlapping segment with two matches or more
  • It's especially useful if you don't have close relatives who have tested. In such cases a good triangulation can establish a segment as identical by descent, or not a false match
  • Triangulated shared segments, or overlaps, should overlap by at least 7 cM's
  • When triangulating without establishing segments as identical by descent, using close relatives, it's best to use segments in the IBD range. According to Dr. Tim Janzen 15 cM and 2500 SNP segments are more likely to be IBD
  • In September of 2015 Vee got in contact with me through 23andMe's messaging system. She asked me to look over her list of names, and said she had a tree on Ancestry. I determined the most likely connection was through the surname Grenier.  I have French Canadian Ancestors with that surname. Her ancestors did not come to the US through Quebec, however. Instead they came from France to New York in the 1700's
  • Last week I began using Genome Mate Pro. This app allows you to look at shared segments on each chromosome from various sources. I noticed Vee, my Paternal Aunt, a Paternal First Cousin, and a third cousin on the paternal side all matched on chromosome 6 in the same place. I found 2 others who also matched
  • I had to dig a little to find trees for the two additional matches. When I compared everyone, except my 3rd cousin, and closer relatives, I found they all shared many of the same surnames, and all had Southern roots 
  • It also dawned on me that the names shared by these 3 common matches were also the surnames associated with all my NAD's (New Ancestor Discoveries)
  • Looking through the NAD's again I believe our common connection has something to do with the surname Douglas. I had first thought Troxell was the common surname, but one match in the NAD's doesn't share that surname. Still trying to work out our connection because I don't have Douglas on my tree?
  • All of these matches share 25 to 32 cM segments, and around 8000 SNPs, on chromosome 6. This makes it nearly certain the shared segments are identical by descent
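The size rule of thumb from the bullets above (15 cM and 2500 SNPs, attributed to Dr. Tim Janzen) can be written down as a quick check. This is only a minimal sketch in Python. The function name, and the idea of treating the two numbers as hard cutoffs, are my assumptions for illustration and not anything a testing company publishes.

```python
def in_likely_ibd_range(length_cm: float, snp_count: int,
                        min_cm: float = 15.0, min_snps: int = 2500) -> bool:
    """Return True when a segment meets both size thresholds.

    The defaults are the 15 cM / 2500 SNP rule of thumb from the discussion.
    Treating them as hard cutoffs is a simplifying assumption.
    """
    return length_cm >= min_cm and snp_count >= min_snps


# The chromosome 6 matches share 25 to 32 cM and around 8000 SNPs.
print(in_likely_ibd_range(25.0, 8000))   # True
# A made-up small segment for contrast.
print(in_likely_ibd_range(7.0, 700))     # False
```

A segment that passes this check is only more likely to be IBD. It still needs the tree comparison described in the rest of the recap.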
Someone at the discussion asked "Once you have verified your segment matches, did you then establish documentation through public records etc..?" Yes. You need to also compare documented trees to see how you might be related to a match. Unfortunately it's difficult to document trees at Family Tree DNA. Or maybe it's just not as straightforward? You can add stories and notes to your Family Tree DNA tree, which many of us, including me, haven't done.
We also discussed the fact that it's difficult to get AncestryDNA matches to respond to messages, let alone upload to GEDmatch.
Another problem with AncestryDNA that we talked about is the lack of specific segment data, which hampers our ability to make the correct DNA connection with our matches. Matches can match through more than one couple; so segment mapping would help to determine which couple the DNA likely came from.
Another problem someone brought up was the need to meet the genealogy proof standard, regarding the lack of segment data at AncestryDNA. Scholarly genealogy journal articles, which refer to DNA testing, include exact segment data. Without the exact data and comparisons your proof argument won't hold up to scrutiny.  
Next meeting we will discuss GEDmatch.



Tuesday, March 8, 2016

Do We Need To Be Scientists To Use Our DNA Results?

No, I don't think we need to be scientists to work with our segment data. We don't have access to the actual data needed to scientifically analyze our own results, as far as coming up with IBD statistics, anyway. We can look at population genetics, but population geneticists haven't evaluated the results from the new large genetic genealogy databases. They can't tell us the likelihood that the 15 cM segment we share with a match is from our common 4x great-grandparents.

As genetic genealogy testing consumers we must rely on the data analysis the companies are providing us with. Companies like AncestryDNA are putting out statistics like these:

To me a 99% probability is good? This would imply that my Aunt's 5th cousin match, sharing a 20 cM and a 15.7 cM segment, more than likely shares IBD segments? The 99% probability means they likely share an ancestor within the past 6 generations. If we take away 10 cM's for phasing, Timber, etc., we are still in the 99% probability range.

Here I've crossed out lines that are not Colonial American in my Aunt Loretta's tree. Most of these ancestors came to the US at the turn of the 20th Century. The only line that would correspond with this match would be through Mary Owens born 1852. Her ancestors were all Colonial Americans. Aunt Loretta's 5th cousin match has mostly Colonial Virginian ancestry. I haven't traced any lines except one back to Colonial Virginia. I have my Aunt's lines back 6 generations, and more, except in Ireland. I feel confident after examining the tree of our match, and my Aunt's tree, that any matching segments more likely than not match to our Colonial American ancestors.

If we find a triangulation on one of these segments, with good overlap, and that happened to be a 5th cousin match, I don't understand why that would be suspect either, as some would say. They would agree with this statement by Ancestry: "In populations that have grown rapidly in the past 200 to 300 years, individuals are more likely to be related to each other through two or more ancestral couples. Such population growth may also lead to marriages between related individuals, such as second cousins. As in the founder effect, this also leads to an excess of DNA sharing, but due to multiple common ancestors living one to two hundred years ago, rather than thousands of years ago." The triangulation naysayers would ask how you can tell where such a segment comes from if you might share several ancestral lines. Maybe you can use segment mapping or build out your tree as far as possible? Some are also saying that segments shared between those with many Colonial American lines are the result of endogamy. Read Ancestry's statement again. They are saying they aren't seeing the founder effect?

At AncestryDNA the lack of segment data makes it impossible for us to even try to decide which lines, out of maybe a couple possibilities, we might match on. You don't have to be a scientist to map segments.

I think most of the experts in the genetic genealogy field would agree with this statement from Ancestry: "The longer the stretches of evidence for identical haplotypes the more evidence there is that the identity is due to a recent common ancestor."

From Ancestry: "Our test set contains over 150 genotyped samples from a large family with a well-researched pedigree containing about 2,500 relationships that vary from 1 meiosis to 15 meioses. In order to estimate recall, we must know whether a given pair of individuals has IBD." Ancestry compared 150 samples to find IBD, so we can find IBD by comparison. Wonder if they had Colonial American ancestors? I realize they only used this test set for finding IBD in close relatives, but this would seem to suggest mapping would help in establishing a relationship with more distant relatives.

We need the studies carried out by Ancestry and the other companies to guide us as to whether our matches have a good chance of sharing ancestors within the past 6 generations. We need good statistics regarding the likelihood a segment is IBD. Personally if a segment has a good chance of being IBD and I share a set of common ancestors with a match I have little doubt where the segment came from.

Reading Ancestry's white paper for matches did give me some pause for thought...
"third cousins, in fact, are only about 98% likely to have any IBD." Only? That's a pretty high percentage to me? That statement is a little troubling. I hope the rest of the reasoning behind their analysis is better?

Given the chance I think non-scientists, provided with segment IBD probabilities and good trees, can make valid connections with matches.

Friday, February 26, 2016

Comparing Match cMs At Different Sites

After a discussion at ISOGG Facebook I decided to compare the data from matches who have results in multiple places, including AncestryDNA, Family Tree DNA, 23andMe, and GEDmatch. I copied all my mother's match names from these sites. I then sorted the names alphabetically. I found it was impossible to compare with AncestryDNA testers because most do not use their first and last names, and it would be too time consuming to pick out those who do. So I decided to do a more scaled down comparison using known cousins who have results in multiple places.
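For anyone who wants to try the name comparison with a script instead of sorting by hand, here is a minimal sketch in Python. It assumes you have already copied each site's match names into a plain list. The function names and the sample names are made up for illustration, and it will only find people who use the same first and last name everywhere, which is exactly the limitation I ran into with AncestryDNA.

```python
def normalize_name(name: str) -> str:
    """Lowercase a name, drop periods, and collapse extra spaces."""
    return " ".join(name.lower().replace(".", " ").split())


def testers_on_every_site(*site_lists: list[str]) -> list[str]:
    """Return the normalized names that appear in every site's match list."""
    if not site_lists:
        return []
    name_sets = [{normalize_name(n) for n in names} for names in site_lists]
    return sorted(set.intersection(*name_sets))


# Made-up match names, not real testers.
ftdna = ["Mary A. Smith", "John Doe"]
twenty_three = ["mary a smith", "Jane Roe"]
gedmatch = ["Mary A Smith ", "John Doe"]

print(testers_on_every_site(ftdna, twenty_three, gedmatch))  # ['mary a smith']
```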

My results demonstrate that segment cM's are generally close to the same when comparing at Family Tree DNA, 23andMe, and GEDmatch. I did find a case where a segment's cM's were 10 cM's apart between Family Tree DNA and GEDmatch. SNP totals at GEDmatch are often lower. Now I know to turn down the SNP threshold when comparing at GEDmatch. I'll use 500 SNPs now.

Since AncestryDNA doesn't share their segment information I couldn't compare using segment totals. Instead I compared with cM totals. I didn't use segments under 7 cM's in the Family Tree DNA totals. It looks like GEDmatch always has the highest total cMs. Ancestry always has the lowest. The average difference between AncestryDNA and the other sites is 17 cM's. AncestryDNA phases and filters matches' raw results, which is the reason for the differences in total cMs.

Most of these matches are predicted in about the same cousin range at Ancestry and the other sites. The problem can be seen in my first chart. 23andMe, Family Tree DNA, and GEDmatch all show the person in chart one line 1 as a match. This person did test at Ancestry but isn't a match with my mother there, even though she is a confirmed 4th cousin. I hadn't noticed until putting this together. I'm noticing more matches at the other sites who don't match at Ancestry. I have at least 5 confirmed cousins who did match at Ancestry, but don't now, likely because of Timber. I'm not seeing this when looking at matches elsewhere. I'm sure some don't match at Family Tree DNA, but match elsewhere because of the 20 cM requirement. I have not encountered that because 1 cM segments are included.

Someone said if the results are different between sites what difference does it make? Ok, if each company has slightly different ranges but comes up with the same matches then there isn't any problem. If confirmed cousin matches are being lost then I believe the companies should be rethinking their testing and matching procedures. Third cousins, and more distant cousins, are the ones affected by unreliable matching techniques. If a match shares only one segment they are more likely to disappear as a match with additional processing.

Putting this together I have found more difficulties working with AncestryDNA than the other sites.
  1. Ancestry doesn't allow you to download matches or their cM numbers (I used the Chrome extension. Doesn't include cMs). 23andMe and Family Tree DNA allow you to download spreadsheets.
  2. Ancestry should encourage testers to provide full names if they want to participate in sharing with other testers. I understand why some may not want to use their real names. They should use a consistent pseudonym, and use it everywhere, if they want to collaborate.
  3. It would be nice if we could filter matches by total cMs.  
  4. It would be nice if we could search by username.

Tuesday, February 23, 2016

Triangulation Example

Some ISOGG group members at Facebook have been wanting to see examples of triangulation at the 6th cousin level. My Melvin family segment triangulations would be closest to this cousin range. One match is a 5th cousin 1x removed, which is pretty much equivalent. This triangulation is with descendants of  John Melvin b. abt. 1776, Maryland and Mary Redden b. abt. 1777 Maryland. The Melvin segment matches are as follows (see chart above, which includes my Aunt, myself, and two other distant cousins):
  1. The light blue segment, on chromosome 1, represents my Aunt on my Paternal side. She shares this 22.1 cM segment with a 3rd cousin. This match is a descendant of our common ancestors John Melvin and Mary Redden.
  2. The light pink segment, on Chromosome 1, of the same size is my segment match with the same person as my Aunt. This is a 3rd cousin 1 x removed to me.
  3. The smaller dark pink segment sandwiched between the ones described above belongs to another John Melvin and Mary Redden match. This 14.2 cM segment is also shared by my Aunt and myself. This match is a 5th cousin to my Aunt, and a 5th cousin 1x removed to me.
  4. The green segment is where the 3rd cousin range match, to my aunt and myself, matches our 5th cousin range match. These 5th cousins share a slightly larger segment which is 18.6 cMs. You'll notice it extends past the segments my aunt and I share.
Elijah Hicks and John Melvin sign marriage bond
Both of our Melvin matches have good trees. Our 5th cousin range match has all lines going out at least 6 generations. Looking at other possible lines which may also be the source of these segments I don't see any other matching ancestors.

My proof of relationship to this Melvin family is through US Census research, a Bible record, and the Elijah Hicks and Nancy Melvin marriage record.

Examining whether these segments are likely IBD, it would seem that they are in that cM range. I checked to see if my mother shares the same segment on chromosome 1 with all of these matches. No, she doesn't match. You can see here my comparison between my paternal aunt and my mother. All of our Melvin matches matched between 165,698,481 and 180,598,459 on chromosome 1:

My Mom and my Paternal Aunt's shares in the same place as a 3rd cousin and 5th Melvin cousin match
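Checking that several matches fall in the same place on a chromosome comes down to comparing start and end positions. Here is a minimal sketch in Python of that comparison. The first segment uses the chromosome 1 range given above; the other positions are made up for illustration. It works in base-pair positions only and does not convert the overlap to cM, which would need a genetic map.

```python
from typing import Optional


def common_region(segments: list[tuple[int, int]]) -> Optional[tuple[int, int]]:
    """Return the (start, end) stretch shared by every segment, or None.

    Each segment is a (start, end) pair of base-pair positions
    on the same chromosome.
    """
    if not segments:
        return None
    start = max(s for s, _ in segments)
    end = min(e for _, e in segments)
    return (start, end) if start < end else None


melvin_range = (165_698_481, 180_598_459)     # range quoted in the post
made_up_match = (170_000_000, 185_000_000)    # hypothetical second match
made_up_elsewhere = (10_000_000, 20_000_000)  # hypothetical non-overlapping match

print(common_region([melvin_range, made_up_match]))      # (170000000, 180598459)
print(common_region([melvin_range, made_up_elsewhere]))  # None
```

An overlap in position is only the first step. As the comparison with my mother shows, you still have to confirm the matches are all on the same parent's side.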

Looking a little more for possible places where our ancestors may have crossed paths I made this chart. Are we all from the same ethnic background? Could these be population segments? My paternal aunt and I have a fairly unique ethnic makeup. My 3x great-grandparents William Owens b. 1820 and Nancy Hick b. 27 Oct 1831 and their ancestors make up our only Colonial American line.

Here you see our lines are Austro-Hungarian, French Canadian, Colonial American, and Irish Catholic. When looking at the places of origin for the Melvin matches of my aunt and me, we find both have quite a bit of Colonial Ancestry. I don't see any other shared ancestors between either of the other two testers. We all have Colonial Ancestry, but no other shared ancestors. My Colonial line on the paternal side is very small. Neither tester has French Canadian roots like my Aunt and I. Neither has Burgenland, Austria ancestry, as my Aunt and I do. They don't have Irish Catholic roots either. Our 3rd cousin match has a large Scandinavian line, which none of the rest of us have.

I think it's more likely than not that this Melvin Family triangulation is a good triangulation.

Some of those who would discredit triangulation would say, well it could just be a coincidence that we all match in the same place on chromosome 1. They would also say it's nearly impossible to share segments with cousins in that range. Chances of matching at all at that range are minuscule. It would be like being struck by lightning to triangulate at that cousin distance, so they would say. What are the chances we would all match in the same place and share the same ancestors? Wouldn't that be as unusual if you are sticking with statistical probability? I have a feeling we have a long way to go before we even really know what the statistical probabilities are? We aren't able to do enough comparisons, or look at enough possible triangulations to get an idea of how likely or unlikely they are to occur. A company is holding a huge amount of our genome information, but they aren't sharing it with customers. They will sell genomes for medical research though.

PS This company now has no chromosome browser in 29 countries!

Wednesday, February 17, 2016

DNA On Fire AncestryDNA 4th Quarter 2015 Report

The Fourth Quarter, and 2015 full year report, at AncestryDNA emphasized the importance of the DNA product. This product has resulted in an increase in Ancestry subscriptions, which are Ancestry's core product. The 1 million new testers in 2015 helped increase subscriptions from 2,115,000 in the year ending December 2014 to 2,264,000 in the year ending December 2015, an increase of 149,000. The subscribers who came to Ancestry through the DNA product are more engaged, and tend to subscribe to more expensive packages. They also tend to renew their subscriptions, according to Ancestry.

AncestryDNA now has 1.5 million testers in their database. The reason we are not seeing more tools, like a chromosome browser, is because sales are "on fire" according to one Ancestry official. Black Friday 2015 sales were up 200% over last year. AncestryDNA has a lower profit margin than subscriptions, so as long as sales are brisk we won't be seeing new tools which would cost money to add.

The Ancestry executives were also asked if the new medical focus has resulted in more hacking attempts. One executive said he didn't want to divulge that information. Interestingly, at that point in the conference call the line suddenly went dead. I thought, were they hacked lol?

One executive said a show Ancestry is sponsoring will likely increase DNA sales. Long Lost Family, which will premiere its second season on TLC March 6, 2016, will be sponsored by Ancestry. Sounds like it's based on a British show.

The sentiment regarding the DNA product's 2015 sales, and the current 2016 sales, has led these executives to forecast continued fiery sales of the DNA kits in 2016.

Monday, February 15, 2016

DNA: Not The Endogamy Of Cousins

We've been discussing endogamy in the ISOGG group. I took a poll at ISOGG a few weeks ago and learned that first cousin marriage was not uncommon in earlier times. This is true of certain groups. People living in isolated places with few prospective mates, for instance. I have not found that kind of cousin marriage in my family. Most of my lines go back 6-8 generations. My ancestors sometimes married outside of their ethnic group, and also married mates who came in different waves of immigration from the old world.

When I say not the endogamy of cousins I mean for most of us we aren't seeing any close cousin marriage out to 6 or 8 generations. We do see the effect of the smaller early American population. When my American ancestors came to America they initially settled in Pennsylvania and Maryland. They migrated from there to either the Midwest or South. Small populations in these areas, and common migration patterns, could mean my matches' ancestors may have crossed paths with my ancestors more than once. I have found in a couple of instances that I could be related to a match through two different couples. This is a potential pitfall if you haven't carefully compared your tree with a match. This isn't the endogamy of cousin marriage, because it doesn't represent close cousins marrying; it's just that your ancestors crossed paths more than once when the population was smaller.

This is where mapping out chromosomes helps. Using the segments of matches, and your immediate family, going out to the 3rd cousin range you can begin naming your segments for family lines. I will never be able to do this for certain ethnic lines due to the lack of surviving records in the home country, so in that case I just name the segments according to ethnicity. Filling in the chromosome chart with named segments helps to identify matches whose segments overlap with confirmed family.
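A chromosome map like this is really just a list of named segments that you look things up in. Here is a minimal sketch in Python, assuming each mapped segment is stored with its chromosome, start, end, and a family line or ethnicity label. All of the positions and labels below are made up for illustration, and the class and function names are my own.

```python
from typing import NamedTuple


class MappedSegment(NamedTuple):
    chromosome: str
    start: int   # base-pair start position
    end: int     # base-pair end position
    label: str   # family line, or ethnicity when the line is unknown


def labels_overlapping(chrom_map: list[MappedSegment], chromosome: str,
                       start: int, end: int) -> list[str]:
    """Return the labels of mapped segments that overlap a new match's segment."""
    return [seg.label for seg in chrom_map
            if seg.chromosome == chromosome and seg.start < end and start < seg.end]


# Hypothetical map entries.
my_map = [
    MappedSegment("1", 10_000_000, 30_000_000, "Line A (confirmed cousin)"),
    MappedSegment("1", 25_000_000, 40_000_000, "Line B (confirmed cousin)"),
    MappedSegment("6", 5_000_000, 20_000_000, "Nicaraguan (ethnicity only)"),
]

# A new match on chromosome 1 that touches both mapped lines.
print(labels_overlapping(my_map, "1", 28_000_000, 35_000_000))
# A new match in an unmapped stretch of chromosome 6.
print(labels_overlapping(my_map, "6", 30_000_000, 40_000_000))
```

When the lookup returns two labels, that is the "two different couples" situation, and the tree comparison has to settle which line the match belongs to. This sketch also ignores which parent's side a segment is on, which a real map has to track.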

We are able to collect these segments at Family Tree DNA and 23andMe. Family Tree DNA makes it easy by allowing us to see the segment information for all matches. At 23andMe you generally have to ask to share. 23andMe now has opt-in sharing, which is working out for me better than expected.

At AncestryDNA there is no way to see the chromosome information. This creates a problem considering our ancestors may have crossed paths more than once. Without the possibility of mapping how do we know which of two, or more, couples we may have gotten our shared DNA from? AncestryDNA also has more of the segments we need to create such a map. I have closer matches there than at the other two companies. Their data could help me, and all of us, a great deal.

Ancestry believes in DNA mapping because they recreated the genome of David Speegle using this technique. Some say we have segments going back to endogamy; if not more recently from America then going back to the old countries. These suppositions didn't seem to affect Ancestry's genome recreation?

Many will tell us to compare at GEDmatch. Few of my matches have agreed to compare there. The process is confusing for those who aren't computer savvy. Others worry about the privacy of the site. The best solution, which would produce the most compliance, would be an opt-in segment sharing system, like 23andMe.

Another problem with AncestryDNA is the problem-plagued messaging system. If we don't hear from our matches it could be they aren't interested in sharing, or they didn't get the message at all?

Here is what Mapping can do for you:
  1. A well filled in map can help identify the ancestral route of a segment. This helps even when a match isn't cooperative.
  2. It can help to identify which of two or more couples a segment came from.
  3. It can help you eliminate IBS segments. You might find segments your parents don't match on.
  4. If you're using smaller segments as proof mapping can help confirm them.
  5. Ethnicity Chromosome Chart
  6. Chromosome matching segment maps can be compared to ethnicity chromosome maps to confirm ancestry. If you are 100% European the ethnicity chromosome chart won't help.
Without mapping we are hamstrung in certain situations.

This couple (below) now has 59 members in their circles. This could greatly help with chromosome mapping. Maybe Ancestry will sell us their genomes at some point?

Tuesday, February 2, 2016

Wife Of 3RD Cousin 5X Removed In An Ancestry Circle?

I misunderstood the Circles at AncestryDNA. I had thought they were reserved for direct line ancestors. Apparently they can include aunts, uncles, cousins and their spouses; if they are in your tree. I just found a Circle for a several times great-aunt. When I click on her Circle I'm listed as a potential descendant. This can cause confusion if you don't read all the descriptions carefully. I'm not included in the Circle though.

To me it would make more sense to include non-direct-line ancestors in NAD's. The Circles should form for the strongest links. If they are extended beyond that to the wife of a 3rd Cousin 5x removed, for example, then we are getting into some very weak associations. Couldn't Ancestry just exclude certain relationships from Circles?

Friday, January 15, 2016

Reconstructing My Grandfather And Great-Grandfather's Genomes

Fred Mason's sons Edwin and Frank
I am trying to reconstruct and color in the chromosome charts for my Maternal Grandfather Charles Lynn Forgey and Paternal Great-Grandfather Fred Augustus Mason. Since Charles Forgey's wife was Nicaraguan it's easy to separate out DNA that belongs to him. I'm only using identified segments to reconstruct his genome. Segments associated with Fred Mason are easy to pick out because his wife was Irish, and he was French Canadian on his father's side. He had early American roots on his Mother's side.

I've thrown out more requests to compare results to close matches relating to these men at AncestryDNA. No answers yet. If these cousins would compare it would certainly help fill in my charts.

Don't have a picture of Fred Mason
He died in 1917 these are his children
My Grandmother far left
I'm using an Aunt's results and some cousins' results to fill in my Great-Grandfather Fred Augustus Mason's chart. I can see some X shares between an Aunt and a 2nd cousin 1x removed. I also share an 18 cM segment on the X with the same cousin. This would go back to the Owens line because Fred's father would not have passed his X along to him. I'm not even attempting to name French Canadian segments due to endogamy.

According to Ancestry those with the triangulation on chromosome 12 only share about 7 cM's; according to GEDmatch they share 14 cM's. They match both places. The 7 cM difference is common between GEDmatch and AncestryDNA. As I said before I'm only using the segments of paper trail cousins.
This is where I am with my Grandfather Charles Lynn Forgey's chart. I am using cousins and my Mother's results to fill this in. If the 70 cM and over matches at AncestryDNA would compare I could definitely make great progress on this chart.

Saturday, January 9, 2016

DNA: Triangulation, Pileups & Endogamy

There has been some debate at the ISOGG group about whether triangulation is possible beyond 4 generations. For triangulation to work the segments we are comparing would need to go back to common ancestors within the genealogical time frame. The DNA testing companies estimate the regions of the genome they are comparing contain uncommon SNP's. They estimate when you share segments with matches that generally the relationship isn't much farther back than 6 or 7 generations. If this is the case then triangulation is possible if both you and your match have a tree that is fairly complete to 6 generations. Even if it isn't complete you can reasonably draw inferences about what the rest of the tree might look like. If someone is half Italian and you haven't found any Italian ancestors you can easily eliminate half that person's tree. In other cases, in the US for example, you can reasonably eliminate possible ancestor matches based on the region of the country they were from.

Some are questioning the age of the SNP's we inherit. Are they 200 or 300 years old or are they ancient, 500 years old or older? We generally share very little DNA with ancestors who lived 200 years ago. It's hard to believe that we would continue to share SNP's from 500 years ago. If we do, it sounds like it would be a very small number and the amount of DNA would be very small, and would not be considered a match by the testing companies.
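
To put rough numbers on this, here is a minimal sketch. It assumes an average generation of 25 years (my own round figure, not a measured one) and the textbook rule of thumb that the expected autosomal share from one ancestor halves with each generation. Real inheritance is lumpier than this because of recombination, so the actual share from any one distant ancestor varies and can be zero.

```python
# Rough sketch: expected share of autosomal DNA from a single ancestor.
# Assumption: the share halves each generation (an average only; real shares vary).
# Assumption: GENERATION_YEARS is a round illustrative figure, not a measured value.

GENERATION_YEARS = 25


def generations_back(years_ago, generation_years=GENERATION_YEARS):
    """Approximate number of generations between me and an ancestor."""
    return round(years_ago / generation_years)


def expected_share(generations):
    """Expected fraction of my autosomal DNA from one ancestor that many generations back."""
    return 0.5 ** generations


for years in (200, 300, 500):
    g = generations_back(years)
    print(f"{years} years ago: about {g} generations, expected share {expected_share(g):.6%}")
```

Under these assumptions 200 years works out to about 8 generations and an expected share of roughly 0.4% per ancestor, while 500 years is about 20 generations and roughly one millionth.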

Some cite endogamy as the reason 500 year old and older SNP's persist. There is a high degree of interrelatedness among those of us who have Colonial American ancestors. Americans whose families remained in the same eastern seaboard areas since Colonial times tend to have problems with endogamy when they DNA test, although those living in urban eastern seaboard areas tend to be more ethnically mixed as waves of immigrants settled these areas. The amount of interrelatedness among Americans varies. Even if someone's ancestors lived in the same rural area for hundreds of years it doesn't mean they are highly genetically related to their neighbors. You might also see more recent immigrant groups, like the Italians, coming in and adding to the gene pool in rural areas. Many Scandinavians settled in the Midwest adding their own genes to the mix. Many of us on the West Coast of the US have Hispanic or Asian genes. This dilutes our Colonial American gene percentage.

Most of my Colonial American ancestors were Scot-Irish and German. I can pinpoint exactly when they came to the US in the 1700's. As for some of the others, is it possible some of these lines go back to the first settlement of Jamestown? Could I be mistaken and some of the segments I've named actually go back to another ancestor who settled early in Jamestown? Or even go back to an ancestor back in England? I would think the odds are low considering the odds of actually still having a measurable amount of DNA, from that far back, which would be enough to signal a match.

So why do we have so many matches piling up on one segment? It would sound like these are old SNP's that many people inherited and are common to certain ethnic populations? Or maybe there are other reasons? Most all of my matches at AncestryDNA are from the same family group, Roller/Zirkle/Roush. These families tended to marry close cousins because they lived in the isolated Shenandoah Valley, and I'm sure there were language, religious and cultural differences. This endogamy means that their descendants potentially have retained more of their DNA. My family never stayed in the same area for more than a generation or two. They didn't marry close cousins. Since we have inherited small amounts of DNA from our German ethnicity Roller/Zirkle/Roush families we tend to match this family group more than any other group. We tend to get a match with one of those families once a week. We have 5 DNA Circles for these families. The likely reason for this is that our matches have ancestors who married cousins in this family group. Often I will see, for instance, Zirkle and Roush on their tree a couple times at the very least. These are our ancestors from around 250 years ago. We match so often because many of these families lived in the Shenandoah Valley for generations, and continue to live there, so these genes continue to cycle through the population. They have more DNA from these ancestors to potentially match with.

Another reason for pileups is large numbers of descendants. In America families tended to be large before urbanization. The survival of children into adulthood tended to be higher than in Europe. American couples living in 19th Century America have large numbers of descendants living today.

America was settled during the genealogical time frame so this should mean that triangulation is possible. All of these facts I mentioned mean you need to build your tree out as far as possible, and compare with as many cousins as possible. The odds of sharing the same segments with the descendants of the same ancestors may not be statistically high. Considering the number of descendants some ancestors left I think it is statistically possible. The major problem I have is the lack of records dating back to the 1600's in the Mid-Atlantic and Southern States. Otherwise I believe triangulation is useful and accurate if other parties have reasonably complete trees. Odds are reasonably good the segments don't go back to the 1600's. Plus, in my case, only 31% of my ancestry goes back to Colonial America. Much of that ancestry is already traced back to the immigrants.

Could large proportions of the early population of America have shared recent ancestors because they came in a mass migration? I believe the early population of the Mid-Atlantic states and South was more varied? New England may have had a more homogeneous population coming from the same stock in England.

I'm a believer in Triangulation. The more testers we have the more opportunities we will have to make connections through Triangulation.

Without triangulation DNA testing will be useless for Americans with a high degree of interrelatedness. How will they separate their lines?

Monday, January 4, 2016

Trip To Nicaragua And DNA Cousin Match

Mombacho Volcano as seen from Granada

I was in Nicaragua from December 7th to the 12th sightseeing, and researching at the archives in Granada, Nicaragua. It was a fabulous trip! I loved it there. Beautiful scenery, lush and green. Exotic animals, such as the loud howler monkeys I heard while touring a volcano. Warm weather. It was in the 90's during the day and the 70's at night. Beautiful Colonial adobe architecture in Granada. I stayed one night in Managua and 4 nights in Granada. My mother, Edna Forgey-Kapple, was born in Granada, Nicaragua to a Nicaraguan mother and a US Marine father.

I had very little information about the Nicaraguan side of my family. The only info I had came from my grandmother Graciela Del Castillo's death certificate, some information about the siblings of my grandmother, and a will she made which named a cousin. The will didn't give the degree of cousin he was. I matched a great-granddaughter of this cousin, Francisco Alvarado Granizo, at AncestryDNA. Until the recent addition of the total cM numbers at AncestryDNA I didn't know how much DNA we shared with this cousin, because this cousin has not uploaded to GEDmatch. I share 24.7 cM's on 1 segment, and my mother shares 20 cM's on 1 segment. This shouldn't be, since I can't have inherited more of a segment than my mother carries. I think this reflects the problems with AncestryDNA's Timber filter. I don't place that much confidence in the cM numbers, which tend to be on average 7 cM's different than everyone else due to the Timber filter and phasing. According to AncestryDNA we are 4th to 6th cousins of Francisco Alvarado Granizo's Great-Granddaughter. I didn't know of any surnames shared in common. No Alvarados or Granizos that I knew of. But my family history for my Nicaraguan family only went back to my great-grandparents and their children, and their children's spouses.

I had no idea that my first day in Granada, Nicaragua was a National Holiday in Nicaragua. It's called La Purisima. It's the feast day of the Immaculate Conception. I guess I'm not that good a Catholic because I had no idea. I couldn't do any research that day due to the fact the archives were closed for the holiday. I had a great day anyway though. I went on a Colonial Homes tour and attended part of the Immaculate Conception feast mass, which was followed by a several blocks long procession with the Statue of the Virgin which included music from a band. I agree with a YouTube comment "Mary is Nicaragua and Nicaragua is Mary."

I had heard these celebrations can lead to week long closures of government offices. I lucked out and the Municipal Archives opened the day after the Holiday. I was thrilled. It was very hot in the Archives room which didn't have any air conditioning. I melted. There is definitely some of my DNA on the records at the Archives because perspiration was dripping. They had double doors open which did bring in a breeze. The tropical plants outside the door looked nice, and during my breaks I looked out at them. I was also serenaded by lovely piano and violin music from the next room. I recognized Yankee Doodle being played at one point. The Archives is located in a public cultural center. Ballet Folklorico was also danced outside in the courtyard. My Grandmother definitely danced there also, because this center was a theater when she lived in Granada.

Nicasio's signature and personal flourish or rubrica

My extremely limited Spanish vocabulary meant communication with the archives staff was difficult. I printed my family tree and showed that to them. This did help a great deal. I knew they had a couple Censuses for Granada from the 19th Century. I was able to explain I wanted to look at these. I had no luck with the first Census I looked at which was falling apart and missing pages. One of the archives staff members found my family on the 1882 Census for Granada. I had no idea that wives were listed with their maiden names. Like French Canadians, Nicaraguan women retained their maiden names. I was so excited when I found out my great-great grandmother's maiden name was Granizo. Now we have a common surname with the Great-Granddaughter of Francisco Alvarado Granizo. Based on this our relationship to Francisco Alvarado Granizo could be 2nd cousin 1x removed for my Mom, and a 2nd cousin 2x removed for me. Based on the shared DNA with his great-granddaughter this could be the case. If I'm calculating correctly his great-granddaughter could be a 3rd cousin 1 x removed to my mother. The 20 cM share would fit with this relationship range, with 3rd cousins 1 x removed sharing from about 11 cM's to about 100 cM's. I still have several brickwalls on my Nicaraguan line so this relationship is one possibility. Still I'm thrilled to finally have a common surname with this DNA match.
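
Working out cousin degrees by hand is easy to get wrong, so here is a small sketch of the usual counting rule: count the generations from each person up to the shared ancestor couple, take the smaller count minus one as the cousin degree, and the difference between the counts as the number of times removed. The function names are my own, and the generation counts in the example are placeholders, not my confirmed lines.

```python
# Sketch: name a cousin relationship from generation counts.
# gens_a / gens_b = generations from each person up to the shared ancestor couple
# (1 = parents, 2 = grandparents, 3 = great-grandparents, ...).
# Assumption: both people descend from the same ancestor couple (full, not half, cousins).


def ordinal(n):
    """Return 1 -> '1st', 2 -> '2nd', 3 -> '3rd', 4 -> '4th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def cousin_relationship(gens_a, gens_b):
    """Describe the cousin relationship between two descendants of one couple."""
    if min(gens_a, gens_b) < 2:
        raise ValueError("closer than cousins (sibling or aunt/uncle line)")
    degree = min(gens_a, gens_b) - 1   # smaller count sets the cousin degree
    removed = abs(gens_a - gens_b)     # generation gap sets "times removed"
    if removed == 0:
        return f"{ordinal(degree)} cousin"
    return f"{ordinal(degree)} cousin {removed}x removed"


# Placeholder counts for illustration only.
print(cousin_relationship(3, 4))
print(cousin_relationship(4, 5))
```

Each extra generation on either side changes the answer, so it helps to write the counts down before comparing them with the cM ranges.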

I was also able to solve a mystery regarding my grandmother's father. Someone named the wrong Nicasio Del Castillo as her father. I was thinking that Nicasio, who was President in 1856, would have been way too old to have been her father. That was a correct assumption. From the 1882 Census I found out that there was a younger Nicasio Del Castillo who was only 16 in 1882. The correct age range to have been my grandmother's father. His father was Francisco Del Castillo. According to a niece of my grandmother the Nicasio who was her grandfather, and my Grandmother's father, was the son of a Francisco. The 1882, 16 year old, Nicasio's father was Francisco. Francisco was an attorney. My great-grandfather Nicasio was also an attorney. I'm so glad my mother told me her grandfather was an attorney because this profession seems to have been passed down in the family. According to other documents I've found Nicasio, the President, was the father of Francisco and the grandfather of my Nicasio Del Castillo Granizo. The elder Nicasio is listed next to Francisco on the 1882 Census and was 66 years old then. According to other documents he probably died in 1884.

My entire trip was a success. I was able to add 3 new ancestors to my family tree and another surname. My Nicaraguan line tree still looks sparse, but is quite good by Nicaraguan standards. Due to record losses family trees are generally short. I'm hoping to return to Nicaragua in the near future with a Y DNA kit. Hope I can find a male Del Castillo to take the test. Y testing could take my Del Castillo tree back to 1600's Seville, Spain.

The Director of the Arts Center, Dieter Stadler, who is Austrian, asked me if I came to Nicaragua solely to research in the Archives. Would I travel over 3,000 miles just to look at a couple of Censuses? Probably... I also wanted to see the place where my Mom was born. Visit the church she was baptized in.

I'm praying for Nicaragua, as my mother did. Whenever there was a disaster my Mom would say it hurt her because that was her country. Now I feel like it's my country too. Before my Mom passed away last August I told her I planned to visit the place where she was born. It's a beautiful country with friendly beautiful people. Tourism is helping this very poor country. I'm hoping to see continued progress when I return.

I have a PDF and paperback copy of the catalog

Wednesday, December 2, 2015

23andMe Shared Matches A Week Later

Still early as far as the introduction of the new open sharing feature at 23andMe. So far 7 people are open sharing on my mother's match list, and 10 are open sharing with me. I'm hoping the open sharing numbers grow. Anyone interested in using 23andMe for genealogy should agree to open sharing. You do need to check a box in order to participate. You are not automatically included in open sharing. (See this blog to get instructions for participating in open sharing "How To Opt In")
I'm hoping more people agree to open sharing? The wording for the opt in wouldn't encourage many people to share openly.
"By selecting open sharing, it is possible there is the risk that other DNA Relatives or other users will be able to identify certain information about you, including specific genetic variants related to health."
I'm not confident that many people will agree after reading this disclosure?
Before the changes I had access to the exact location of over 1000 shared segments. Half of those contained a match name, and the other half were anonymous. The anonymous matches could also be helpful since they listed the origins of all grandparents. Many of the segment matches were substantial in size. Of those either open sharing, or just sharing with us, most share smaller amounts of DNA. I do like the fact 23andMe provides a chromosome browser. The lack of trees and cooperation of matches makes 23andMe more difficult to glean useful information from.

Below is the now eliminated Countries of Origin information. You can see one match shares a 71.6 cM segment and another a 63.4 cM segment etc. This was very useful information because many of these matches did not agree to accept my sharing invitation. Sadly this information is no longer available. I still think adoptees should test with 23andMe, since you can get some pretty close matches. The more distant matches I'm looking for are more difficult to confirm now. I wouldn't recommend this test to those looking for cousins past 2nd. The cost is too high for the limited information you're likely to get.

The fact AncestryDNA now provides some segment and total cM information does make this product more useful. Today I found a match on my Lambert line. When I looked at the segment information and total cM's I discovered I shared 7.6 cM's on two segments. This doesn't look like a very promising match. Since Ancestry is using the Timber filter further comparison at GEDmatch is needed to see if we actually share more DNA. I would recommend testing with AncestryDNA, but comparison at GEDmatch is often needed to confirm matches.

23andMe can make their test more useful for genealogists. Providing a good tree function at their site would be a step in the right direction. Right now AncestryDNA is the best place to test.

Monday, November 23, 2015

AncestryDNA takes a few steps forward/ 23andMe steps backward and A Trip to Nicaragua

AncestryDNA now provides some cM information

Important steps forward for AncestryDNA. First Ancestry introduced shared matches, then a couple weeks ago they began allowing us to see exactly how much DNA we share and how many segments we share. Valuable information to have in order to evaluate matches and make connections. The DNA information isn't easy to find unless you do some exploration of links on your matches' pages. This information is shown when you click the "i" next to the confidence level. I've been able to guess at some possible relationships using the shared match feature. Seeing the basis for matches, by looking at the shared DNA and number of shared segments, has allowed me to evaluate the quality of my matches.

I was quite disappointed when a 3rd cousin was predicted to be a 4th to 6th cousin a couple of weeks ago. I feel this is a bad call. According to Ancestry this person shares 50 cMs with me, which is in line with a 3rd cousin relationship. Glad I was able to see the shared cM's so I could dismiss the AncestryDNA prediction (sounds like someone at ISOGG on Facebook has a match sharing 6 cM's on 2 segments???). A second cousin's results came in a week ago and his relationship prediction was accurate. Looking at other matches I see that on average Ancestry is 7 cM's different than Family Tree DNA and GEDmatch. They can occasionally be as many as 20 cM's off. I think AncestryDNA should dump the Timber filter and use a more accurate filter. Sounds like more accurate filters process more slowly and are more costly? I would still like to see a chromosome browser. I'll lift my grade for AncestryDNA to a B. I would give it an A if they would provide a chromosome browser.

23andMe is taking steps backward with their genetic genealogy product. The FDA is allowing them to provide health related results again. The health product was the primary focus for 23andMe, and will be again. They are completely revamping their DNA product. The very useful "Countries Of Origin" tool is now gone. Without this tool 23andMe is far less useful because most matches won't agree to share genomes. The price has increased from $99 to $199. I wouldn't recommend this test for that price. Without "Countries of Origin" you are unlikely to get very much information from matches. The health results aren't generally useful unless you have a clearly defined genetically inherited disease risk. Lowering my grade for 23andMe to C- overall. They do get an A for their ethnicity product, which is virtually the same.

If you'd like to read more about the changes at 23andMe you can read this more in depth explanation at Kitty Cooper's Blog. I noticed I have double the number of matches (about 1800)  I had before, but most are anonymous. Also some of the physical characteristics reports were far off. My mother was predicted to have dark eyes. Her eyes were light hazel. My eyes are dark which is correct.

Trip to Nicaragua:
I plan on leaving for a genealogy research trip to Granada, Nicaragua on December 7 (if all goes according to plan).

I have done some preliminary research. I've exchanged emails with an archive employee. He said that a staff member has found some information about my family. I have also learned about what is available at the archives from a distant cousin Alan (who is a DNA match). He has made a number of research trips to Nicaragua. He provided me with an index to the archive holdings.

My primary research location will be "Archivo de la Prefectura de la Municipalidad de Granada, Macario Álvarez", which contains 1,653 bundles of documents. This archive contains important genealogy sources such as Censuses and Military records. Another good source for Nicaragua was explained to me by my distant cousin Alan i.e. "recursos de habilitación are one of the more obviously genealogical series, they are coming of age documents usually the offspring of well to do families, in which they state that they are of legal age to enter into the administration of their patrimony and are x years of age, and their parents are x & x.  I have not used this collection very much but it is specially useful for Granada families."

I hope to find more about our cousin Francisco Alvarado Granizo, and more about my Great-Grandparents Nicasio Del Castillo and Elena Garcia. According to my Aunt Grace, the informant on my Grandmother Graciela's death certificate, her parents were Nicasio Del Castillo and Elena Garcia. My Mother knew her grandmother was Elena. She didn't know her maiden name, or her grandfather's name. I believe Aunt Grace was a good informant because she worked as a secretary for many years and was very organized when it came to keeping documents. My mother said her grandfather was a lawyer, which seems to suggest a relationship with Nicasio Del Castillo who left 28 years worth of legal books at the Granada Archives, which dated from 1857 to 1884. This Nicasio would seem to be too old to be my grandmother's father? Since the legal profession tended to be a family profession the elder Nicasio may have been my grandmother's grandfather? My grandmother was born in 1893.

The death certificate for my grandmother Graciela Del Castillo is the only document I have naming my great-grandparents.

A few years ago I exchanged emails with a distant Del Castillo cousin. He was living in Central America at the time. He provided me with the names of the siblings of my grandmother Graciela.  I found out her brother Alberto was entombed in a Mausoleum in Granada. I will try to locate his tomb.  I was able to find the exact relationship of the cousin pictured right with the help of this Central American man who did some research for me.

Most of Granada's government records from 1856 and before were destroyed in that year, due to an American, William Walker, taking over the presidency of Nicaragua, and the violence of that takeover. I'm hoping to search an 1859 Census and an 1882 Census. Since I need more information regarding the identity of my Great-Grandparents the fact that earlier records are missing will not affect my initial research. In order to trace my family farther back marriage records called "expediente matrimonial" will need to have survived at the Catholic Cathedral diocese archives.

It will be interesting to see where my Grandparents and mother lived. My Grandfather Charles Forgey was born in Indiana. He ran away from home at age 17, joined the Marines in 1916 and was sent to Nicaragua. He married my Nicaragua native grandmother Graciela Del Castillo in 1919. My mother Edna was born in 1921. The family left Nicaragua in 1925 and settled in California. I'm a little apprehensive about traveling to a "Third World" country. I've gotten hepatitis and typhoid vaccinations in preparation. Hoping all goes smoothly?