The data privacy threat lurking on your family tree

If you keep your genealogy research online, you’ve probably got a data problem.  But unlike banking sites or Facebook, simply changing your password or deleting your account won’t solve the issue.

That’s because the world’s largest for-profit genealogy services (the vast majority of sites exist to make money) are using your family-tree and DNA data to fuel their expansion.  And while there’s growing awareness of ethical issues surrounding data privacy and ownership, don’t look for any of the major services to change their practices anytime soon.

Genealogy sites have been drawing a lot of news coverage lately about their data.  

When genealogical DNA service MyHeritage disclosed last week that cyber thieves had stolen data on 92 million users, it generated a flood of news coverage.  Much of the focus was on privacy concerns, including mention of how another site, GEDMatch, recently shared user DNA data with law enforcement officials to help solve a cold-case murder.  MyHeritage claimed no user DNA data was taken in the theft, and says it has remedied the situation with added log-in security.

While coverage of data privacy and ownership issues with genealogy services has mainly focused on the prospects for shenanigans with stolen or misused DNA data, the biggest issue facing amateur genealogists today isn’t who’s stealing your data.  It’s who actually owns (and profits from) your DNA data and family tree research.

Ancestry is a brilliant business, or a devious scam, depending on how you feel about data ownership and privacy.  The company has built a lucrative business combining historic public domain records (over which it somehow manages to claim copyright) with a user-friendly web-based family tree creation app.  All of this is bundled as a subscription offering for $19.99/month. Want international, military, or archival news? That will cost more, and can quickly double the monthly subscription price.

But what makes the Ancestry model ingenious is the way it entices paying subscribers to do much of the heavy lifting that makes the service valuable.  By creating and sharing family trees, subscriber research becomes a springboard for other users to build their own trees, enabling Ancestry to benefit from the network effect.  Unlike social media platforms such as Facebook or Twitter, however, in the Ancestry world everyone pays. And under Ancestry’s terms of use, the data you add to the site, such as family photos or documents, stays with Ancestry to use and profit from as it chooses (it can even sublicense your data to others).  Want to end your subscription? Ancestry retains the rights to your data.

Like Facebook, which has had its own issues with data privacy and ownership, the businesses behind these sites have been loath to act in any way that might hinder their ability to make money. They’ve bulked up on a healthy mix of rationalization, denial, and legal language to protect their freedom to use your research as they see fit.

To get a sense of how large the issue is, take a look at the privacy statements posted on any of the major genealogical services.  All purport to be written in plain language, although when you start to parse what’s really being said, you might want to bring in a lawyer to help.  

You’ll also need some extra time on your hands if you actually want to read them.  MyHeritage tops the list with over 9,200 words, followed closely by 23andMe with 8,700 (plus a quick summary section of 927 words). Ancestry would seem almost reasonable at 4,400 words, until you realize there’s a separate terms and conditions page with an additional 4,600 words.  By comparison, Google, the world’s most popular site and the Big Kahuna of data collection, is a model of brevity, with a 4,000-word summary that also includes links for deeper dives on specific topics (e.g. sensor data from your devices).

What these sites have to say about your data privacy amounts to something you probably already know:  You have none. At least not beyond what’s spelled out under the letter of the law where you live. And even with the rise of new privacy regulations, such as the EU’s General Data Protection Regulation (GDPR), the language remains intentionally ambiguous, such as this statement that appears on both MyHeritage and Ancestry:

We may also process your Personal Information on the basis of our legitimate interests, including in providing and improving the Services.

Who gets to determine what constitutes ‘improving the services’? The services, of course.

With the rapid growth of DNA testing, genealogical research is undergoing its most radical transformation since the dawn of the Internet. As the alternative paths of traditional and genetic genealogy research evolve our definition of family, the privacy questions surrounding who owns your data and your relationships with others (past, present and future) are sure to grow.  While larger sites like Facebook and Google will continue to get the bulk of attention and scrutiny, the for-profit genealogical services will inevitably be forced to evolve as well.

2 thoughts on “The data privacy threat lurking on your family tree”

  1. Sir,
    If you’re the author of the piece on LTG Thomas J.H. Trapnell, I’d like VERY much to speak/email you.

    Very Respectfully,
    Andrew Provorse
    Colonel, Special Forces
    US Army (Retired)

  2. Hi Hugh,

    I was hoping to get in touch with you, since it sounds like you are a descendant of Henry Lerner. He had a sister named Miriam who married Jack Schwartzman, and I’m a descendant of theirs. I’m at [email protected].


