23andMe says private user data is up for sale after being scraped

The 23andMe logo displayed on a smartphone screen.

Genetic profiling service 23andMe has commenced an investigation after private user data was scraped off its website.

Friday’s confirmation comes five days after an unknown entity took to an online crime forum to advertise the sale of private information for millions of 23andMe users. The forum posts claimed that the stolen data included origin estimation, phenotype, health information, photos, and identification data. The posts claimed that 23andMe’s CEO was aware the company had been “hacked” two months earlier and never revealed the incident. In a statement emailed after this post went live, a 23andMe representative said “nothing they have posted publicly indicates they actually have any ‘health information.’ These are all unsubstantiated claims at this point.”

23andMe officials on Friday confirmed that private data for some of its users is, in fact, up for sale. The cause of the leak, the officials said, is data scraping, a technique that essentially reassembles large amounts of data by systematically extracting smaller amounts of information available to individual users of a service. Attackers gained unauthorized access to the individual 23andMe accounts, all of which had been configured by the user to opt in to a DNA relative feature that allows them to find potential relatives.

In a statement, the officials wrote:

We do not have any indication at this time that there has been a data security incident within our systems. Rather, the preliminary results of this investigation suggest that the login credentials used in these access attempts may have been gathered by a threat actor from data leaked during incidents involving other online platforms where users have recycled login credentials.

We believe that the threat actor may have then, in violation of our terms of service, accessed 23andme.com accounts without authorization and obtained information from those accounts. We are taking this issue seriously and will continue our investigation to confirm these preliminary results.
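The recycled-credential problem the statement describes can be illustrated with a short defensive sketch: a service compares a password against hashes of passwords known to have leaked elsewhere. This is a minimal illustration under stated assumptions, not a description of anything 23andMe runs. The function names and the sample leaked passwords are hypothetical, and a production system would use a much larger breach corpus and a privacy-preserving lookup.

```python
import hashlib
from typing import Iterable, Set


def sha1_hex(password: str) -> str:
    """Return the uppercase SHA-1 hex digest of a password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def build_breach_index(leaked_passwords: Iterable[str]) -> Set[str]:
    """Hash every leaked password so the plaintext list need not be kept."""
    return {sha1_hex(p) for p in leaked_passwords}


def is_recycled(password: str, breach_index: Set[str]) -> bool:
    """True if the password's hash appears in the breach index."""
    return sha1_hex(password) in breach_index


if __name__ == "__main__":
    # Hypothetical sample of passwords leaked from other platforms.
    index = build_breach_index(["hunter2", "letmein", "password123"])
    for candidate in ["hunter2", "a long unique passphrase"]:
        status = "recycled" if is_recycled(candidate, index) else "not found"
        print(f"{candidate!r}: {status}")
```

A check like this only flags passwords already present in the index; it cannot tell whether a password was reused on a platform whose breach is not yet public.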

The DNA relative feature allows users who opt in to view basic profile information of others who also allow their profiles to be visible to DNA Relative participants, a spokesperson said. If the DNA of one opting-in user matches another, each gets to access the other’s ancestry information.
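A toy model helps show why a relatively small number of compromised accounts can expose a much larger set of profiles when an opt-in sharing feature is involved. The sketch below is illustrative only: the account names and the visibility map are invented, and it assumes each compromised account can read the basic profiles of the opted-in accounts linked to it, as the spokesperson described.

```python
from typing import Dict, Set

# Hypothetical toy data: each opted-in account maps to the set of
# opted-in accounts whose basic profile it can view.
VISIBLE_PROFILES: Dict[str, Set[str]] = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "erin"},
    "carol": {"alice", "frank", "grace"},
    "dave": {"alice"},
}


def exposed_profiles(compromised: Set[str],
                     visible: Dict[str, Set[str]]) -> Set[str]:
    """Return every profile readable from the compromised accounts."""
    exposed = set(compromised)  # the compromised accounts themselves
    for account in compromised:
        # Add every profile this account is allowed to view.
        exposed |= visible.get(account, set())
    return exposed


if __name__ == "__main__":
    hit = {"alice", "carol"}
    reach = exposed_profiles(hit, VISIBLE_PROFILES)
    print(f"{len(hit)} compromised accounts expose {len(reach)} profiles")
```

In this toy data, two compromised accounts expose six profiles. Real relative lists can run far longer, which is how scraping a limited set of accounts can yield data on a much larger population.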

The crime forum post claimed the attackers obtained “13M pieces of data.” 23andMe officials have provided no details about the leaked information available online, the number of users it belongs to, or where it’s being made available. On Friday, The Record and Bleeping Computer reported that one leaked database contained information for 1 million users of Ashkenazi heritage, all of whom had opted in to the DNA relative service. The Record said a second database included 300,000 users of Chinese heritage who also had opted in.

The data included profile and account ID numbers, display names, gender, birth year, maternal and paternal haplogroups, ancestral heritage results, and data on whether or not each user has opted in to 23andMe’s health data. Some of this data is included only when users choose to share it.

The Record also reported that the 23andMe website allows people who know the profile ID of a user to view that user’s profile photo, name, birth year, and location. The 23andMe representative said that “anyone with a 23andMe account who has opted into DNA Relatives can view basic profile information of any other account who has also explicitly opted into making their profile visible to other DNA Relative participants.”

By now, it has become clear that storing genetic information online carries risks. In 2018, MyHeritage revealed that email addresses and hashed passwords for more than 92 million users had been stolen through a breach of its network that occurred seven months earlier.

That same year, law enforcement officials in California said they used a different genealogy site to track down a long-sought suspect in a string of grisly murders that occurred 40 years earlier. Investigators matched DNA left at a crime scene with the suspect’s DNA. The suspect had never submitted a sample to the service, which is known as GEDMatch. Instead, the match was made with a GEDMatch user related to the suspect.

While there are benefits to storing genetic information online so people can trace their heritage and track down relatives, there are clear privacy threats. Even if a user chooses a strong password and uses two-factor authentication as 23andMe has long urged, their data can still be swept up in scraping incidents like the one recently confirmed. The only sure way to protect it from online theft is to not store it there in the first place.

This post has been updated to include details 23andMe provided.

https://arstechnica.com/?p=1974265