At-home ancestry testing: A microcosm of the state of diversity in science

Genetics. Genomic sequencing. DNA. While the technical side of genetic testing may appear daunting to most people, the personal touch of at-home testing kits has made companies like 23andMe millions of dollars. The kits are marketed as a novel, individual adventure and as gifts for birthdays, the holiday season, and Father’s and Mother’s Day. The aspects of identity and family origins included in these services further personalize the products, and the results can open consumers’ eyes to heritage that they might never have known about otherwise. Because results can carry significant personal meaning and cultural implications for customers, they should be relatively accurate, right?

For proper context, we must first understand the basics of 23andMe’s genetic testing methods. After obtaining a sample, the service focuses on the customer’s single-nucleotide polymorphisms, or SNPs. SNPs are single-base differences in DNA that arise from copying errors and are inherited from one generation to the next, which makes them useful markers for tracing genetic events in human populations. Genetic testing services detect and compare SNPs using a microarray chip, a tool covered with probe sequences that are engineered to be complementary to specific SNPs. When a sample is exposed to the microarray, DNA fragments carrying the SNPs of interest bind to the matching probes and can then be visualized. As of 2017, the company compares SNP patterns against the NCBI Human Reference Genome and uses its v5 chip, which is based on Illumina’s Global Screening Array and tests for around 650,000 individual SNPs. The upgrade to this chip improved ethnicity testing because of its expanded capacity, its greater accuracy when analyzing non-European samples, and the ease with which its data can be used across multiple platforms and testing companies.
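
To make the idea of comparing SNP patterns concrete, here is a minimal sketch in Python. It is not 23andMe’s actual algorithm: the SNP IDs, the two reference populations, and the allele frequencies are all made up for illustration, and the scoring assumes that SNPs are independent and follow Hardy-Weinberg proportions. It simply asks which hypothetical reference population makes a customer’s genotype most probable.

```python
import math

# Hypothetical data: frequency of allele "A" at three made-up SNPs
# in two made-up reference populations. Illustrative numbers only.
REFERENCE_FREQS = {
    "pop_x": {"rs_demo1": 0.90, "rs_demo2": 0.20, "rs_demo3": 0.70},
    "pop_y": {"rs_demo1": 0.30, "rs_demo2": 0.80, "rs_demo3": 0.40},
}


def genotype_log_likelihood(genotype, freqs):
    """Log-probability of a genotype given per-SNP frequencies of allele "A".

    `genotype` maps SNP id -> number of "A" copies (0, 1, or 2).
    Assumes independent SNPs in Hardy-Weinberg proportions.
    """
    total = 0.0
    for snp, copies in genotype.items():
        p = freqs[snp]
        # Binomial probability of `copies` "A" alleles across 2 chromosomes
        prob = math.comb(2, copies) * p**copies * (1 - p) ** (2 - copies)
        total += math.log(prob)
    return total


def best_matching_population(genotype, reference=REFERENCE_FREQS):
    """Return the best-scoring population name and all log-likelihood scores."""
    scores = {
        pop: genotype_log_likelihood(genotype, freqs)
        for pop, freqs in reference.items()
    }
    return max(scores, key=scores.get), scores


# A hypothetical customer: two "A" copies at the first SNP, none at the
# second, one at the third.
customer = {"rs_demo1": 2, "rs_demo2": 0, "rs_demo3": 1}
best, scores = best_matching_population(customer)
print(best, scores)
```

A real service works with hundreds of thousands of SNPs and far more sophisticated statistics, but the dependence on reference data is the same: a customer can only be matched to populations that are already represented in the reference panel.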

Despite the chip upgrade, certain inherent limitations of 23andMe’s testing methods leave it short on diversity. The 45 reference populations that customers’ ethnicity results are currently sorted into are weighted toward Europe, with 15 populations, a full third, found under the European umbrella. Europe also accounts for more than a quarter of 23andMe’s 150 recent ancestor locations, which give users insight into where their ancestors might have lived in the past two centuries, with 40 falling within the categories of “Northwestern European,” “Southern European,” “Eastern European,” “Ashkenazi Jewish,” and “Broadly European.” The other global regions, “Central and South Asian,” “East Asian and Native American,” “Sub-Saharan African,” “Western Asian and North African,” and “Melanesian,” together make up the remaining two-thirds of the system’s categories. Consequently, the level of precision and detail for data sets that originate from these areas is much lower than for samples from Europe. For example, European customers’ results have the potential to pinpoint their ancestry down to a country or town of origin, while the Native American subgroup spans all of North, Central, and South America.
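
The proportions behind these figures can be checked with a few lines of Python. The counts are the ones quoted in this article; the variable and function names are only illustrative.

```python
# Counts as quoted in the article
reference_populations_total = 45
reference_populations_europe = 15
recent_locations_total = 150
recent_locations_europe = 40


def share(part, whole):
    """Return part/whole as a fraction between 0 and 1."""
    return part / whole


european_population_share = share(reference_populations_europe, reference_populations_total)
european_location_share = share(recent_locations_europe, recent_locations_total)

print(f"European share of reference populations: {european_population_share:.1%}")  # 33.3%
print(f"European share of recent ancestor locations: {european_location_share:.1%}")  # 26.7%
```

One continent therefore holds a third of the reference populations and roughly 27 percent of the recent ancestor locations, while every other region of the world shares what is left.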

What explains such a large discrepancy? It mostly comes down to the availability and amount of reference data per population. With more data sets from one general geographic region of origin, 23andMe can define more subgroups by cross-referencing new samples with the sequences it already has. This positive feedback loop is evident in the European categories and points to the reason other populations lack similar depth: a relatively homogeneous, white pool of genetic testing participants. While 23andMe and similar services can claim technical accuracy at the level of broad ethnicity, the industry has work to do regarding precision, detail, and diversity. At-home testing kits may sound like a novelty, but this field’s location at the intersection of genetics, data privacy, bioethics, and representation is far from inconsequential.
