Last week, Craig Venter and colleagues published a paper claiming to predict people’s physical traits from their DNA.

Using genome sequence data, the approach described in the study correctly identified an individual out of a group of ten people randomly selected from the Human Longevity, Inc. (HLI) database 74% of the time. The authors say the findings suggest that law-enforcement agencies, scientists and others who handle human genomes should protect the data carefully to prevent people from being identified by their DNA alone.

After sequencing the whole genomes of 1,061 people of varying ages and ethnic backgrounds, the researchers used an artificial intelligence approach to find small differences in DNA sequences, called single-nucleotide polymorphisms (SNPs), associated with facial features such as cheekbone height.

However, according to Scientific American, geneticists who have since reviewed the paper believe the claim is vastly exaggerated. Mark Shriver, an anthropologist at Pennsylvania State University in University Park, explained, “I don’t think this paper raises those risks, because they haven’t demonstrated any ability to individuate this person from DNA.”

In a randomly selected group of ten people, especially one chosen from a data set as small and diverse as HLI’s, knowing age, sex and race alone rules out most of the individuals, he explained.

To demonstrate this, computational biologist Yaniv Erlich of Columbia University in New York City looked at the age, sex and ethnicity data from HLI’s paper. In a study posted on the preprint server bioRxiv, he calculated that knowing only those three traits was sufficient to identify an individual out of a group of ten people in the HLI data set 75% of the time. Erlich contends that there was no need to know anything about the people’s genomes. Furthermore, he added, HLI’s reconstructions of facial structure from SNPs are not highly specific; they tend to look as much like an individual as like anyone else of that person’s sex and race.
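The logic of that objection can be illustrated with a short simulation. What follows is a minimal sketch, not Erlich's actual method and not HLI's data: it builds a synthetic population of 1,061 people with made-up ages, sexes and placeholder ethnicity labels, repeatedly draws a random group of ten, and counts how often a randomly chosen target is the only person in the group with that exact combination of traits. The function names, the age range, the uniform distributions and the exact-match rule are all assumptions made for illustration.

```python
import random
from collections import Counter


def unique_in_group(group, target_index):
    """True if nobody else in the group shares the target's (age, sex, ethnicity)."""
    counts = Counter(group)
    return counts[group[target_index]] == 1


def reidentification_rate(population, group_size=10, trials=10_000, seed=0):
    """Fraction of trials in which demographics alone single out the target."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw a random group without replacement, then pick one member as the target.
        group = rng.sample(population, group_size)
        target_index = rng.randrange(group_size)
        if unique_in_group(group, target_index):
            hits += 1
    return hits / trials


def synthetic_population(n=1061, seed=1):
    """Hypothetical cohort: uniform ages and placeholder labels, NOT real HLI data."""
    rng = random.Random(seed)
    sexes = ["F", "M"]
    ethnicities = ["A", "B", "C", "D", "E"]  # assumed placeholder categories
    return [
        (rng.randint(18, 82), rng.choice(sexes), rng.choice(ethnicities))
        for _ in range(n)
    ]


if __name__ == "__main__":
    rate = reidentification_rate(synthetic_population())
    print(f"Target unique on age/sex/ethnicity in {rate:.0%} of groups of ten")
```

Because the cohort is invented, the percentage this prints says nothing about the HLI data set. It also matches ages exactly, whereas a real analysis would have to allow for error in any estimated age. The sketch only shows why, in a small and diverse group, a handful of coarse traits can do most of the work of telling people apart.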

To add fuel to the fire, Jason Piper, a computational biologist and co-author of the paper who now works at Apple in Singapore, agreed that the paper misrepresents the findings that he and the other co-authors produced.

On Twitter, he said that HLI has a potential conflict of interest in encouraging restricted access to DNA databases. “I think genetic privacy is very important, but the approach being taken is the wrong one,” Piper said. “In order to get more information out of the genome, people have to share.” A more useful approach, he said, would be to find a way to make genomic data public without allowing individuals to be identified.

Responding to the criticism, the company released a statement saying, “HLI stands by the protection of genome data and the promotion of modern solutions for data exchange.”