We have the Internet of Things, so why not an Internet of DNA? Let’s ponder a scenario: Adam is a seven-year-old suffering from a disorder without a name. This year, his physicians will begin sending his genetic information across the Internet to see if there’s anyone, anywhere in the world, who is like him. If by some stroke of luck there is a match, it could make all the difference. Adam is developmentally delayed, uses a walker, and speaks only a few words. Trouble is, he’s getting sicker: MRIs indicate that his cerebellum is shrinking. His DNA is being analyzed by medical geneticists at a children’s hospital. Somewhere in the millions of As, Gs, Cs, and Ts (the building blocks of DNA you will remember from your biology classes: adenine, guanine, cytosine, and thymine) is a misspelling, and maybe the clue to a treatment. But unless they find a second child with the same symptoms and a DNA error akin to Adam’s, there isn’t much his doctors can do to zero in on which mistake in Adam’s genetic code is the crucial one. This scenario is based on a true story that I read in an article, but I shall not go into the details.
At the beginning of the year, programmers in Toronto began testing a system for trading genetic information with other hospitals. These facilities, in locations including Miami, Baltimore, and Cambridge, U.K., also treat children with so-called Mendelian disorders (you do remember Mendel, the so-called father of modern genetics), which are caused by a rare mutation in a single gene. The system, called MatchMaker Exchange, represents something new: a way to automate the comparison of DNA from sick people around the world. I like the sound of that, but I am wary at the same time: I just hope the pursuit of scientific value does not get ahead of itself and sacrifice even the slightest sense of decency or morality.
So who is this guy Haussler?
One of the people behind this project is David Haussler, a bioinformatics expert based at the University of California, Santa Cruz. Haussler has to contend with the fact that genome sequencing is largely detached from our greatest tool for sharing information: the Internet. That’s unfortunate because more than 200,000 people have already had their genomes sequenced, a number certain to rise into the millions in the years ahead. (I want my genome sequenced too, and then I can pay Google $25 to store it, right? Wrong. No company is storing my DNA. What if they make a replica of me, or sell my DNA for research? Those are valid concerns, yeah?) The next era of medicine is predicated on large-scale comparisons of these genomes, a task for which our good fellow Monsieur Haussler thinks scientists are poorly prepared, and for which, in my view, the business opportunities are enormous. “I can use my credit card anywhere in the world, but biomedical data just isn’t on the Internet,” he says. “It’s all incomplete and locked down.” Genomes often get moved around on hard drives and delivered under strict supervision.
Global Alliance for Genomics and Health
Haussler is a founder and one of the technical leaders of the Global Alliance for Genomics and Health, a nonprofit organization formed in 2013 that compares itself to the W3C (the standards organization devoted to making sure the Web functions correctly). Also known by its punning acronym, GA4GH, it has gained a large membership, including major technology companies like Google (remember the $25 fee you would pay to store your genome). Its products so far include protocols, application programming interfaces (APIs), and improved file formats for moving DNA around the Web. However, the real problems it is solving are mostly not technical. Instead, they are sociological: scientists are reluctant to share genetic data, and because of privacy rules, it’s considered legally risky to put people’s genomes on the Internet.
The interesting thing is that pressure is mounting to use technology to study huge numbers of genomes at once and begin to compare that genetic information with medical records. This is because scientists think they’ll need to sort through a million genomes or more to solve cases like Adam’s that could involve a single rogue DNA letter, or to make discoveries about the genetics of common diseases that involve a complex combination of genes. At the moment, no single academic center has access to information that extensive, or the financial means to assemble it. So the Internet of DNA is a pretty good idea. It could work like your email, with login details that let you track your genome: who is accessing it, how, and whether there is ample compensation for that access. That’s just my idea.
Haussler and others at the alliance are betting that part of the solution is a peer-to-peer computer network that can unite widely dispersed data (somewhat like how Bitcoin works, I am thinking). Their standards, for instance, would permit a researcher to send queries to other hospitals, which could choose what level of information they were willing to share and with whom. This control could ease privacy concerns. Adding a new level of complexity, the APIs could also call on databases to perform calculations (say, to reanalyze the genomes they store) and return answers.
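To make that idea concrete, here is a minimal Python sketch of a gated, peer-to-peer query, where each hospital decides how much to reveal to each requester. Everything in it is an assumption made for illustration: the `HospitalNode` class, the `Access` levels, the `federated_query` helper, the variant strings, and the toy patient data are all hypothetical, and none of it is the alliance’s actual standard or API.

```python
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical sharing levels a hospital might grant a requester."""
    NONE = 0    # refuse to answer
    EXISTS = 1  # answer yes/no only
    COUNT = 2   # answer with the number of matching genomes


class HospitalNode:
    """A toy hospital that keeps its genomes local and answers queries."""

    def __init__(self, name, genomes, policy, default=Access.NONE):
        self.name = name
        self._genomes = genomes  # patient id -> set of variant strings
        self._policy = policy    # requester name -> Access level
        self._default = default  # level for requesters not in the policy

    def answer(self, requester, variant):
        """Reply at whatever level this hospital grants this requester."""
        level = self._policy.get(requester, self._default)
        if level == Access.NONE:
            return None
        count = sum(variant in variants for variants in self._genomes.values())
        if level == Access.EXISTS:
            return count > 0
        return count


def federated_query(nodes, requester, variant):
    """Send the same question to every node; the genomes never move."""
    return {node.name: node.answer(requester, variant) for node in nodes}


# Toy data, invented for this example.
toronto = HospitalNode(
    "Toronto",
    {"p1": {"chr1:1520301:T"}, "p2": {"chr1:1520301:T", "chr7:55000:G"}},
    policy={"Dr. A": Access.COUNT},
)
miami = HospitalNode(
    "Miami",
    {"p9": {"chr7:55000:G"}},
    policy={"Dr. A": Access.EXISTS},
)

print(federated_query([toronto, miami], "Dr. A", "chr1:1520301:T"))
print(federated_query([toronto, miami], "Dr. B", "chr1:1520301:T"))
```

In this sketch, "Dr. A" gets a count from one hospital and only a yes-or-no from the other, while the unknown "Dr. B" gets nothing from either. The point of the design is that the sharing decision stays with each hospital, not with whoever is asking.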
I am worried that genomics is drifting away from the open approach that made the genome project so potent and inspiring. If people’s DNA data is made more widely accessible, I hope, medicine may benefit from the same kind of “network effect” that has propelled so many commercial aspects of the Web. The alternative is that this vital information will end up marooned in something like the disastrous hodgepodge of hospital record systems in the Kenyan medical system.
One argument for quick action is that the amount of genome data is exploding. The largest labs can now sequence human genomes to high quality at the pace of two per hour, bearing in mind that the first genome took about 13 years. Back-of-the-envelope calculations suggest that fast machines for DNA sequencing will be capable of producing 85 petabytes of data this year worldwide, twice that much in 2019, and so forth. For comparison, all the master copies of movies held by Netflix take up 2.6 petabytes of storage.
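For a feel of what those figures imply, here is a small Python calculation using only the numbers quoted above. It assumes, purely for the arithmetic, a 365-day year and a lab that sequences around the clock; the variable names are my own.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: ignore leap days

genomes_per_hour = 2       # pace quoted for the largest labs
first_genome_years = 13    # time the first genome took
sequencing_pb = 85         # petabytes sequencers could produce this year
netflix_pb = 2.6           # petabytes of Netflix master copies

# Genomes one such lab could finish in a year of nonstop sequencing.
genomes_per_year = genomes_per_hour * HOURS_PER_YEAR

# How many genomes fit today into the time the first one took.
speedup = first_genome_years * HOURS_PER_YEAR * genomes_per_hour

# How many Netflix libraries the yearly sequencing output would fill.
netflix_ratio = sequencing_pb / netflix_pb

print(f"Genomes per lab per year: {genomes_per_year:,}")
print(f"Speedup over the first genome: about {speedup:,}x")
print(f"Sequencing output vs. Netflix: about {netflix_ratio:.1f}x")
```

Under those assumptions, a single top lab could turn out 17,520 genomes a year, roughly 227,760 times faster than the first genome, and the year’s sequencing output would be about 33 times the size of the Netflix library.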
“This is a technical question,” says Adam Berrey, CEO of Curoverse, a Boston startup that is using the alliance’s standards in developing open-source software for hospitals. “You have what will be exabytes of data around the world that nobody wants to move. So how do you query it all together, at once? The answer is instead of moving the data around, you move the questions around. No industry does that. It’s an insanely hard problem, but it has the potential to be transformative to human life.”
Currently scientists are broadly engaged in what is, in effect, a project to document every variation in every human gene and determine what the effects of those differences are. Individual human beings differ at about three million DNA positions, or one in every 1,000 genetic letters. Most of these differences don’t matter, but the rest explain many things that do: heartbreaking disorders like Adam’s, for instance, or a higher than average chance of developing glaucoma, especially if you are African.
Now imagine that in the near future you have the bad luck to develop cancer (the number of cases is rising in Kenya). A doctor might order DNA tests on your tumor, knowing that every cancer is propelled by specific mutations. If it were feasible to look up the experience of everyone else who shared your tumor’s particular mutations, as well as what drugs those people took and how long they lived, that doctor might have a good idea of how to treat you. “The limiting factor is not the technology,” says David Shaywitz, chief medical officer of DNAnexus, a bioinformatics company that hosts several large collections of gene data. “It’s whether people are willing.”
In the summer of 2014, Haussler’s alliance launched a basic search engine for DNA, which it calls Beacon. Currently, Beacon searches through about 20 databases of human genomes that were previously made public and have implemented the alliance’s protocols. Beacon offers only yes-or-no answers to a single type of question. You can ask, for instance, “Do any of your genomes have a T at position 1,520,301 on chromosome 1?” “It’s really just the most basic question there is: have you ever seen this variant?” says Haussler. “Because if you did see something new, you might want to know, is this the first patient in the world that has this?” Beacon is already able to access the DNA of thousands of people, including hundreds of genomes put online by Google.
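The yes-or-no question Beacon answers can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `ToyBeacon` class, the `Allele` tuple, the `ask_all` helper, and the sample data are hypothetical names and values I made up, and the sketch does not reproduce the real Beacon protocol or its API.

```python
from typing import NamedTuple


class Allele(NamedTuple):
    """One DNA letter at one position (hypothetical representation)."""
    chromosome: str
    position: int
    base: str


class ToyBeacon:
    """A toy database that only ever answers yes or no."""

    def __init__(self, name, alleles):
        self.name = name
        self._alleles = frozenset(alleles)

    def has_allele(self, chromosome, position, base):
        """Have you ever seen this letter at this position?"""
        return Allele(chromosome, position, base.upper()) in self._alleles


def ask_all(beacons, chromosome, position, base):
    """Put the same question to every beacon and collect the answers."""
    return {b.name: b.has_allele(chromosome, position, base) for b in beacons}


# Toy data, invented for this example.
beacon_a = ToyBeacon("beacon-a", [Allele("1", 1520301, "T"), Allele("2", 42, "G")])
beacon_b = ToyBeacon("beacon-b", [Allele("1", 1520301, "C")])

# "Do any of your genomes have a T at position 1,520,301 on chromosome 1?"
print(ask_all([beacon_a, beacon_b], "1", 1520301, "T"))
```

Here the first beacon says yes and the second says no. Nothing else leaves the database: no patient identifiers, no surrounding sequence, just a single bit per question.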
One of the co-founders of the Global Alliance is David Altshuler, who is now head of science at Vertex Pharmaceuticals but until recently was deputy chief of the MIT-Harvard Broad Institute, one of the largest academic DNA-sequencing centers in the United States. Altshuler has his own reasons for wanting to connect massive amounts of genetic data. As an academic researcher, he hunted for the genetic causes of common diseases like diabetes. That work was carried out by comparing the DNA of the afflicted and the non-afflicted, trying to spot the variations that appear most often. After burning through countless research grants on this long, tedious approach, geneticists realized there would be no easy answers, no common “diabetes genes” or “depression genes.” It turns out that common diseases aren’t caused by single, smoking-gun defects. Instead, scientists have learned, a person’s risk is determined by a combination of hundreds, if not tens of thousands, of rare variations in the DNA code. Sometimes it looked as if they were playing “shoot the arrow” in the dark.
What are the concerns?
This has created a huge statistical migraine. In late summer, in a report listing 300 authors, the Broad Institute (remember MIT and Harvard) looked at the genes of 36,989 people with schizophrenia. Although schizophrenia is a highly heritable ailment, the 108 gene regions identified by the scientists explained only a small percentage of a person’s susceptibility to the disease. Altshuler believes that enormous gene studies are still a good way to “break” these stubborn codes, but he thinks it will probably take millions of genomes to do it. I think so, too, because the more genomes there are, the greater the chances of finding the right pieces to complete the puzzle.
Going by the numbers, sharing data no longer looks optional, whether researchers are trying to unravel the causes of common diseases or ultra-rare ones. “There’s going to be an enormous change in how science is done, and it’s only because the signal-to-noise ratio necessitates it,” says Arthur Toga, a researcher who leads a consortium studying the science of Alzheimer’s at the University of Southern California. “You can’t get your result with just 10,000 patients; you are going to need more. Scientists will share now because they have to.” I hope this collaboration is replicated in Kenya and the greater African region.
Privacy, of course, is an obstacle to sharing. People’s DNA data is protected because it can identify them, like a fingerprint, and their medical records are locked, too. In some countries, exporting personal information for research is not permitted. But Haussler thinks a peer-to-peer network can sidestep some of these worries, since the data won’t move and access to it can be gated and guarded. More than half of Europeans and Americans say they’re comfortable with the idea of sharing their genomes, and some researchers believe patient consent forms should be dynamic, a bit like Facebook’s privacy controls, letting individuals decide what they’ll share and with whom, and then change their minds. “Our members want to be the ones to decide, but they aren’t that worried about privacy. They’re sick,” says Sharon Terry, head of the Genetic Alliance, a large patient advocacy organization. Privacy comes later, when you are feeling all prim and proper, aye!
The risk of not getting data sharing right is that the genome revolution could sputter and fall off into an oblivion of maladies and corporate profits. Some researchers say they are seeing signs that it’s happening already. Kym Boycott, head of the research team that sequenced Adam’s genome, says that when the group adopted sequencing as a research tool in 2010, it met with immediate success. Over two years, between 2011 and 2013, a network of Canadian geneticists uncovered the precise molecular causes of 146 conditions, solving 55 percent of their undiagnosed cases.
Nonetheless, the success rate appears to be tailing off, says Boycott. Now it’s the tougher cases like Adam’s that are left, and they are getting solved only half as often as the others. A shiver just went down my spine. “We don’t have two patients with the same thing anymore. That’s why we need the exchange,” she says. “We need more patients and systematic sharing to get the [success rate] back up.” In late January, when asked if MatchMaker Exchange had yielded any matches yet, she demurred, saying that it could be a matter of weeks before the software was fully operational. As for Adam, she said, “We are still waiting to sort him out. It’s important for this little guy.”
The unfolding calamity in genomics is that a great deal of life-saving information, though already available, is inaccessible. So the Internet of DNA provides an exciting new way to combat the onslaught of ‘weirdo’ and non-weirdo diseases. I just hope that vested interests don’t take the place of genuine concern for patients who seriously need help. Otherwise, hurrah for Science!