When the copy is as good as the real thing: The (forged) art of Generative Adversarial Networks

July 13, 2018
Chris Steel
Chris Steel, Senior Director, Machine Learning / Artificial Intelligence
Harry Wang, Duke University
Key Highlights
Generative Adversarial Networks epitomize the potential of machine learning in healthcare. Safer drug development. Better (faster) understanding of disease progression. But we still have a long way to go, and the role of the human in machine learning – as ever – should not be underestimated.
It’s hard to deny that machine learning and artificial intelligence are on the rise. In fact, they are seemingly everywhere, largely as a result of faster computation and better-optimized methods that have accelerated technologies (new and existing) in all sectors of the modern world.

One such technology is the Generative Adversarial Network (GAN), introduced in 2014 by a group of researchers, led by Ian Goodfellow, at the University of Montreal. GANs are a fascinating and innovative combination of game theory and deep learning, built from two competing (“adversarial”) neural network models. One model (the generator) produces samples; the other (the discriminator) receives samples from both the generator and the training data and tries to tell the two apart. The two models are trained concurrently until the discriminator can no longer distinguish the generated samples from real data. A common analogy is a forger (generator) who keeps refining forgeries until an art expert (discriminator) can no longer tell the forged art from the real art.
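The forger-versus-expert loop can be sketched in a few lines of plain Python. This is a deliberately tiny toy, not the architecture from the 2014 paper: the "real data" are draws from a bell curve centred on 4, the generator is assumed to be a straight line G(z) = a*z + b, the discriminator a single logistic unit D(x) = sigmoid(w*x + c), and the gradients are written out by hand. The function name `train_toy_gan` and all parameter values are illustrative assumptions.

```python
import math
import random


def sigmoid(s):
    """Logistic function, written to avoid overflow for large |s|."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=32, lr=0.01, seed=0):
    """Alternate discriminator and generator updates on 1-D data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator (the forger): G(z) = a*z + b
    w, c = 0.0, 0.0  # discriminator (the expert): D(x) = sigmoid(w*x + c)

    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]   # training data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]  # generator input
        fake = [a * z + b for z in noise]                    # forged samples

        # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        # Gradient of -[log D(real) + log(1 - D(fake))] w.r.t. w and c.
        grad_w = grad_c = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            grad_w += (d - 1.0) * x
            grad_c += d - 1.0
        for x in fake:
            d = sigmoid(w * x + c)
            grad_w += d * x
            grad_c += d
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: push D(fake) toward 1, i.e. fool the expert.
        # Gradient of -log D(G(z)) w.r.t. a and b, through the discriminator.
        grad_a = grad_b = 0.0
        for z in noise:
            d = sigmoid(w * (a * z + b) + c)
            grad_a += (d - 1.0) * w * z
            grad_b += (d - 1.0) * w
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch

    return (a, b), (w, c)


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    print("generator: a=%.3f b=%.3f" % (a, b))
    print("discriminator: w=%.3f c=%.3f" % (w, c))
```

Even this toy shows the back-and-forth: the discriminator first learns that larger values look "real", and the generator responds by shifting its output upward. With models this simple the two tend to chase each other rather than settle neatly, which hints at why real GANs are known to be tricky to train.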

Needless to say, the potential of GANs is immense, and their application to key questions in healthcare is tantalizing.

One notable project underway comes from a group of researchers at the Department of Biomedical Informatics at Columbia University, who are attempting to use GANs to predict drug-induced laboratory test trajectories. 

In another area, Neuromation, a company specializing in AI advancements in medicine, has attempted to use GANs for drug discovery – creating and identifying lead molecules that are most likely to have desired properties.

IQVIA is even pioneering investigation into the potential for GANs to generate synthetic patient data to overcome issues related to privacy and sparse data, as is the case with rare diseases. Since data for these diseases are sparse to begin with and most machine learning models need a certain threshold of data to become accurate, GANs could be used to generate synthetic data that is as realistic as possible. This would be a watershed moment for an area that struggles desperately to gather enough data to understand diseases and help patients in need. It would also be an important step forward in patient privacy and in solving long-standing concerns over protected health information (PHI).

Of course, while it’s easy to be intoxicated by the possibilities, it’s important to temper our excitement and be realistic about the current limitations. Notably, the standard caveats of machine learning still apply, especially the requirement that the training data be a complete representation of the predictors.

Additionally, the results GANs create could be physically impossible or infeasible when dealing with drug creation or health record generation. This is due in part to the models being unable to “explain” why their output is correct. And so GANs require skilled domain experts (human data scientists, in fact) who understand the healthcare space as well as the science. A simple example of this is using GANs to generate film screenplays. Although a well-modeled GAN could create a correctly structured screenplay, the plot might not be well developed and may be prone to continuity errors (as happened with one sci-fi film, although the model that produced that screenplay did not have a discriminator model to validate it). It is but another important example of the relevance – and even the necessity – of domain expertise and human oversight in the development of truly optimal machine learning technologies.
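One simple way to picture that human oversight is a set of expert-written sanity rules that every synthetic record must pass before it is used. The sketch below is purely illustrative: the field names, the three example records, and the thresholds are hypothetical placeholders, not clinical reference ranges, and a real project would have domain experts define and review such rules.

```python
def is_plausible(record):
    """Return True if a synthetic record passes basic sanity rules.

    The rules are illustrative placeholders (assumptions), not clinical
    reference ranges; a domain expert would define the real ones.
    """
    age = record["age_years"]
    systolic = record["systolic_mmHg"]
    diastolic = record["diastolic_mmHg"]
    return 0 <= age <= 120 and diastolic > 0 and systolic > diastolic


# Hypothetical generator output: one sensible record and two impossible ones.
synthetic = [
    {"age_years": 54, "systolic_mmHg": 128, "diastolic_mmHg": 82},
    {"age_years": -3, "systolic_mmHg": 120, "diastolic_mmHg": 80},  # negative age
    {"age_years": 61, "systolic_mmHg": 70, "diastolic_mmHg": 95},   # systolic below diastolic
]

kept = [r for r in synthetic if is_plausible(r)]
rejected = [r for r in synthetic if not is_plausible(r)]
```

Rules like these only catch the obvious impossibilities; subtler problems, such as a record that is individually valid but medically implausible as a whole, still need a human reviewer.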

Of course, all of this is just the tip of the iceberg in the conversation about how machine learning can improve, develop, and optimize the healthcare sector. The technology and potential of GANs are still developing, but their impact is already undeniable.
