Continuing our series on what is involved in practicing human data science, Yilian Yuan, Senior Vice President of Data Science and Advanced Analytics at IQVIA, describes what makes her team unique in the healthcare industry. With a 22-year tenure at IQVIA, Yilian is able to speak from vast experience on the challenges and nuances that separate human data science from data science in other business sectors. She also has the team to back her up; Yilian represents IQVIA’s 340 human data scientists around the globe – the largest concentration of such talent in the healthcare industry.
A Day in the Life of a Human Data Scientist
Not Your Average Statistics and Analytics
A Daunting Job Description
Virtually no one can walk in off the street and serve as a human data scientist. Although we approach analytics in much the same way as data scientists in other industries, there’s so much more to it. The competencies we require in a human data scientist team form a rather daunting list, including:
- Intimate knowledge of real world data (RWD). RWD comprises, most notably, de-identified, patient-level longitudinal data, as well as data from various other sources that are generated to process insurance claims, manage patient care, and promote health and fitness. Without a very deep understanding of what’s in these databases, what’s missing, and how they can and cannot be connected and applied, someone working with the data can easily misinterpret it, draw faulty conclusions and offer improper recommendations.
- Domain knowledge of treatment dynamics and commercial practices in the life sciences industry. Working with patient-level data requires a well-rounded understanding of the given therapeutic area. At a minimum, one must understand physician practices, how the typical patient journey progresses, what the treatment guidelines are, and what therapies compete. At the same time, you must know how brand teams operate in order to provide them with actionable recommendations. How do they, for instance, use sales analytics, marketing analytics, commercial analytics, and business analytics to make decisions?
- Experience with new, advanced analytical techniques. Patient-level data is big data and very complex. Traditional analytical tools and statistical methods often can’t handle it. Data of this volume and complexity demands machine learning (ML) and artificial intelligence (AI) to uncover the insights within it. So, familiarity with Python, TensorFlow, and other software tools for AI and ML is required.
- Leveraging the latest computing technologies to make an impact. The analytical techniques mentioned above, especially when applied to big data, are very computationally intensive and require more powerful software and hardware. They require the latest technology (such as Spark, Scala, and Hadoop) to process the data and to execute the analytical steps and algorithms. To support significant advances in healthcare, we have to embed the AI/ML models into companies’ workflows or the healthcare decision-making process. The models must be adaptable and scalable across the multiple disease areas and market situations in which our clients operate. We monitor the latest technologies so that we can adopt them for working with big data and making a big impact.
There are certainly data scientists in other business sectors (such as financial services and retail) who are adept at using the latest analytical techniques and sophisticated computing technologies with other forms of big data. But they may not have the in-depth knowledge of healthcare data or the know-how to integrate various data assets and make sense of them. This is probably why, when life sciences organizations attempt to build their own analytics team to do the type of customized analytics that we perform, they find that it is not that easy. Often, they realize that they need to partner with our human data scientists and, of course, the whole ecosystem of support we have around us at IQVIA.
We work closely with principals and the IQVIA sales community to help clients identify business issues and determine which healthcare data assets would best provide solutions. We collaborate with our technology teams to leverage the latest technology and work with our colleagues in production and data management teams to curate and connect data. It takes all of us working together to get to the answers that clients seek.
Breaking New Ground
It may surprise people to learn that the technical aspects of being a human data scientist are, for us, not really the most challenging part of the job; rather, they are among the most exciting parts. Human data scientists thrive on opportunities to break new ground in developing new analytical approaches to improve patient care and our clients’ business. In fact, IQVIA’s human data scientists already have several patents pending on our analytical techniques. Developing innovative analytic techniques to drive healthcare forward is a challenge we embrace.
Another aspect of the work that many of us welcome is the opportunity to work with new data sets – a benefit of the digital age and the generation of more and more (non-identified) human data. We’re already working with non-identified genomics data, patient registry data, data collected from physician networks, data from public sources such as health authorities, data from insurance companies, and unstructured data such as physician notes within electronic medical records.
I think that for most of the work we do, the most important part is the first step: What question needs to be answered? What business problem must be solved? Often, clients approach us by requesting a particular analysis using a particular data set. But, after we probe to understand the context and the underlying issue, we recommend using a different analysis or a different data source altogether. Yes, sometimes a human data scientist can help with forming the question, not just the insights.
When we’re working exclusively with IQVIA data, the project usually proceeds according to plan, as the data is ready for us to work with in developing models and algorithms. Often, though, we work with a client’s data or data from an agency or another external source. When that’s the case, we have to get the data ready for analysis. It has to be cleaned, controlled for quality, formatted, and so on. We automate these tasks as much as possible using models, ML, and natural language processing (NLP). Even with the models we use to clean data, this step is very time-consuming.
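As a rough illustration of the kind of preparation step described above, here is a minimal Python sketch of cleaning, quality-checking, and formatting externally sourced records. It is not IQVIA's actual pipeline: the field names (`patient_id`, `product`, `rx_date`), the accepted date formats, and the helper names `parse_date` and `clean_records` are all assumptions made for the example, and it uses simple rules rather than the ML and NLP models mentioned in the text.

```python
from datetime import datetime

# Hypothetical date formats an external source might use (assumption).
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def parse_date(text):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_records(rows):
    """Format, quality-check, and de-duplicate raw records.

    Returns (clean, rejected): rows that passed the checks, and rows
    set aside for manual review together with the reason.
    """
    clean, rejected, seen = [], [], set()
    for row in rows:
        # Formatting: trim whitespace and normalize case and dates.
        patient_id = (row.get("patient_id") or "").strip()
        product = (row.get("product") or "").strip().upper()
        rx_date = parse_date(row.get("rx_date") or "")

        # Quality control: every field must be present and parseable.
        if not patient_id or not product:
            rejected.append((row, "missing field"))
            continue
        if rx_date is None:
            rejected.append((row, "unparseable date"))
            continue

        # Cleaning: drop records that are duplicates after normalization.
        key = (patient_id, product, rx_date)
        if key in seen:
            continue
        seen.add(key)
        clean.append(
            {"patient_id": patient_id, "product": product, "rx_date": rx_date}
        )
    return clean, rejected
```

Keeping rejected rows alongside a reason, rather than silently discarding them, reflects the point that someone still has to review what automated cleaning cannot resolve, which is part of why this step remains time-consuming.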
Perhaps the uniqueness of our work can best be appreciated through a few examples of completed projects. The human data scientists within our Data Science and Advanced Analytics group have recently:
- Developed a machine learning engine to identify physicians with patients whose disease had progressed to the point at which they need to move to the next line of treatment.
- Leveraged multiple data sources and developed multiple machine learning algorithms to understand the specific diagnosis driving prescriptions for products that are marketed for multiple indications, such as in oncology or autoimmune diseases.
- Used machine learning to reveal physicians’ responsiveness to digital engagement messaging as well as their communication preferences.
To learn more about these case studies, please visit our human data science page.
There’s a high bar for becoming a human data scientist – and there should be. The work is inherently different from that of other data scientists, and we must be equipped with very specialized knowledge, skills, and tools to derive the answers needed to improve human health and make a difference in people’s lives.