Building large scale research workflows with AI
Hywel Evans, Director, IQVIA NLP Insights Hub
Dec 08, 2023

The work of literature-based research can seem unending. In response to ever-increasing research volumes and the need to get more value from secondary research efforts, so-called “living reviews”, which are continuously updated secondary research databases, are emerging as a way to achieve a wide range of objectives for life science organizations. These living reviews enable multiple targeted reviews from a common database, provide data for online data repositories, and power internally focused insights and analytics for decision making.

To deliver such solutions, and they are solutions not simply projects, the focus must be on rapidly identifying, collating, appraising, and synthesizing evolving evidence on an important research topic on an ongoing basis. The goal is to enable timely influence on anything from evidence dissemination and patient care to evidence generation strategy, health policy and more. Living reviews can be time- and resource-intensive, because the steady accumulation of new evidence and new developments within the review's research topic creates an ongoing cost and effort challenge.


How does technology support living reviews?

Using the latest technologies like AI, natural language processing (NLP) and machine learning, it's possible to quickly extract data and summarize large volumes of text and information related to a research topic such as COVID-19. A recent initiative by a top-10 pharma company and IQVIA evaluated the use of NLP to rapidly extract real-world data (RWD) from vaccine and antiviral effectiveness studies. The ongoing study now incorporates generative AI methods to improve the accuracy and coverage of extracted variables. The resulting automated process allows us to use these powerful new methods to continuously populate rich, up-to-date research databases.

With a workflow that blends the best of proven and innovative technology with the deep expertise of medical, research and epidemiology experts, it is possible to quickly extract and update information from new studies as they're published and ensure their accuracy and value. This is especially important in the context of COVID-19, where new information is constantly emerging.

There are several specific roles for AI & NLP methods in the workflow:

  1. Identify candidate studies with advanced search – Using context, negation and automatically expanded search terms with bio-medical synonyms, NLP can improve search recall and provide less noisy and more granular results.  NLP can also assist in finding rare correlations and studies that may have been overlooked or that contain uncommon patterns, ultimately improving precision. 
  2. Extract study meta-data, outcomes, patient populations and other data points – AI & NLP can automatically extract metadata and outcomes with a coverage rate of 60% to 80% of target variables. In very complex cases, AI and NLP can assist in manual extractions by suggesting relevant data points, highlighting meaningful areas of documents, and providing context, which leads to improved accuracy and consistency of data extraction. 
  3. Data Transformation – The resulting extractions can be formatted and configured for ingestion into a standardized data model or flat format for downstream applications and quality reviews, or pushed to an interactive review tool. This allows organizations to easily integrate the extracted data into their existing systems and processes, ensuring that it is accurate and consistent. Additionally, the ability to push the data to interactive review tools allows stakeholders to explore and visualize the data, gaining deeper insights and identifying trends that may not have been apparent.
  4. Summarization – The latest AI models allow us to generate summaries of targeted elements of the research content. The outputs can act as draft content for authors, an assistant to reviewers or even comparative content for evidence synthesis. By having access to accurate summarized data, stakeholders can make more informed decisions regarding drug development, regulatory compliance, and marketing strategies.
  5. Access – The resulting data can be made available (where appropriate) to multiple teams, with training provided on methodology, so that the value can extend beyond the originating documents, requirements and use cases.
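
As a rough illustration of the first role above (search with expanded terms and negation), here is a minimal Python sketch. It is not the IQVIA implementation: the synonym table, the negation cues, the window size and the function names are all hypothetical stand-ins for what a curated biomedical ontology and a full NLP engine would provide.

```python
import re

# Hypothetical synonym table (assumption); a real system would use a curated
# biomedical ontology rather than a hand-written dictionary.
SYNONYMS = {
    "covid-19": ["covid-19", "sars-cov-2", "coronavirus disease 2019"],
    "vaccine": ["vaccine", "vaccination", "immunization"],
}

# Illustrative negation cues (assumption), checked in a short window of text
# immediately before each matched term.
NEGATION_CUES = ("no ", "not ", "without ", "excluding ")


def expand(term):
    """Return the known synonyms of a term, or the lower-cased term itself."""
    return SYNONYMS.get(term.lower(), [term.lower()])


def mentions(text, term, window=20):
    """True if a synonym of `term` occurs without a negation cue just before it."""
    lowered = text.lower()
    for variant in expand(term):
        for match in re.finditer(re.escape(variant), lowered):
            preceding = lowered[max(0, match.start() - window):match.start()]
            if not any(cue in preceding for cue in NEGATION_CUES):
                return True
    return False


def search(abstracts, terms):
    """Return the indices of abstracts that affirmatively mention every term."""
    return [
        i for i, text in enumerate(abstracts)
        if all(mentions(text, term) for term in terms)
    ]


abstracts = [
    "Effectiveness of SARS-CoV-2 vaccination in adults.",
    "Patients without COVID-19 who received a vaccine.",
    "Influenza immunization outcomes.",
]
hits = search(abstracts, ["COVID-19", "vaccine"])
```

In this toy example only the first abstract is returned: it matches through synonyms alone, while the second is excluded because its COVID-19 mention is negated. Simple substring cues like these will misfire on real text, which is one reason production systems rely on linguistic context rather than keyword windows.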

Considerations around the design of the workflow:

  1. Regulatory and methodology – Regulatory and other requirements may mean that technology plays a process efficiency role or acts as an AI assistant to reviewers, while established processes, steps and human review all remain in place, for example in the generation of Systematic Literature Reviews (SLRs). For other scenarios, such as targeted literature reviews, internal analysis or data dissemination, technology can play a larger role with some steps automated. Expert oversight is always in place, however, applied in a way that is specific to requirements.
  2. Scale and repeatability – The application of technology becomes truly impactful when the scale of the task takes advantage of automation and tools, and the selection and use of these need to consider the “size and shape” of the problem. For example, accurate extraction of a complex outcome that only appears in a limited number of papers may benefit from being flagged by AI but extracted by a reviewer. 
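
The “size and shape” consideration can be pictured as a simple routing rule. The sketch below is purely illustrative: the `Variable` fields, the threshold of 50 papers and the mode names are assumptions made for the example, not parameters from the work described here.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    name: str
    papers_with_variable: int  # how many papers in the corpus report it
    complex_outcome: bool      # judged complex by the review team


def route(variable, min_papers_for_automation=50):
    """Choose a handling mode; the threshold is an illustrative assumption."""
    if variable.complex_outcome and variable.papers_with_variable < min_papers_for_automation:
        # Rare and complex: AI highlights the passage, a reviewer extracts it.
        return "flag_for_reviewer"
    # Otherwise: AI extracts the value, and expert review still applies.
    return "auto_extract_then_review"


rare = route(Variable("composite endpoint", 5, True))
common = route(Variable("study country", 500, False))
```

Both modes keep an expert in the loop; the rule only decides whether the model proposes a value or simply points the reviewer to the right place.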

A key finding from our recent work is that expert review is essential both to refine the models used and to have confidence in the quality and accuracy of the outputs. Here again technology can play a role, as we use the IQVIA Human Assisted Review Tool or ‘HART’ to allow reviewers to see and interact with the extracted data and variables from the completed AI and NLP extractions.

Overall, we have seen efficiency gains of 2-3x in deep data extraction, which is hugely impactful for pharma teams dealing with high-volume and complex research data. AI, NLP and machine learning methods, as well as tools like HART, provide an exciting opportunity to streamline the living review process and similar document-powered workflows, while improving the accuracy and coverage of extracted information. With these tools, living reviews can become suitable and viable for a broader set of use cases, therapy areas and data sources, and we are excited about the impact of the latest technology on this vital part of developing new treatments for patients. We recently presented this work at IDWeek to showcase the outcomes of the living review with AI.
