About the Report
This report uses the IQVIA Genomic Initiatives Database, a new database of genomic data initiatives, to examine and segment the global genomic landscape. In the COVID-19 era, genomic databases are playing a growing role. For instance, the UK Biobank and Finngen, among others, are currently involved in efforts to understand the genetic determinants of patient susceptibility to COVID-19 and of severe response to infection.[1,2] They are also working with the COVID-19 Host Genetics Initiative[3] to help COVID-19 researchers generate, share, and analyze data globally. As the first landscape of genomic data initiatives, this publication provides a baseline on the important use of human genomic data: a point from which to measure progress in building insight into the origins of human disease and in developing better therapeutics to prevent and treat diseases. With it, stakeholders can gain a clearer picture of the current genomic landscape.
The landscape of initiatives to generate and collect human genomic data is evolving rapidly and is highly diverse, with private and public initiatives across multiple countries. Since 1990, the cost of sequencing a whole genome has dropped from $2.7 million to as low as $300, opening new opportunities to build repositories of genomic data. By the start of 2020, there were 187 genomic initiatives globally, of which 50% originated in the U.S. and 19% in Europe. Thirty-eight million genomes had been analyzed using techniques ranging from genotyping to whole genome sequencing, and this number is expected to grow to 52 million by 2025.
Planned national genomic databases are proliferating as countries increasingly appreciate the potential technological and healthcare system benefits of genomic data, with some countries planning to sequence their entire population. The medical utility of the data varies across initiatives based on the number of genomes collected, the completeness of the genomic data, its linkage to other healthcare-relevant data, and the diseases or populations it covers. Only 42% of databases publicly state that their genomic data links to patient demographic information or clinical data, and only 28% state links to the most valuable EMR/EHR and clinical data. Even so, an analysis of initiatives’ data quality versus their target cohort sizes suggests that the next decade will see an increasing number of large genomic databases with strong utility for human data science and medical research.
1 Morelle R. UK Biobank: DNA to unlock coronavirus secrets. BBC News. 14 Apr 2020. Available from: https://www.bbc.com/news/health-52243605
2 Finngen. Finngen involved in the Covid19 study. Accessed May 4, 2020. Available from: https://www.finngen.fi/fi/finngen_mukana_covid19_tutkimuksessa
3 The COVID-19 Host Genetics Initiative. Available from: https://www.covid19hg.org/