Extracting Full Value from Harmonized UK Biobank Data Investments and Democratizing Access to Analysis-Ready Data

Situation

A leading pharmaceutical company received the UK Biobank cohort, with extensive phenotypic data (n = 500,000), to support deeper target discovery. Company researchers sought to explore the UK Biobank dataset at scale, building complex queries that take multiple parameters into account and further refine cohorts to match diseases of interest.

Challenge

For the company to extract full value from its data, the data needed to be clean, easy to navigate and diverse. Ongoing curation and harmonization were also required to maintain the dataset and integrate it with other data as it became available.

The company’s in-house capabilities could not curate and harmonize the data efficiently, forcing it to accept a lower return on its data investment unless a solution could be found.

Action

Curation and Standardization

  • BC Platforms first worked with researchers to understand their needs, expectations and research questions
  • Based on those needs, BC Platforms configured its proprietary software platform to support the client’s objectives
  • BC Platforms then ingested, curated and harmonized the n = 500,000 dataset

Data Exploration & Management

To ensure the dataset could be used to its full potential now and in the future, BC Platforms applied its Ontology Browser and BC|MATCH curation solution, enabling additional data pipelines to be built for the dataset as the company’s efforts grow and expand.

Impact

Deepened Target Identification: BC Platforms enabled researchers to build complex queries that take multiple parameters into account, refining cohorts to match the disease of interest more closely and making targeting more precise.

Time & Resource Savings: Time from data release to usability was significantly reduced, from months to days. Additionally, ongoing management effort was minimized because the dataset can quickly incorporate new data.

Continued Dataset ROI: Subsequent data releases have been integrated and harmonized in under a week.