Case Study

Global Epidemiology Data Lake

Challenge

As a biotechnology company on the cutting edge of mRNA therapeutics, our customer needs detailed, low-latency data on COVID-19 cases and vaccination rates from health departments and organizations around the world. They use this data to monitor the distribution and efficacy of their vaccines and to understand the spread and evolution of the virus.

Solution

[Figure: data lake architecture diagram]

Rearc worked with the biotech company's Data Engineering leadership and stakeholders from their DSAI team to gather requirements and understand the scope of data needed.

To develop the needed data products, Rearc expanded its data platform with sophisticated parsers, enabling the extraction of tabular data from PDF and ODS files supplied by health agencies.
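As a rough illustration of what extracting tabular data from an ODS file involves, the sketch below reads the first sheet of an OpenDocument spreadsheet using only the Python standard library. It is not Rearc's parser: the function name `read_ods_rows` is hypothetical, and the sketch assumes a simple sheet layout with rows as direct children of the table and no merged cells or repeated rows.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces defined by the OpenDocument format.
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
REPEAT_ATTR = "{%s}number-columns-repeated" % NS["table"]


def read_ods_rows(source):
    """Return the first sheet of an ODS file as a list of rows of strings.

    Hypothetical helper for illustration. `source` is a path or a binary
    file-like object. Row groups, repeated rows, and merged cells are
    not handled.
    """
    # An ODS file is a zip archive; the sheet data lives in content.xml.
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    sheet = root.find(".//table:table", NS)
    if sheet is None:
        return []

    rows = []
    for row in sheet.findall("table:table-row", NS):
        cells = []
        for cell in row.findall("table:table-cell", NS):
            # A cell holds zero or more text:p paragraphs.
            paragraphs = cell.findall("text:p", NS)
            text = "\n".join("".join(p.itertext()) for p in paragraphs)
            # Runs of identical cells are stored once with a repeat count.
            repeat = int(cell.get(REPEAT_ATTR, "1"))
            cells.extend([text] * repeat)
        # Drop trailing empty cells, which are usually layout padding.
        while cells and cells[-1] == "":
            cells.pop()
        if cells:
            rows.append(cells)
    return rows
```

Calling `read_ods_rows("report.ods")` on a file from a health agency (the filename here is a placeholder) would return a list of lists of cell strings, ready for type conversion and validation. PDF tables need a different approach, typically a dedicated extraction library, and are not covered by this sketch.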

Rearc delivered this data through AWS Data Exchange with Redshift data sharing, enabling seamless ingestion of datasets into the customer's Redshift data warehouse.

Outcome

  • Enabled insights that motivated further data requests: Rearc now provides the biotech company with epidemiology data on other viruses, such as RSV and influenza.
  • Rearc now also curates and delivers large-scale genomics and metagenomics data for this customer.