Boosting genetics research

About customer

Our customer is a genomics services company that focuses on general NGS and microarray research and operates a proprietary technology platform.

Challenge

Our client started a research program to understand how genetic factors and their variations affect the development of SARS-CoV-2 infection. The main focus was on sequencing patients' data and identifying the genetic factors that may cause or correlate with disease outcomes.

Technologically, the company collected whole-genome sequencing (WGS) data and performed viral genome sequencing, which enabled analysis of host susceptibility, disease severity, host-virus interactions, and other factors. The biggest challenges at the start of the project were the huge datasets and the need to test several research hypotheses quickly and in parallel. Cost-effective, secure storage and the ability to run fast queries over the research data were also among the requirements.

Solution

Since high load was the biggest challenge, we focused on developing a cloud platform. To allow parallel task processing, we built a workflow orchestrator that automatically schedules, monitors, and scales processing. Analytics queries ran on separate VMs, and all their results were stored in cloud object storage, which scales with the amount of data and protects results from unauthorized access through encryption and access management tools. Cost-effectiveness was achieved with data lifecycle policies and automatic workflow orchestration. An OLAP data warehouse provided fast analytical queries on structured data.
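To make the orchestration idea concrete, here is a minimal sketch in Python of how independent analysis tasks can be scheduled in parallel, retried on failure, and monitored through a per-task status report. It is an illustration only, not the platform's actual implementation: the function `run_tasks`, its parameters, and the example task names are all hypothetical, and a thread pool on one machine stands in for the separate VMs described above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict


def run_tasks(
    tasks: Dict[str, Callable[[], object]],
    max_workers: int = 4,
    max_retries: int = 2,
) -> Dict[str, dict]:
    """Run independent tasks in parallel and return a status report per task.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """

    def attempt(fn: Callable[[], object]) -> dict:
        # Retry a failing task a bounded number of times before giving up.
        last_error = None
        for _ in range(max_retries + 1):
            try:
                return {"status": "succeeded", "result": fn()}
            except Exception as exc:  # record the error and try again
                last_error = exc
        return {"status": "failed", "error": repr(last_error)}

    report: Dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every task up front so they can run concurrently.
        futures = {pool.submit(attempt, fn): name for name, fn in tasks.items()}
        # Collect results as tasks finish, in completion order.
        for future in as_completed(futures):
            report[futures[future]] = future.result()
    return report


if __name__ == "__main__":
    # Hypothetical stand-ins for research hypotheses tested in parallel.
    def count_variants() -> int:
        return len(["rs1", "rs2", "rs3"])

    def broken_analysis() -> None:
        raise ValueError("input file missing")

    report = run_tasks(
        {"variant_count": count_variants, "broken": broken_analysis}
    )
    for name in sorted(report):
        print(name, report[name]["status"])
```

In a production setting, the same pattern (submit, monitor, retry, collect) would typically be delegated to a dedicated workflow engine that can also add or remove workers as the load changes; the sketch leaves scaling out entirely.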

Result

Before

  • Unstructured WGS datasets were stored in an on-premises environment, where analytics speed became a critical bottleneck as data volumes grew
  • Inability to analyze host susceptibility, clinical outcomes, disease severity, and host-virus interactions based on the WGS data

After

  • A secure cloud platform tailored for parallel data processing, running in production and supporting hundreds of simultaneous analytical tasks
  • A user-friendly interface for fast analytical queries