AWS launches Amazon Omics for precision medicine
By Andrea Fox
To enhance clinical insights at the point of care and help identify the best treatment or prevention options for patients, Amazon Web Services has launched a service that utilizes artificial intelligence (AI), machine learning and other AWS and partner products and services to run IT-heavy bioinformatics workflows.
WHY IT MATTERS
Clinicians can query thousands of variants across many genes at once to understand how genomic variation, joined with corresponding clinical data, may affect human health or predict clinical outcomes, according to the AWS announcement.
Tehsin Syed, general manager of health AI, and Dr. Taha Kass-Hout, vice president of machine learning and chief medical officer at AWS, say in their blog post that the size, rapid accumulation, complexity and heterogeneity of data challenge the existing computational tools used in precision medicine and research.
AWS built Amazon Omics to support large-scale genomic analysis and collaborative research for two main reasons, the company says:
- Generating insights from genomic, transcriptomic and other omics data poses difficulties for existing tools and systems trying to manage these workflows, Syed and Kass-Hout explained.
- The level of data required for sequencing also presents privacy, security, data ownership, governance and fairness challenges for the healthcare and life sciences sectors, and the data must exist in a secure, compliant environment.
Amazon Omics users can reduce time spent on setting up and running complex Extract-Transform-Load pipelines by natively storing data in optimized, query-ready formats (like Apache Parquet) and APIs. They do not need to focus on provisioning the underlying infrastructure to operate their bioinformatics programs.
Customers can bring their existing workflows into the platform, which is compliant with HIPAA, GDPR and other data privacy regulations. Access control, logging and audit trails are also built in, according to AWS.
There are three components in Amazon Omics.
- Omics-Aware object storage is for raw sequence data.
- Omics Workflows is for processing raw sequence data at scale – either in Omics Storage or in S3.
- Omics Analytics is for operating analytics through query-ready variants (or mutations) and annotations.
Omics users can combine data with other publicly available reference datasets in the Registry of Open Data on AWS, including the 1000 Genomes Project, which is used as a control to understand disease risk; the Genome Aggregation Database (gnomAD), which combines disease frequency data to improve disease detection; and more than 60 other genomic datasets.
Users want to search the clinical significance of DNA sequencing analyses – raw genomic variants in the form of Variant Call Format (VCF) files – where results, like a faulty gene that produces a cancer-causing protein, can be found.
With Amazon Omics, they can "import their VCF into a Variant Store and seamlessly transform them into a query-ready schema that is available as an Apache Iceberg Table. It also supports the import of variant annotations into an Annotation Store. Customers can govern access through AWS Lake Formation and apply fine-grained access control to filter out individual patients.
"This helps define custom patient cohorts and manage patient consent for compliance regimes, like GDPR, without having to copy the data. This also enables query and analysis of these variants using Amazon Athena and to join data from other modalities, such as clinical data in Amazon HealthLake or in the customer’s AWS Glue Data Catalog," said Syed and Kass-Hout.
Because the platform lives within the AWS ecosystem, partners like Lifebit and Ovation can access large genomics data stores faster, accelerate their work and innovate on biomedical data solutions.
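For readers unfamiliar with the VCF files described above, the following is a minimal Python sketch of what a single variant record contains. It is not Amazon Omics code and does not use any AWS API. The helper name `parse_vcf_line` and the sample record are hypothetical, chosen only to illustrate the eight fixed, tab-delimited columns that the VCF format defines.

```python
# Minimal sketch: parse one VCF data line into named fields.
# Assumption: the helper name and the sample record below are hypothetical
# and exist only for illustration; they are not part of Amazon Omics.

# The eight fixed columns of a VCF data line, in order.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line):
    """Split a tab-delimited VCF data line into its eight fixed columns."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))

    # POS is a 1-based integer position on the chromosome.
    record["POS"] = int(record["POS"])

    # ALT may list several comma-separated alternate alleles.
    record["ALT"] = record["ALT"].split(",")

    # INFO holds semicolon-separated entries: key=value pairs or bare flags.
    info = {}
    if record["INFO"] != ".":
        for item in record["INFO"].split(";"):
            key, _, value = item.partition("=")
            info[key] = value if value else True
    record["INFO"] = info
    return record


# Hypothetical record: chromosome 1, position 12345, reference A, alternate G.
sample_line = "1\t12345\t.\tA\tG\t50\tPASS\tDP=100;DB"
variant = parse_vcf_line(sample_line)
print(variant["CHROM"], variant["POS"], variant["REF"], variant["ALT"], variant["INFO"])
```

A record in this shape is the kind of row that a query-ready variant table would expose, although the actual schema used by the service is not described in the announcement.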
THE LARGER TREND
Relevant genomic inferences are hard to make when genomic data is inaccessible or not in readily usable formats, such as those that electronic health records have typically been able to surface.
Accessing troves of genomic data all in one place is what enables machine learning to make predictions about what the best therapy is for an individual patient, said John Quackenbush, chair of the Department of Biostatistics at the Harvard T.H. Chan School of Public Health, professor of computational biology and bioinformatics at Channing Division of Network Medicine and the Dana-Farber Cancer Institute and a former fellow on the Human Genome Project.
"We need to know something about the health and health status of each individual whose genome is sequenced if we ever want to get to the point where we can draw meaningful conclusions," he told Healthcare IT News ahead of a HIMSS keynote address on big data and analytics.
Since that time, investments in precision medicine have focused on technologies that can deliver computational power and decision support, including by EHRs.
Last year, Epic announced a collaboration with the Boston-based genomics profiling company Foundation Medicine to order, receive and view results within existing EHR workflows, and Allscripts purchased 2bPrecise.
In July, Joel Diamond, CMO of 2bPrecise, said that precision medicine has moved out of its niche, with genomics fast becoming the standard of care, such as in cancer treatment.
ON THE RECORD
"At Children's Hospital of Philadelphia, we know that getting a comprehensive view of our patients is crucial to delivering the best possible care, based on the most innovative research. Combining multiple clinical modalities is foundational to achieving this. With Amazon Omics, we can expand our understanding of our patients' health, all the way down to their DNA," Jeff Pennington, MSCS, associate vice president and chief research informatics officer at The Children's Hospital of Philadelphia, said in the announcement.
Andrea Fox is senior editor of Healthcare IT News.
Email: afox@himss.org
Healthcare IT News is a HIMSS publication.