title:: How to build a global, scalable, low-latency, and secure machine learning medical imaging analysis platform on AWS

publisher:: Amazon Web Services (AWS)

people:: Bram van Ginneken, Razvan Ionasec, James Meakin, Pablo Nuñez Pölcher

organization:: Diagnostic Image Analysis Group (DIAG), Radboud University Medical Center, Amazon Web Services (AWS), Mevis Medical Solutions

domain:: Healthcare, Machine Learning, Medical Imaging, Cloud Computing

link:: https://aws.amazon.com/blogs/industries/how-to-build-a-global-scalable-low-latency-and-secure-machine-learning-medical-imaging-analysis-platform-on-aws/

publications

Summary

This article describes how the Diagnostic Image Analysis Group (DIAG) at Radboud University Medical Center migrated their grand-challenge.org platform to AWS to enable global collaboration on machine learning solutions for medical imaging. Key capabilities enabled by AWS include:

Sharing medical imaging data globally via Amazon S3 with data validation and conversion tools
Low-latency, scalable web-based medical image viewer using Amazon EC2, Docker containers, and Amazon CloudFront
Secure, cost-effective deployment and distribution of machine learning models via Docker, Amazon SQS, and GPU instances
Rapid migration of existing data and compute to the cloud using AWS services

The architecture leverages AWS services like Amazon S3, Amazon CloudFront, Amazon EC2, Amazon SQS, and enables capabilities like collaborative data annotation, model sharing, on-demand scaling, and browser-based viewing across the globe.

Data Points

grand-challenge.org is an open platform for machine learning challenges in biomedical imaging
Hosted 45,000+ researchers collaborating on novel ML solutions
Migrated from on-premises to AWS for scalability, global access during COVID-19 pandemic
Used Amazon S3 for storing imaging data, allowing uploads from global sites
Developed low-latency viewer using Docker containers on Amazon EC2 across regions
Amazon CloudFront enabled fast data distribution to viewer instances globally
Researchers can share ML models as Docker containers, executed via Celery/Amazon SQS
GPU instances enabled for ML workloads, could scale on-demand
Rapidly migrated by syncing data to S3, moving database to RDS, broker to SQS
Developing AWS Batch solution for secure, isolated algorithm execution by users
Applied to CT scoring for COVID-19 diagnosis, achieved high accuracy (ROC 0.95)

Click here to return to Amazon Web Services homepage

About AWS Contact Us Support English My Account

Create an AWS Account

Products Solutions Pricing Documentation Learn Partner Network AWS Marketplace Customer Enablement Events Explore More

عربي Bahasa Indonesia Deutsch English Español Français Italiano Português

Tiếng Việt Türkçe Ρусский ไทย 日本語 한국어 中文 (简体) 中文 (繁體)

My Profile Sign out of AWS Builder ID AWS Management Console Account Settings Billing & Cost Management Security Credentials AWS Personal Health Dashboard

Support Center Expert Help Knowledge Center AWS Support Overview AWS re:Post

Click here to return to Amazon Web Services homepage

Get Started for Free

Products Solutions Pricing Introduction to AWS Getting Started Documentation Training and Certification Developer Center Customer Success Partner Network AWS Marketplace Support AWS re:Post Log into Console Download the Mobile App

AWS Blog Home Blogs Editions

Architecture AWS Cloud Operations & Migrations AWS for Games AWS Insights AWS Marketplace AWS News AWS Partner Network AWS Smart Business Big Data Business Intelligence Business Productivity Cloud Enterprise Strategy Cloud Financial Management Compute Contact Center Containers Database Desktop & Application Streaming Developer Tools DevOps Front-End Web & Mobile

HPC IBM and Red Hat Industries Integration & Automation Internet of Things Machine Learning Media Messaging & Targeting Microsoft Workloads on AWS .NET on AWS Networking & Content Delivery Open Source Public Sector Quantum Computing Robotics SAP Security Spatial Computing Startups Storage Supply Chain & Logistics Training & Certification

中国版日本版 한국 에디션 기술 블로그 Edisi Bahasa Indonesia AWS Thai Blog Édition Française Deutsche Edition Edição em Português Edición en Español Версия на русском Türkçe Sürüm

AWS for Industries

How to build a global, scalable, low-latency, and secure machine learning medical imaging analysis platform on AWS

Introduction It is hard to imagine the future for medical imaging without machine learning (ML) as its central innovation engine. Countless researchers, developers, start-ups, and larger enterprises are engaged in building, training, and deploying machine learning solutions for medical imaging that are posed to transform today’s medical workflows and the future value of imaging in diagnosis and treatment. To reach scientific breakthroughs, researchers need first to overcome several obstacles when training and deploying machine learning models. First, they must access large volumes of data stored in disjointed registries that are located in different parts of the world. Second, they need to deploy standardized tools globally to generate ground truth on reference datasets. Finally, they need to configure a secure and cost-effective environment to allow for collaboration between research groups. That is why Diagnostic Image Analysis Group (DIAG) at the Radboud University Medical Center in Nijmegen, The Netherlands, turned to AWS to migrate their grand-challenge.org open-source platform from their on-premises data center to the cloud. Grand-challenge.org was established in 2012 for the organization of machine learning challenges in biomedical image analysis, and today brings together 45,000+ registered researchers and clinicians from all over the world to collaborate on creating novel ML solutions in the field. When in early March 2020 it was hypothesized that CT imaging could play an important role in the diagnosis and assessment of COVID-19, the Dutch Radiological Society rapidly proposed a standardized assessment scheme for CT scans of patients with suspected COVID-19 called CO-RADS. And radiologists turned to the grand-challenge.org platform to collect imaging data and to use the platform’s browser-based viewing system for CT scans to assess the performance of the CO-RADS model, which achieved a high discriminatory power for diagnosing COVID-19 from a CT scan alone (ROC 0.91, 95% CI, 0.85-0.97, for positive RT-PCR results)(1) On the platform, DIAG has made the COVID-19 dataset, the training course to teach radiologists how to assess a scan using CO-RADS, the exam and the ML model available to all registered users. However, grand-challenge.org was running in an on-premises data center, and the experience of the radiologists using the course outside of Europe was poor due to latency of the server-side rendered viewing systems, and the number of scans DIAG could process with our AI tools was limited by the amount of hardware we provisioned before the emergence of SARS-Cov-2. In April 2020, the collaboration between DIAG and AWS started to bring globally distributed browser-based viewing systems and elastic scaling to make these tools available to machine learning and clinical researchers worldwide. Through a successful collaboration between DIAG’s Research Software Engineering team and AWS the grand challenge platform was able to be migrated to the cloud in less than two weeks from the start of the project. Several technical hurdles were overcome, resulting in a more robust, performant, and scalable application that will continue to support the medical imaging community during this pandemic and beyond. This work presents the architecture and services used for the global medical imaging analysis platform and explains the challenges, solutions, and results obtained including 1) exchange data with the global research community, 2) low-latency and scalable web-based viewer, 3) secure and cost-effective deployment & distribution of machine learning models, and 4) Rapid migration to cloud of data and compute. Exchange data with the global research community Developing robust machine learning solutions to problems in biomedical imaging requires access to large amounts of annotated training data. The volume of data generated by medical instruments such as MRI and CT scanners, next-generation sequencers, and digital pathology machines steadily increases as sensors become more accurate and systems more sophisticated in characterizing physiology. The massive data generated is locked in siloed databases and proprietary formats. The exchange of data and collaboration on research projects beyond the boundaries of an institution remains a challenge from a technical as well as compliance and security perspective. On grand-challenge.org DIAG has added functionality so that researchers can set up archives to easily share data with each other, apply algorithms to that data, and set up their own reader studies to invite experts to annotate the data. In medical imaging, shipping HDDs across sites is the norm, but AWS enabled the use of direct upload to Amazon Simple Storage Service (Amazon S3) with accelerated transfers to gather data from sites globally. Users are able to upload data in a variety of medical imaging formats including DICOM and a variety of whole slide image formats. These data are automatically validated and converted to MetaImage or TIFF as this is much easier for the machine learning researchers to work with. Amazon S3 is ued to store all of the imaging data on the grand-challenge.org platform. Now, DIAG does not need to worry about scaling the storage after the increased data influx from scans of patients with suspected COVID-19. To allow for fast access to the data we use Amazon CloudFront and easily integrate URL signing with the Django backend so that users are only able to download the files for the images that they have permission to view.

Low-latency and scalable web-based viewer Today, most viewing and processing of medical imaging data in both clinical and research environments happens on-premises on dedicated workstations capable of server-side rendering necessary for routine manipulations such as MIP (maximum intensity projection) viewing or 3D volumetric rendering. With the increased collaboration between radiologists from many institutions spread across the world as well as the rise in secondary usage of medical imaging for research and development of ML solutions, there is a need for globally available solutions. This very challenge was faced recently by Radboud University Medical Center in Nijmegen as they received great interest from radiologists worldwide for their CO-RADS Academy solution that teaches physicians how to read COVID-19 CT images. The Diagnostic Image Analysis Group (DIAG) developed a web-based medical imaging viewer called CIRRUS, which is built on MeVisLab from MeVis Medical Solutions. CIRRUS enables the use of many tools that radiologists require for interacting with medical imaging data. Server-side rendering is used for rapid loading of the medical imaging data and allows the use of powerful rendering hardware for 3D multiplanar reformation, pre-loading of series in memory and GPU acceleration. The rendered scenes are streamed to the client over a WebSocket connection to a VueJS single page application to also gain the strengths of client-side interactions where necessary. These workstations are deployed using Docker containers, and one container image is launched per user with the users being routed to their container instance with Traefik. In this project, DIAG was able to set up rendering servers on AWS in Europe, Japan, and North America on Amazon Elastic Compute Cloud (Amazon EC2). To start the container on-demand for a new user, it takes less than 30sec and we are able to horizontally scale the compute pool by adding additional EC2 instances in each Region. The medical imaging data are stored in an Amazon S3 bucket in Europe. To ensure rapid loading times in North America and the Asia Pacific Regions we used Amazon CloudFront to cache the data on demand. The loading performance for a typical 300MB CT, 500 slices CT studies is less than 10 seconds. With a latency of 20ms, there is no observer delay in scrolling, which provides a great user experience.

Secure and cost-effective deployment & distribution of machine learning models Researchers need to have the freedom to use whatever tool or library is most appropriate for their use case, and often find it difficult to distribute their models to the rest of the research community. On grand-challenge.org this gap is bridged by allowing researchers to upload their developed model and pre-processing pipeline as a Docker container image, where they can manage the users who can access the algorithm. This allows researchers to share their algorithms with the community, where the platform will handle authentication, authorization, data access, validation and conversion of DICOM to MetaImages, and execution of the containers on the data with GPU acceleration. The grand-challenge.org platform uses Celery to schedule the jobs for these containers and medical images, providing GPU acceleration where needed with NVIDIA T4 cards. DIAG was able to reduce the number of services it manages by using Amazon Simple Queue Service (Amazon SQS) as the message broker and can now horizontally scale by adding workers that listen to each queue. It is also able to run across its existing provisioned hardware and during periods of increased demand start extra g4dn EC2 instances. This has enabled researchers to rapidly deploy a model for automated scoring of CT scans with CO-RADS and is available at https://grand-challenge.org/algorithms/corads-ai/. Users are able to upload their own data and receive a prediction on this data within 2 minutes, and then inspect the results in the globally available browser-based workstations.

Image courtesy of Radboud University Medical Center in Nijmegen Rapid migration to cloud of data and compute DIAG had previously been running grand-challenge.org in its on-premises data center, and did not foresee the scale at which it would need to operate during this pandemic. However, when developing the application DIAG had the cloud in mind, and tried to ensure that workloads were mobile. The team at DIAG uses the 12 Factor App methodology, has robust CI/CD pipelines that distribute the application as a set of Docker images, provision bare metal and VM instances with Ansible, and used Minio to abstract the on-site storage with the Amazon S3 API. Using AWS services, DIAG was able to rapidly move this workload to the cloud. Several terabytes of imaging data were synced in place using Amazon S3 sync, so that switching over the storage backend was a case of changing environment variables. The team was also able to move the database to a managed Postgres RDS instance and the Celery broker from Redis to Amazon SQS to reduce the ops burden. Moving this workload to AWS has allowed for scale on-demand, based on the unpredictable demand during this pandemic. Research results and future work Recently, the group at Radboud University Medical Center together with numerous collaborators published the results of the CORADS-AI, a system that consists of three deep learning algorithms that automatically segment the five pulmonary lobes, assign a CO-RADS score for the suspicion of COVID-19 and assign a CT severity score for the degree of parenchymal involvement per lobe.(2) The system was tested on 105 patients (62 ± 16 years, 61 men) and 262 patients (64 ± 16 years, 154 men) internal and the external cohorts, respectively. The system discriminated between COVID-19 positive and negative patients with areas under the ROC curve of 0.95 (95% CI: 0.91-0.98) and 0.88 (95% CI: 0.84-0.93). CORADS-AI has been deployed on the AWS platform and it is now available for other researchers. One of the future goals of grand-challenge.org is to allow users to submit their custom algorithms to run in GPU accelerated hardware in the cloud. This code needs to run in an isolated, secure environment, and users should preferably be able to either use Docker images provided by grand-challenge.org, or their own images. A solution is currently being developed that makes use of AWS Batch as a job scheduler. A web application will let the users interact with a fully serverless backend built on top of AWS Lambda, Amazon DynamoDB, and Amazon API Gateway, to enable them to submit their jobs and manage their results. Amazon ECR will store container images. A CI/CD pipeline built on top of the AWS CodePipeline service will follow, that will allow users to submit their Dockerfile to have them automatically built and stored in Amazon ECR.

Bram van Ginneken Bram van Ginneken PhD, is Professor of Medical Image Analysis at Radboud University Medical Center and chairs the Diagnostic Image Analysis Group. He also works for Fraunhofer MEVIS in Bremen, Germany, and is a founder of Thirona, a company that develops software and provides services for medical image analysis. He studied Physics at Eindhoven University of Technology and Utrecht University. In 2001, he obtained his Ph.D. at the Image Sciences Institute on Computer-Aided Diagnosis in Chest Radiography. He has (co-)authored over 200 publications in international journals. He is a member of the Fleischner Society and of the Editorial Board of Medical Image Analysis. He pioneered the concept of challenges in medical image analysis.

Razvan Ionasec Razvan Ionasec, PhD, MBA, is the technical leader for healthcare at Amazon Web Services in Europe, Middle East, and Africa. His work focuses on helping healthcare customers solve business problems by leveraging technology. Previously, Razvan was the global head of artificial intelligence products at Siemens Healthineers in charge of AI-Rad Companion, the family of AI-powered and cloud-based digital health solutions for imaging. He holds 30+ patents in AI/ML for medical imaging and has published 70+ international peer-reviewed technical and clinical publications on computer vision, computational modelling, and medical image analysis. Razvan received his PhD in Computer Science from the Technical University Munich and MBA from University of Cambridge, Judge Business School.

James Meakin James Meakin, PhD, leads the Research Software Engineering team at the RadboudUMC Technology Center for Deep Learning where we develop software to accelerate the translation of imaging research to the clinic. James is also a core team member of NL-RSE to promote RSE careers and practices in the Netherlands. Before joining RadboudUMC, James obtained a DPhil (PhD) in Medical Imaging from the University of Oxford and worked at Philips Healthcare.

Nuñez Pölcher Pablo Nuñez Pölcher, MSc, is a Solutions Architect working for the Public Sector team at Amazon Web Services, based in Madrid. Pablo focuses on helping public sector customers build new, innovative products on AWS in accordance with best practices. Pablo received his M.Sc. in Biological Sciences from Universidad de Buenos Aires.

Comments

View Comments

Resources

AWS for Industry AWS Events AWS Training & Certification AWS Whitepapers AWS Compliance Reports

Twitter Facebook LinkedIn Twitch Email Updates

Learn About AWS

What Is AWS? What Is Cloud Computing? AWS Accessibility AWS Inclusion, Diversity & Equity What Is DevOps? What Is a Container? What Is a Data Lake? What is Generative AI? AWS Cloud Security What’s New Blogs Press Releases

Resources for AWS

Getting Started Training and Certification AWS Solutions Library Architecture Center Product and Technical FAQs Analyst Reports AWS Partners

Developers on AWS

Developer Center SDKs & Tools .NET on AWS Python on AWS Java on AWS PHP on AWS JavaScript on AWS

Help

Create an AWS Account

Amazon is an Equal Opportunity Employer: Minority / Women / Disability / Veteran / Gender Identity / Sexual Orientation / Age.

Language عربي Bahasa Indonesia Deutsch English Español Français Italiano Português Tiếng Việt Türkçe Ρусский ไทย 日本語 한국어 中文 (简体) 中文 (繁體)

Digital Brain

How to build a global, scalable, low-latency, and secure machine learning medical imaging analysis platform on AWS

Table of Contents

Summary

Data Points

Recent Notes

AWS for Health Day Zurich - 2025 - Zurich, Sep 30th, 2025

automatica 2025 - Munich, June 25th, 2025

AWS GenAI for Health Summit 2025 - Lisbon, June 18th, 2025

DMEA - Beyond Proof of Concept- Realizing AI's Transformational Potential in Healthcare - Berlin, April 9th, 2025

About