AI in Imaging: A New Focus on Achieving Precision Medicine More Effectively and Less Invasively

Contributed Commentary by Esteban Rubens

October 24, 2019 | Precision medicine has long been an elusive, yet extraordinarily important, quest for the healthcare community. We are, at long last, making steady and measurable progress. In cancer care, it’s becoming the norm to analyze tumors for known gene mutations or expression patterns in order to select the treatments likely to be most effective. Even so, applying personalized medicine, specifically in cancer care, can be expensive, invasive, and time-intensive, as tumors are analyzed through traditional pathology and genomics.

AI-enabled imaging presents a powerful opportunity to accelerate the identification and application of personalized treatments in ways that are often less invasive, faster, and potentially more cost effective.

In the last few years, we’ve started to see several promising applications of AI in imaging to support precision medicine initiatives. A study published in the April 2019 issue of the Journal of Neuro-Oncology shows strong potential for using machine-learning algorithms to identify multi-modal MRI patterns that rapidly and accurately predict genotypes and mutations in glioma, specifically isocitrate dehydrogenase (IDH) mutation and 1p19q codeletion status, both of which are good predictors of treatment efficacy. The authors write:

“Although biopsies can be performed at relatively low-risk, an approach using MRI imaging to predict IDH and 1p19q genotype preoperatively is a less expensive and noninvasive alternative. The early identification of IDH and 1p19q status may benefit the prediction of patient’s prognosis and predictive of responsiveness to chemotherapy and radiation. Using machine-learning algorithms, high accuracy was achieved in the prediction of IDH genotype in gliomas and moderate accuracy in a three-group prediction including IDH genotype and 1p19q codeletion.”
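To make the approach concrete, here is a minimal sketch of the kind of pipeline such studies describe. It is not the authors’ code: it assumes radiomic features have already been extracted from the multi-modal MRI series into a hypothetical CSV file (glioma_radiomic_features.csv, with an idh_mutant label column), and it uses a generic scikit-learn classifier as a stand-in for whatever model a given study employs.

```python
# Illustrative sketch only: predicting IDH genotype from pre-extracted
# multi-modal MRI (radiomic) features. File name, column names, and the
# choice of classifier are all assumptions, not the study's pipeline.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# One row per tumor: numeric features derived from T1, T2, FLAIR, etc.,
# plus a binary label for IDH mutation status.
data = pd.read_csv("glioma_radiomic_features.csv")  # hypothetical file
X = data.drop(columns=["idh_mutant"])
y = data["idh_mutant"]

# A random forest is a common baseline for tabular radiomic features;
# cross-validation gives an honest estimate of predictive accuracy.
model = RandomForestClassifier(n_estimators=500, random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"Cross-validated AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```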

This is just one example, and the potential for others is limitless. But how can we, as the healthcare community, accelerate the use of AI in imaging to advance personalized medicine?

The key lies in recognizing the value of imaging data as an asset and democratizing it more fully across organizations and the broader healthcare ecosystem. This entails data sharing as well as local training of algorithms. The recipe for improved model inference accuracy, which will drive the expanded application of AI in precision medicine, combines vast amounts of annotated data with multiple training runs.

AI and algorithm training must not be relegated solely to large research institutions; healthcare organizations of diverse types and sizes can and must play a role if we are to move faster. These organizations hold tremendous troves of data specific to local populations, which we are learning is crucial to greater algorithm accuracy. It is increasingly clear that models trained elsewhere do not always perform well out of the box on data from vastly different populations; data from local patients and protocols is essential for fine-tuning.
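As a concrete illustration of what local training can look like, the sketch below uses PyTorch to fine-tune a model pretrained on external data against a local image collection. The dataset path, directory layout, and choice of backbone are placeholders; the pattern of freezing the general-purpose feature extractor and retraining the final layer on local patients is the point.

```python
# A sketch of local fine-tuning, assuming a hypothetical folder of
# de-identified local images organized one subdirectory per class.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
local_data = datasets.ImageFolder("local_imaging_dataset/", transform=tfm)
loader = torch.utils.data.DataLoader(local_data, batch_size=16, shuffle=True)

# Start from a model pretrained on a large external corpus, freeze its
# backbone, and retrain only the classification head on local data.
model = models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(local_data.classes))

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # a few passes often suffice with a frozen backbone
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```

Fully retraining a model on local data is rarely necessary; retraining only the head is cheap enough that even smaller organizations can adapt externally trained models to their own populations.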

In a December 2018 PLOS Medicine article, “Better Medicine Through Machine Learning: What's Real, and What's Artificial?”, the authors reinforce the notion of training models locally for the most precise results. They write: “‘Irrational extrapolation’—the assumption that algorithms trained on an easy-to-obtain set of patients or data will lead to accurate models that act in each patient’s best interest—must be stringently avoided until algorithms can correct for such biases and use clinical data to reason about disease severity and trajectory.”

Start With the Right Foundation

When beginning the journey to bring AI in imaging from the lab to the bedside and integrate it into existing workflows, organizations must start with their infrastructure, and with an understanding that what may have worked yesterday may not (and likely will not) work tomorrow.

Several factors are at play. As noted previously, local training of algorithms is essential to leveraging AI at scale in clinical settings, so healthcare organizations require an infrastructure that can support this training. When introducing AI models into clinical practice, organizations must integrate them into existing workflows without adding latency to those workflows or to clinical applications. Most organizations, however, encounter challenges on these fronts because they continue to operate in a highly siloed environment, with separate systems for clinical data (such as PACS and vendor-neutral archives) and for research initiatives. Moving data between these silos is slow, expensive, and tedious, and doing it again and again results in multiple copies of data, greater complexity, higher risk, elevated costs, and less accurate and usable data at the enterprise level.

To truly democratize AI, and to train models locally so as to avoid the irrational extrapolation described above, healthcare organizations need to start with a new foundation, one that focuses on data and places it at the center of everything: a data-centric architecture.

A data-centric architecture strategy, which consolidates islands and silos of data infrastructure and ultimately simplifies the data foundation, is defined by five key attributes:

  • Real-time: It enables finding the right insight at the right time to drive improved clinical and operational outcomes.
  • On-demand and self-driving: It prioritizes automation at its core and leverages machine learning to provide high levels of availability and proactive support. A data-centric architecture should be easy to provision and should evolve with your needs.
  • Exceptionally reliable and secure: This is a must, especially when it comes to critical patient data and protected health information.
  • Support for multi-cloud environments: It should easily allow storage volumes to be moved to and from the cloud, and between cloud providers, making application and data migration simple and enabling hybrid use cases for application development, deployment, and protection. Many organizations have become comfortable with the concept of a private cloud, whether within their datacenter or through remote hosting agreements, and some are moving workloads to the public cloud. A data-centric architecture should offer the flexibility to take advantage of the cloud when and how an organization chooses.
  • Constantly evolving and improving: Users should expect their IT infrastructure to get better continuously, without downtime, delivering more value every year for the same or lower cost, and healthcare organizations should expect the same of their storage. They must architect for constant improvement so that storage services can be upgraded seamlessly, without ever taking applications or users offline.

At the Heart of It All

A data-centric architecture requires a new type of data hub, one that allows organizations to consolidate all applications on a single storage platform and to unify and share data across the applications that need it for better insight. It must be designed to share and deliver data throughout an organization for modern analytics and AI, so that patients and clinicians can benefit from the insights the data hold; it cannot be merely a cold-data repository.

Because organizations are looking to preserve existing infrastructure investments and reduce risk, the hub should let them share their data across data teams and applications, taking the key strengths and unique features that suit each silo to its own tasks and integrating them into a single unified platform.

A data hub must have four qualities, all essential to unifying data: high throughput for both file and object; native scale-out; multi-dimensional performance; and a massively parallel architecture that mimics the structure of GPUs to deliver performance to tens of thousands of cores accessing billions of objects. A data hub may have other features, such as snapshots and replication, but if any of these four are missing from a storage platform, it is not a data hub. For example, if a storage system delivers high-throughput file access and is natively scale-out, but requires another system with S3 object support for cloud-native workloads, then the unification of data is broken and the velocity of data is crippled.

In healthcare, as in other industries, the ability to support objects is increasingly important: next-generation engineers are coding in a cloud world, where objects enable greater simplicity and flexibility. A data hub must provide high-performance object storage locally, so that organizations can bring these cloud-native workloads on-premises without compromising performance.
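To illustrate what unified file-and-object access means in practice, the sketch below reads the same imaging data twice: once as a file on a conventional mount, and once through an S3-compatible API pointed at an on-premises endpoint. Every path, endpoint, bucket, and credential here is hypothetical; the point is that both protocols reach one copy of the data on one platform, rather than a separate object silo.

```python
# Sketch of one dataset served over two protocols. Mount point, endpoint,
# bucket, and credentials are hypothetical; whether file and object views
# share one namespace depends on the storage platform in use.
import boto3

# File access: the study lands on an NFS/SMB mount from the PACS gateway.
with open("/mnt/imaging/study-123/series-1.dcm", "rb") as f:
    dicom_bytes = f.read()

# Object access: a cloud-native analytics or AI service reads the same
# data through an S3-compatible API served locally.
s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.hospital.example",  # local S3-compatible endpoint
    aws_access_key_id="LOCAL_KEY",
    aws_secret_access_key="LOCAL_SECRET",
)
obj = s3.get_object(Bucket="imaging", Key="study-123/series-1.dcm")
assert obj["Body"].read() == dicom_bytes  # one copy of data, two protocols
```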

A data-centric architecture, powered by a data hub that possesses all four of the necessary qualities, is integral for healthcare organizations, large and small, that are looking to optimize the use of AI-enabled imaging in personalized medicine. This architecture ensures that the data at the heart of this opportunity is truly democratized, enabling the effective local training that improves algorithm accuracy. While AI-enabled imaging will be a driving force behind precision medicine that is less invasive and potentially more cost effective, the data is the engine that will continue to power these advances, and the proper architecture is crucial to keeping that engine running.

Esteban Rubens serves as the Global Principal for Enterprise Imaging at Pure Storage, where he is responsible for Pure’s solutions, strategy, market development, and thought leadership in that area of healthcare. Esteban has 20 years of experience in the storage industry, including more than 13 years in the healthcare technology sector, with prior roles at FUJIFILM Medical Systems, Sandial Systems, Platypus Technology, and other storage companies. He can be reached at esteban@purestorage.com.