ThoughtSphere Builds A Data Lake For Clinical Trials

By Allison Proffitt 

August 2, 2017 | Our old ways of connecting to data are dead, Sudeep Pattnaik told me over lunch in Miami last January. We need a new way of connecting.

The key question when managing data volume is “How do I act in a timely way?” Pattnaik explained when we met at the Summit for Clinical Ops Executives (SCOPE). It’s a universal question, but Pattnaik and another Quintiles alumnus, Pankaj Manon, wanted to create a solution specifically for clinical research.

The two co-founded ThoughtSphere. Pattnaik serves as CEO; Manon is CTO. They formally launched the company about a year ago.

At its heart, ThoughtSphere is a data lake, an architecture meant to collect vast pools of data from varied sources, and it’s licensed as software-as-a-service. It’s a data management model favored by companies like Google and Facebook. ThoughtSphere, though, is focusing on clinical research data.

Pattnaik has years of experience working with clinical data, and he saw plenty of pharmas and CROs using different vendors for different parts of the pipeline. The volume and variety of clinical study data, both structured and unstructured, that sponsors, sites, and CROs need to manage make a data lake an ideal model, Pattnaik believes.

“The biggest challenge is when you spend lots of money to keep up to date with everything that’s going on,” he said. “For example, all the data entry systems—EDC, CTMS, etc.—there are various vendors involved. The Medidatas of the world, Oracles of the world, etc. None of [their data outputs] play nice.”

ThoughtSphere supports source-system agnostic data ingest, obviating cumbersome programming for each eClinical system. Sponsors and CROs can get the benefits of accessing all clinical data regardless of its format, including CTMS, lab, imaging, and safety data in near real-time.

ThoughtSphere doesn’t transform the data when it comes into the lake; transformation happens for specific use cases and according to usage patterns. The ThoughtSphere deep learning engine watches how users use their data and adjusts accordingly. “We let customers define their own transformation,” Pattnaik said.

“While ingesting the data… we do not have [a] pre-defined way to connect things,” he said. “You know the way people used to connect through data warehouses? Where you build a large, pre-defined structure then bring the data from various sources?... We do not do that in the data lake technology. We bring data as-is. We do not connect it.”

Instead, ThoughtSphere’s mapping algorithm engine runs on top of the data lake, recording how users are using the data, and then making predictions for new incoming data. Ultimately, what everyone is after is to have a standardized structure so they do not have to do study-specific analytics and visualization, Pattnaik argues. For that you need a standardized dataset.

“What our engine does is convert the data on the fly to a standardized dataset,” Pattnaik explains. “As a customer, you can define your own standard; you can convert the data into that.” It doesn’t take long for the ThoughtSphere system to start doing most of the heavy lifting for you, Pattnaik said. “The more you use the system, after the third or fourth study in the system, almost 80-90% of things come automatically mapped and standardized.”
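The idea of learning mappings from usage can be illustrated with a short Python sketch. This is not ThoughtSphere's engine, whose internals are not described here; it is a minimal stand-in that assumes a simple majority vote over column names users have already mapped. The `MappingMemory` class, the `normalize` helper, and the field names `SUBJECT` and `VISIT_DATE` are all hypothetical.

```python
from collections import Counter, defaultdict


def normalize(name):
    """Reduce a raw column name to a comparable key, e.g. 'Subject ID' -> 'subjectid'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


class MappingMemory:
    """Hypothetical helper that remembers user-confirmed column mappings."""

    def __init__(self):
        # normalized source column -> votes for each standard field
        self._votes = defaultdict(Counter)

    def record(self, source_column, standard_field):
        """Store one mapping that a user confirmed in an earlier study."""
        self._votes[normalize(source_column)][standard_field] += 1

    def suggest(self, source_column):
        """Return the most frequently chosen standard field, or None if unseen."""
        votes = self._votes.get(normalize(source_column))
        if not votes:
            return None
        return votes.most_common(1)[0][0]

    def standardize(self, record):
        """Split a raw record into auto-mapped fields and fields needing review."""
        mapped, unmapped = {}, {}
        for column, value in record.items():
            field = self.suggest(column)
            if field is None:
                unmapped[column] = value
            else:
                mapped[field] = value
        return mapped, unmapped


# Mappings confirmed by users in earlier studies (illustrative names only).
memory = MappingMemory()
memory.record("Subject ID", "SUBJECT")
memory.record("subject_id", "SUBJECT")
memory.record("Visit Dt", "VISIT_DATE")

# A record from a new study, stored as-is and mapped only when it is read.
raw = {"SUBJECT-ID": "1001", "visit_dt": "2017-08-02", "Hgb": 13.2}
mapped, unmapped = memory.standardize(raw)
print(mapped)    # fields mapped automatically from past usage
print(unmapped)  # fields a user would still have to map by hand
```

In this toy version, the share of columns that land in `mapped` rather than `unmapped` grows as more studies are recorded, which is the effect Pattnaik describes. A production engine would presumably use far richer signals than column names.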

Product Suite

Three products currently make up the ThoughtSphere suite. ClinDAP is the underlying platform. It can be independently deployed in enterprise fashion for data visualization. ClinDAP’s data visualization tools can be used by sponsors, sites, and CROs. Pulling from all of the data in the lake, ClinDAP gives sponsors insight into how their CROs are performing, and it empowers sites and CROs to track their milestones without any data aggregation steps.

ClinACT is ThoughtSphere’s risk-based monitoring tool that supports RBM analytics and visualizations. The tool has two modules: Analytics only or the full-function Risk module. The full tool helps with planning, monitoring strategy, source data verification and review, and assignment of clinical activities. Users can adjust key risk indicators during a study, set threshold-based alerts based on those risk indicators, and integrate subject data cleanliness indicators and holistic subject review.
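Threshold-based alerting on key risk indicators can be sketched in a few lines of Python. The sketch below is an assumption-laden illustration, not ClinACT's implementation: the `Threshold` class, the `evaluate` function, the indicator names, and the cutoff values are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """Hypothetical alert rule for one key risk indicator (KRI)."""
    indicator: str
    warn: float      # values at or above this raise a warning
    critical: float  # values at or above this raise a critical alert


def evaluate(site_metrics, thresholds):
    """Return (site, indicator, level) for every KRI that crosses a threshold."""
    alerts = []
    for site, metrics in sorted(site_metrics.items()):
        for rule in thresholds:
            value = metrics.get(rule.indicator)
            if value is None:
                continue  # this site did not report the indicator
            if value >= rule.critical:
                alerts.append((site, rule.indicator, "critical"))
            elif value >= rule.warn:
                alerts.append((site, rule.indicator, "warn"))
    return alerts


# Illustrative rules; a study team could change these values mid-study.
thresholds = [
    Threshold("query_rate", warn=0.10, critical=0.20),
    Threshold("overdue_visits", warn=3, critical=6),
]

# Illustrative per-site metrics, not real study data.
site_metrics = {
    "site-01": {"query_rate": 0.05, "overdue_visits": 1},
    "site-02": {"query_rate": 0.25, "overdue_visits": 4},
}

alerts = evaluate(site_metrics, thresholds)
for site, indicator, level in alerts:
    print(f"{site}: {indicator} is {level}")
```

Keeping the rules as plain data rather than hard-coded logic is what would let a team adjust indicators during a study without touching the evaluation code.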

ThoughtSphere has recently added trigger functionality to both ClinDAP and ClinACT, alerting users to medical issues in lab data or to study-specific risk triggers that users set. It has also added notification enhancements for central monitoring to speed communication, query resolution, and issue management between CRA and site.

SPACE is the company’s Site Payment And Contracting Environment, its financial and trial life cycle management tool. SPACE handles site payment, contracting, and budgets. For payments, data originates with the sites and flows into ClinACT. Sponsors define payment milestones, and sites that submit data of the highest quality get near real-time payment, Pattnaik said.

Sponsors can launch a ThoughtSphere product by study or across the organization’s portfolio. An advanced visualization tool is in the works; Pattnaik says a platform-agnostic visualization and analytics tool should be available in the fall.

ThoughtSphere has announced a few customers publicly and is working with several others, Pattnaik said. “Customers are seeing the benefits!”