Bridging Research and Clinical Bioinformatics

Research bioinformatics needs tweaks for the clinic.

February 9, 2011 | Guest Commentary | It will soon be cheaper to sequence a patient’s entire genome and use software filters to implement genetic testing than to conduct multiple separate tests for specific genes. Notwithstanding unresolved regulatory, legal, reimbursement, physician training, and privacy issues related to whole-genome testing, there remain big bioinformatics issues as well.

The FDA’s recent crackdown on personal genetic testing services is a case in point: if genome-wide genotyping results from different labs conflict for the same patient, that could undermine trust in whole-genome testing and hinder adoption. Getting this right includes solving a number of bioinformatics issues. For their part, bioinformatics providers offer tools to help researchers make discoveries. Since the research market for whole-genome analysis is still forming, it’s easy to see why the consumer portals are under fire: more research is needed.

Leading research hospitals and clinical testing labs are actively implementing workflows to evaluate how and when to deploy whole-genome tests for their patients. But how do the needs of clinical bioinformatics differ from those of research bioinformatics? And will the clinical testing market adopt existing research platforms or develop its own solutions?

Research vs. Clinical Needs

A researcher is dedicated to discovery, whereas a clinician needs efficient and repeatable processes. As a result, some of the assets of bioinformatics tools for researchers may be drawbacks in the clinical setting. While the basic next-generation re-sequencing data processing pipeline (align reads to a reference, call variations, annotate, and produce reports for interpretation) might look the same, there are subtle but important differences. For example, a software tool for processing a whole genome that does not give researchers the flexibility to tune its algorithms would be seen as a black box and find little market share. But that same inflexibility might find favor in clinical use because it lets the software conform to established guidelines.

There are presently no established guidelines for whole genome re-sequencing. Labs are left to develop their own methods to decide when a variation is real, or what coverage is needed to distinguish different alleles. This leaves the possibility that different labs could call variations differently for the same patient. The problem is compounded when you consider different instrument platforms, aligners, variant callers, and databases of reference annotation.
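To make the risk concrete, here is a minimal sketch in Python of how two labs could reach different answers from identical reads. The `CallingPolicy` class, the threshold values, and the read counts are all hypothetical, chosen only for illustration; real variant callers use far richer statistical models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallingPolicy:
    """Hypothetical per-lab thresholds; the values below are illustrative only."""
    min_depth: int            # minimum number of reads covering the site
    min_alt_fraction: float   # minimum fraction of reads supporting the variant


def call_site(ref_reads: int, alt_reads: int, policy: CallingPolicy) -> str:
    """Return 'no-call', 'reference', or 'variant' for one site under one policy."""
    depth = ref_reads + alt_reads
    # Too little coverage (or none at all) means the lab declines to call the site.
    if depth == 0 or depth < policy.min_depth:
        return "no-call"
    if alt_reads / depth >= policy.min_alt_fraction:
        return "variant"
    return "reference"


# Two labs with different, equally defensible, in-house rules (assumed values).
lab_a = CallingPolicy(min_depth=10, min_alt_fraction=0.20)
lab_b = CallingPolicy(min_depth=20, min_alt_fraction=0.30)

# Same patient, same site: 18 reads in total, 5 of them supporting the variant.
result_a = call_site(ref_reads=13, alt_reads=5, policy=lab_a)
result_b = call_site(ref_reads=13, alt_reads=5, policy=lab_b)
print(result_a, result_b)
```

Under these assumed thresholds, the first lab reports a variant while the second declines to call the site at all, even though neither has made an error by its own rules.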

Another looming issue is the lack of a consistent, sanctioned database of variant annotations. Again, the differences between research and clinical requirements are subtle. Researchers extend the annotation of variations with any and all sources, regardless of quality, in hopes of finding the “needle in the haystack” causative variant. By contrast, clinical use should report only “what is known to be causative” so that reports can be interpreted by doctors and geneticists.

Solving the problem is complicated because there are many sources of variant annotation: commercial databases such as HGMD and SNPedia; OMIM, which has many well-annotated entries; the locus-specific databases (LSDBs) maintained on the Web by the Human Genome Variation Society; the stable Locus Reference Genomic (LRG) sequences that NCBI and EBI are collaborating to provide, along with their annotation of dbSNP entries from known clinical sources (ClinVar); PharmGKB, a popular research database of drug-gene interactions; and the private archives of variant annotation maintained by research institutes and clinical testing labs. Without a single variant annotation source, labs are duplicating IT efforts and increasing the potential for inconsistent annotation of patient samples.
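One way a lab might surface such inconsistencies is to pool what each source says about a variant and flag disagreements before anything reaches a report. The Python sketch below assumes a simplified record layout; the variant identifiers, source names, and labels are placeholders, not entries from any real database.

```python
from collections import defaultdict

# Hypothetical records: (variant id, annotation source, reported classification).
records = [
    ("var-001", "source_a", "pathogenic"),
    ("var-001", "source_b", "pathogenic"),
    ("var-002", "source_a", "pathogenic"),
    ("var-002", "source_c", "benign"),
    ("var-003", "source_b", "benign"),
]


def find_conflicts(rows):
    """Return {variant: {source: label}} for variants whose sources disagree."""
    by_variant = defaultdict(dict)
    for variant_id, source, label in rows:
        by_variant[variant_id][source] = label
    return {
        variant_id: labels
        for variant_id, labels in by_variant.items()
        if len(set(labels.values())) > 1  # more than one distinct label
    }


conflicts = find_conflicts(records)
print(conflicts)
```

Here only the second placeholder variant is flagged, because two sources classify it differently. Deciding which source to trust is exactly the question that a sanctioned annotation database would settle.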

Reproducibility in the Clinic

Knowledge about a variation changes rapidly with new research. Research and clinical labs need a way to update and re-annotate genomes with new findings from the research world. Some have suggested a clearinghouse to share information on novel variations. As you might expect, labs are uninterested in sharing details of patient traits, even in a de-identified manner.
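Re-annotation can be pictured as comparing a patient’s stored variants against successive snapshots of an annotation knowledge base. The Python sketch below assumes each snapshot is a simple mapping from variant identifier to classification; the identifiers, labels, and the `changed_annotations` helper are hypothetical.

```python
def changed_annotations(patient_variants, old_kb, new_kb):
    """Return {variant: (old_label, new_label)} for variants whose label changed."""
    changes = {}
    for variant_id in patient_variants:
        # Variants absent from a snapshot are treated as 'unknown'.
        old_label = old_kb.get(variant_id, "unknown")
        new_label = new_kb.get(variant_id, "unknown")
        if old_label != new_label:
            changes[variant_id] = (old_label, new_label)
    return changes


# Two hypothetical snapshots of the same knowledge base, taken at different times.
kb_v1 = {"var-001": "uncertain", "var-002": "benign"}
kb_v2 = {"var-001": "pathogenic", "var-002": "benign", "var-003": "benign"}

# Variants previously found in one patient's genome (placeholder identifiers).
patient = ["var-001", "var-002", "var-003", "var-004"]

updates = changed_annotations(patient, kb_v1, kb_v2)
print(updates)
```

Only the variants whose classification differs between snapshots are returned, so a lab could limit re-review to those cases rather than re-interpreting the whole genome each time the knowledge base changes.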

Lab end-users need streamlined research workflows. Like their research colleagues, lab professionals sometimes have to determine whether a never-before-seen variation has the potential to cause the patient harm. For this use case, a lab professional would like to have all the resources one commonly associates with research: genome visualization, links to protein function, and pathway knowledge. But while a researcher needs systems to experiment further, a lab professional needs concise reports to convey information to doctors. The clinical market will need standards and guidelines for classifying and reporting novel variants, for example as either pathogenic or benign.

Research bioinformatics tools will not find a big clinical market without some changes. However, there is no reason that the best research bioinformatics cannot be made compliant with standards and streamlined for a community of users who care little about how the software works, only that it works reproducibly.

The issues of standardization and methods are of great interest to a number of industry groups that are now forming. Look for the forthcoming white paper from the Banbury Center pathology summit meeting (October 2010) and the proceedings from the Clinical Bioinformatics Summit hosted by Harvard Medical School in December.


Ron Ranauro, former CEO of GenomeQuest, is the founder of Next-Gen Informatics. Email: ron@nextgeninformatics.com

This article also appeared in the January-February 2011 issue of Bio-IT World Magazine.