University of Chicago
University of Illinois
Last Reviewed: 05/22/2017
The Center for Computational Biotechnology and Genomic Medicine applies expertise in petascale computing, software algorithm optimization, and human genome deep sequencing technologies to transform healthcare with genomic medicine.
The CCBGM will use the power of computational predictive genomics to address pressing societal issues, such as enabling patient-specific cancer treatment, understanding and modifying microbiomes, and supporting humanity’s rapidly expanding need for food by improving the efficiency of plant and animal agriculture. The CCBGM brings together two university sites with unique resources in computational and biological sciences, namely the University of Illinois at Urbana-Champaign (UIUC) and Mayo Clinic; the University of Chicago, working as an affiliate institution, contributes expertise in high-performance computing (HPC; through Argonne National Laboratory) and cancer genomics.
The CCBGM offers:
The CCBGM’s mission is:
CCBGM Research Thrusts
The set of projects in the three thematic components leverages the multidisciplinary capabilities of the CCBGM team and focuses on clinical knowledge in human patients. However, the methods, tools, and algorithms developed as part of these efforts (e.g., microbiome, compression, imaging, and acceleration) apply in the broader context of analyzing the sequence data of crops, animals, and other organisms.
The computing and data management component will focus on innovations in storage and compression technologies for genomic data. Such methods are required to tackle large-scale bioinformatics problems, including the analysis of epistatic interactions in genome-wide association studies (GWAS). Traditionally, studies have focused on individual variants, their expression changes, and associated observable phenotypes. However, with the current availability of computing resources and knowledge extracted from genomics big data, we propose to study epistatic interactions, which are the combined effects of two or more variants on an observed phenotype. Further, the growth of genomics data poses storage and retrieval challenges that are still not well addressed. Scientists analyzing the data need to avoid losing resolution when compressed data are decompressed. To that end, we will develop compression algorithms and the supporting theory so that users of genomics data and their analyses are unaffected, as though the data had never been compressed.
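One way to read the requirement that analyses be unaffected by compression is as a lossless round trip: decompressing must return exactly the bytes that were stored. The sketch below illustrates only that property. It uses Python's general-purpose zlib module as a stand-in, not the compression algorithms the Center proposes to develop, and the FASTQ-style sample record and the function names are hypothetical.

```python
import zlib


def compress_record(record: bytes, level: int = 9) -> bytes:
    """Compress raw sequence data with zlib (a general-purpose stand-in)."""
    return zlib.compress(record, level)


def decompress_record(blob: bytes) -> bytes:
    """Recover the original bytes from a zlib-compressed blob."""
    return zlib.decompress(blob)


def is_lossless(record: bytes) -> bool:
    """Return True if a compress/decompress round trip is byte-identical."""
    return decompress_record(compress_record(record)) == record


# Hypothetical FASTQ-style record (read name, bases, separator, qualities),
# repeated so that there is redundancy for the compressor to exploit.
SAMPLE_RECORD = (
    b"@read_001\n"
    b"ACGTACGTACGTACGTACGTACGTACGTACGT\n"
    b"+\n"
    b"IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII\n"
) * 100


if __name__ == "__main__":
    blob = compress_record(SAMPLE_RECORD)
    print("original bytes:  ", len(SAMPLE_RECORD))
    print("compressed bytes:", len(blob))
    print("lossless:        ", is_lossless(SAMPLE_RECORD))
```

A downstream analysis that reads the decompressed bytes sees the same bases and quality scores as one that reads the original file, which is the sense in which the data appear never to have been compressed.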
The actionable intelligence component will look at the translation of big data to clinical knowledge. The overarching goal is to enhance patient-specific understanding of disease and to tailor diagnosis and treatment to the individual. Projects in this thematic component will develop technologies to identify and classify genomic variants, genes, and drivers for human disease. Specifically, we will develop algorithms to help merge heterogeneous datasets (e.g., multi-omics, clinical, and microbiome) and identify statistically significant mutations, genes, metabolites, pathways, and networks that are associated with clinical or functional outcomes. With those patient-specific findings, we can potentially identify drugs that are designed to affect those genes, pathways, or metabolites, thereby increasing the chances of successful treatment and recovery compared with one-size-fits-all drugs that might not work for specific patients. For example, if a metabolite predictive of depressive disorder (for which there is no known drug) is found, pharmaceutical researchers can investigate related regulatory pathways and potentially develop a new drug.
The systems innovation component will address the design and implementation of specialized computer systems to efficiently and accurately execute the algorithms for mining actionable intelligence from genomic data. Application-specific computing systems must be able to (1) efficiently handle storage and retrieval of large quantities of data produced in sequencing experiments as well as a corpus of medical information that maintains known correlations between genomic variants, genes, pathways, and human diseases; and (2) efficiently compute complex statistical analyses and machine-learning algorithms on parallel-processing platforms such as GPUs and FPGAs, as well as scale out to utilize large warehouse-scale computers (clouds, supercomputers). The projects will explore the design of a common schema for information exchange by closely studying the shared semantics of several annotation datasets (including dbSNP, 1000 Genomes, and the Human Gene Mutation Database) and their relationships to a variant based on location or ID cross-reference. Our design will also address continuous evaluation, monitoring, and quality control of algorithms, workflows, and systems, which will provide the flexibility to incorporate new data, statistical models, and algorithms as they become available.
The CCBGM will utilize the advanced research infrastructure existing at the partners: the University of Illinois at Urbana-Champaign and the Mayo Clinic. Facilities, laboratories, and resources of the partners are listed below.
University of Illinois at Urbana-Champaign
Beckman Institute for Advanced Science and Technology
Institute for Genomic Biology
Coordinated Science Laboratory
Sequencing Unit of the Carver Biotechnology Center
Merged computer infrastructure of HPCBio, Institute for Genomic Biology, and Carver Biotechnology Center
Organization of Mayo Medical Center
Center for Individualized Medicine
Next Generation DNA sequencing
Bioinformatics Program and Service Lines
Information Technology Program
Collectively, these facilities and laboratories provide research opportunities in the multidisciplinary areas of genomics and computing on our three research university campuses. For example, the CCBGM offers a structure and forum for multidisciplinary work involving genomic biologists, computer scientists, and engineers from the Center for Individualized Medicine at Mayo and the Institute for Genomic Biology at Illinois. The strength of our facilities and labs allows the Center to assemble a diverse and complementary set of research partners from the universities who are exceptionally qualified to address big challenges in biology, bioinformatics, and computing as they apply to agriculture, health care, energy, and other critical human issues.