Ontologies and Data Integration in Biomedicine: Success Stories and Challenging Issues
The promise of translational medicine hinges upon bridging basic research and clinical practice. One key element in integrating the research and clinical communities is the integration of the information sources and data used in these communities. In practice, bridges need to be created both across domains (e.g., between genotypic and phenotypic information sources) and across knowledge bases within a domain (e.g., between genomic and pathway resources). Biomedical ontologies play an important role in data integration. They support it in two different ways, corresponding to two different approaches to data integration: warehousing and mediation. On the one hand, by providing a controlled vocabulary in a given domain, ontologies support the standardization required by warehousing approaches, in which the sources to be integrated are transformed into a common format and converted to a common vocabulary. On the other hand, mediation-based approaches use ontologies to define a global schema (in reference to which queries are made) and the mappings between the global schema and the local schemas (the schemas of the sources to be integrated). We review examples in which ontologies have been used successfully for integrating biomedical data, including the integration of genomic data based on Gene Ontology annotations, the cancer Biomedical Informatics Grid (caBIG) project, and semantic mashups created by the Semantic Web for Health Care and Life Sciences community. Barriers to integration are discussed next.
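The contrast between warehousing and mediation can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the term identifiers (`TERM:0001`, `TERM:0002`), the two toy sources, their field names, and the helper functions are assumptions, not part of any real ontology or of the systems reviewed here. In the warehousing function, records are copied into one store in a common format and vocabulary; in the mediation function, the sources stay in place and a query against the global schema is translated into each source's local terms at query time.

```python
# Hypothetical controlled vocabulary: local labels -> shared term ID (IDs are invented).
VOCAB = {
    "apoptosis": "TERM:0001",
    "programmed cell death": "TERM:0001",
    "dna repair": "TERM:0002",
}

# Two toy sources with different local schemas and different local labels.
SOURCES = {
    "A": [{"gene": "G1", "process": "Apoptosis"}],
    "B": [
        {"symbol": "G2", "function": "programmed cell death"},
        {"symbol": "G3", "function": "DNA repair"},
    ],
}

# Mappings from the global schema fields (gene, term) to each local schema.
MAPPINGS = {
    "A": {"gene": "gene", "term": "process"},
    "B": {"gene": "symbol", "term": "function"},
}


def normalize(label):
    """Convert a local label to the shared term ID."""
    return VOCAB[label.strip().lower()]


def build_warehouse():
    """Warehousing: transform every source into a common format and vocabulary."""
    warehouse = []
    for name, rows in SOURCES.items():
        fields = MAPPINGS[name]
        for row in rows:
            warehouse.append(
                {"gene": row[fields["gene"]], "term": normalize(row[fields["term"]])}
            )
    return warehouse


def mediated_query(term_id):
    """Mediation: rewrite a global query into local labels and fields per source."""
    local_labels = {label for label, tid in VOCAB.items() if tid == term_id}
    hits = []
    for name, rows in SOURCES.items():
        fields = MAPPINGS[name]
        for row in rows:  # each source is queried in place, in its own schema
            if row[fields["term"]].strip().lower() in local_labels:
                hits.append(row[fields["gene"]])
    return sorted(hits)


if __name__ == "__main__":
    print(build_warehouse())
    print(mediated_query("TERM:0001"))
```

Both routes return the same genes for a given term. The difference lies in where the translation happens: once, at load time, for the warehouse, and on every query for the mediator.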