Gene Ontology-Driven Similarity for Gene Expression Correlation Analysis
The Gene Ontology (GO) and its related annotation databases provide information for measuring similarity between gene products. Information-theoretic approaches to measuring functional similarity among gene products exploit both the topological features of GO terms (i.e., their inter-relationships in the ontology) and the statistical features of the terms in annotation databases (i.e., their frequency). Previous research has shown that the GO-driven functional similarity of pairs of genes correlates with sequence similarity. This study aims to support the integration of GO-driven similarity into functional prediction problems. It focuses on the quantitative assessment of relationships between GO-driven similarity and expression correlation. It also offers insights into the consistency of the functional information represented in the GO and resulting databases. The GO and annotations derived from the Saccharomyces Genome Database (SGD) were analyzed to calculate the functional similarity of gene products. Three methods for measuring similarity were implemented: Resnik's, Lin's and Jiang's metrics. Using a known gene expression dataset in yeast, several million pairs of gene products were compared on the basis of their expression correlation and their GO-driven similarity. This analysis was performed separately on the three hierarchies of the GO. It confirms that highly correlated genes exhibit strong similarity based on information originating from the GO hierarchies. Such similarity is significantly stronger than that observed between weakly correlated genes. This observation holds for the three GO hierarchies and for the three metrics under investigation.
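The three metrics named above can be illustrated with a minimal sketch at the level of individual terms. It is not the study's implementation: the toy ontology `PARENTS`, the annotation probabilities `P`, and all function names are hypothetical, chosen only to show how the standard definitions combine ontology topology (shared ancestors) with annotation frequency (information content). Jiang's metric is shown in its usual distance form, where smaller values mean more similar terms.

```python
import math

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
}

# Hypothetical probabilities: fraction of annotated gene products
# assigned to a term or to any of its descendants.
P = {"root": 1.0, "A": 0.5, "B": 0.5, "A1": 0.25, "A2": 0.25}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def ic(term):
    """Information content: rarer terms are more informative."""
    return math.log(1.0 / P[term])


def resnik(t1, t2):
    """Resnik: IC of the most informative common ancestor."""
    return max(ic(t) for t in ancestors(t1) & ancestors(t2))


def lin(t1, t2):
    """Lin: shared IC normalised by the IC of both terms."""
    total = ic(t1) + ic(t2)
    # Convention chosen for this sketch: two zero-IC terms score 0.0.
    return 2.0 * resnik(t1, t2) / total if total > 0 else 0.0


def jiang_distance(t1, t2):
    """Jiang: distance form, 0 for identical terms."""
    return ic(t1) + ic(t2) - 2.0 * resnik(t1, t2)
```

In this toy setting, `A1` and `A2` share the ancestor `A`, so their Resnik similarity is the information content of `A`, while `A1` and `B` share only the root and score zero. Comparing gene products rather than single terms additionally requires aggregating over all pairs of terms annotated to the two products (for example, by averaging), a step this sketch leaves out.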