Abstract
The complexity of biological systems gives rise to a great diversity of relationships between genes. Current analysis of whole-genome expression data focuses on relationships based on global correlation over a whole time course, identifying clusters of genes whose expression levels rise and fall simultaneously. Other potential relationships between genes are missed by this global clustering. These include activation, where one expects a time delay between related expression profiles, and inhibition, where one expects an inverted relationship. Here we propose a new method, which we call local clustering, for identifying these time-delayed and inverted relationships. It is related to conventional gene expression clustering in a way formally similar to how local sequence alignment (the Smith-Waterman algorithm) is derived from global alignment (Needleman-Wunsch). We applied our method to the yeast cell-cycle expression dataset and detected a considerable number of additional biological relationships between genes, beyond those found by conventional correlation. We compared these new relationships with the genes' similarity in function (determined from the MIPS scheme) and with known protein-protein interactions (determined from the large-scale two-hybrid experiment), finding that genes strongly related by local clustering were considerably more likely than random pairs to have a known interaction or a similar cellular role. This suggests that local clustering may be useful in the functional annotation of uncharacterized genes. We examined many of the new relationships in detail. Some of them were already well-documented examples of inhibition or activation, which corroborates our results. For instance, we found an inverted expression profile relationship between genes YME1 and YNT20, where the latter has been experimentally documented as a bypass suppressor of the former.
Other relationships were new, often involving uncharacterized yeast genes and thus suggesting functions for many of them. In particular, we found a time-delayed expression relationship between J0544 (which has not yet been functionally characterized) and four genes associated with the mitochondria. This suggests that J0544 may be involved in the control or activation of mitochondrial genes.
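The analogy to local sequence alignment can be illustrated with a minimal sketch. The abstract does not specify the scoring scheme, so everything below is an assumption modelled on the Smith-Waterman recurrence: profiles are standardized, the product of two standardized values serves as the match score, and running sums along each diagonal are reset to zero whenever they turn negative. One matrix accumulates positive products (simultaneous or time-delayed matches) and a second accumulates negated products (inverted matches). The function names `normalize` and `local_cluster_score` are hypothetical.

```python
from math import sqrt


def normalize(profile):
    """Standardize a profile to zero mean and unit (population) variance."""
    n = len(profile)
    mean = sum(profile) / n
    sd = sqrt(sum((v - mean) ** 2 for v in profile) / n)
    if sd == 0:
        raise ValueError("profile has no variation")
    return [(v - mean) / sd for v in profile]


def local_cluster_score(x, y):
    """Return (score, kind, lag) for the best local match between x and y.

    Assumed recurrence (not taken from the abstract):
        pos[i][j] = max(pos[i-1][j-1] + xs[i]*ys[j], 0)
        neg[i][j] = max(neg[i-1][j-1] - xs[i]*ys[j], 0)
    lag = j - i, so a positive lag means the matching segment of y
    occurs later in the time course than the matching segment of x.
    """
    xs, ys = normalize(x), normalize(y)
    n, m = len(xs), len(ys)
    best = (0.0, "none", 0)
    pos_prev = [0.0] * (m + 1)
    neg_prev = [0.0] * (m + 1)
    for i in range(1, n + 1):
        pos_cur = [0.0] * (m + 1)
        neg_cur = [0.0] * (m + 1)
        for j in range(1, m + 1):
            s = xs[i - 1] * ys[j - 1]
            # Extend the diagonal run, or restart from zero if it went negative.
            pos_cur[j] = max(pos_prev[j - 1] + s, 0.0)
            neg_cur[j] = max(neg_prev[j - 1] - s, 0.0)
            lag = j - i
            if pos_cur[j] > best[0]:
                kind = "simultaneous" if lag == 0 else "time-delayed"
                best = (pos_cur[j], kind, lag)
            if neg_cur[j] > best[0]:
                best = (neg_cur[j], "inverted", lag)
        pos_prev, neg_prev = pos_cur, neg_cur
    return best


if __name__ == "__main__":
    pulse = [0, 0, 1, 3, 1, 0, 0, 0]
    delayed = [0, 0, 0, 0, 1, 3, 1, 0]   # same pulse, two time points later
    flipped = [-v for v in pulse]        # same pulse, inverted
    print(local_cluster_score(pulse, delayed))
    print(local_cluster_score(pulse, flipped))
```

In this toy example the shifted pulse is reported as a time-delayed match and the negated pulse as an inverted match. Note that for strongly periodic profiles a shift and an inversion can be indistinguishable, so the label attached to a high score should be read with care.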