Uncovering the structure of heterogenous biological data: fuzzy graph partitioning in the k-partite setting
With the increasing availability of large-scale interaction networks derived either from experimental data or from text mining, we face the challenge of interpreting and analyzing these data sets in a comprehensive fashion. A particularity of these networks, which sets it apart from other examples in various scientific fields lies in their k-partiteness. Whereas graph partitioning has received considerable attention, only few researchers have focused on this generalized situation. Recently, Long et al. have proposed a method for jointly clustering such a network and at the same time estimating a weighted graph connecting the clusters thereby allowing simple interpretation of the resulting clustering structure. In this contribution, we extend this work by allowing fuzzy clusters for each node type. We propose an extended cost function for partitioning that allows for overlapping clusters. Our main contribution lies in the novel efficient minimization procedure, mimicking the multiplicative update rules employed in algorithms for non-negative matrix factorization. Results on clustering a manually annotated bipartite gene-complex graph show significantly higher homogeneity between gene and corresponding complex clusters than expected by chance. The algorithm is freely available at http://cmb.helmholtz-muenchen.de/ fuzzyclustering.
Full Text: PDF