Integrated Simultaneous Analysis of Different Biomedical Data Types with Exact Weighted Bi-cluster Editing
The explosion of biological data has strongly influenced the focus of today's biology research. Integrating and analysing large quantities of data to provide meaningful insights has become the main challenge for biologists and bioinformaticians. One major problem is the combined analysis of data of different types, such as phenotypes and genotypes. Such data can be modelled as bi-partite graphs in which nodes correspond to the different data points (mutations and diseases, for instance) and weighted edges correspond to associations between them. Bi-clustering is a special case of clustering designed for partitioning two different types of data simultaneously. We present a bi-clustering approach that solves the NP-hard weighted bi-cluster editing problem by transforming a given bi-partite graph into a disjoint union of bi-cliques. Our contribution is an exact algorithm based on fixed-parameter tractability. We first evaluated its performance on artificial graphs. Afterwards, as an example, we applied our Java implementation to data from genome-wide association studies (GWAS), aiming to discover new, previously unobserved geno-to-pheno associations. We believe that our results will serve as guidelines for further wet lab investigations. In general, our software can be applied to any kind of data that can be modelled as bi-partite graphs. To our knowledge, it is the fastest exact method for the weighted bi-cluster editing problem.
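To make the objective in the abstract concrete, the following is a minimal Java sketch of how the cost of one candidate bi-clustering could be scored. It is not the authors' exact fixed-parameter algorithm; it only evaluates a given solution. The class name, method name, and the sign convention (a positive weight means the edge is present, a non-positive weight means it is absent, and the absolute value is the price of changing that state) are assumptions made for illustration and are not stated in the abstract.

```java
// Illustrative sketch only: scores a candidate solution of weighted
// bi-cluster editing. All names and the sign convention are assumptions.
final class BiClusterEditingSketch {

    /**
     * Returns the total cost of editing a weighted bi-partite graph into the
     * disjoint bi-cliques described by the two label arrays.
     *
     * weight[i][j] > 0  : an edge between left node i and right node j exists
     * weight[i][j] <= 0 : no edge exists
     * |weight[i][j]|    : cost of deleting (or inserting) that edge
     *
     * leftCluster[i] and rightCluster[j] hold the bi-cluster label of each node.
     */
    public static double editingCost(double[][] weight,
                                     int[] leftCluster,
                                     int[] rightCluster) {
        double cost = 0.0;
        for (int i = 0; i < weight.length; i++) {
            for (int j = 0; j < weight[i].length; j++) {
                boolean sameCluster = leftCluster[i] == rightCluster[j];
                boolean hasEdge = weight[i][j] > 0;
                // Inside a bi-cluster every pair must be connected; across
                // bi-clusters no pair may be connected. Any mismatch is an edit.
                if (hasEdge != sameCluster) {
                    cost += Math.abs(weight[i][j]);
                }
            }
        }
        return cost;
    }

    public static void main(String[] args) {
        // Toy instance: rows could be mutations, columns diseases.
        double[][] weight = {
            { 2.0,  1.0, -1.0},
            { 3.0, -0.5, -2.0},
            {-1.0, -1.0,  4.0}
        };
        int[] mutationCluster = {0, 0, 1};
        int[] diseaseCluster  = {0, 0, 1};
        // Only the missing edge (1,1) must be inserted, at cost 0.5.
        System.out.println(editingCost(weight, mutationCluster, diseaseCluster));
    }
}
```

An exact solver would search over all such labelings for the one with minimum cost; the sketch only shows what is being minimised.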
Peng Sun, Jiong Guo, Jan Baumbach
Computational Systems Biology group, Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbr; Cluster of Excellence for Multimodal Computing and Interaction, Saarland University, Campus E1.7, 661; Computational Systems Biology group, Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbr
International conference
Hangzhou
English
28-39
2012-04-02 (date first made available online on the Wanfang platform; not the paper's publication date)