Signed Network Propagation for Detecting Differential Gene Expressions and DNA Copy Number Variations

Wei Zhang1, Nicholas Johnson1, Baolin Wu2 and Rui Kuang1

1. Department of Computer Science and Engineering, University of Minnesota Twin Cities
2. Division of Biostatistics, School of Public Health, University of Minnesota Twin Cities


Network propagation algorithms have proved useful for the analysis of high-dimensional genomic data. A limitation of the current formulation is that it only supports propagation on graphs with positive edge weights. In this paper, we explore two signed network propagation algorithms and general optimization frameworks for detecting differential gene expressions and DNA copy number variations (CNV). The proposed algorithms consider both positive and negative relations in graphs to model gene up/down-regulation or amplification/deletion CNV events. The first algorithm (Signed-NP) integrates gene co-expressions and differential expressions for consistent and robust gene selection from microarray datasets by propagation on gene correlation graphs. The second algorithm (Signed-NPBi) identifies gene or CNV markers by propagation on sample-feature bipartite graphs to capture bi-clusters between samples and genomic features. Large-scale experiments on several microarray gene expression datasets and CNV datasets show that Signed-NP and Signed-NPBi classify gene expression and CNV data more accurately than standard network propagation. The experiments also demonstrate that Signed-NP selects genes that are more biologically interpretable and more consistent across multiple datasets, and that Signed-NPBi can detect hidden CNV patterns in bi-clusters by smoothing on correlations between adjacent probes.
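To illustrate the general idea of propagating scores over a graph with both positive and negative edges, here is a minimal sketch. It is not the released Matlab implementation: the function name `signed_propagate`, the parameter `alpha`, the update rule f <- alpha * S f + (1 - alpha) * y, and the normalization by absolute-weight degrees are all assumptions chosen for illustration, and the toy graph is made up.

```python
import math

def signed_propagate(W, y, alpha=0.5, iters=200):
    """Iterate f <- alpha * S f + (1 - alpha) * y on a signed graph (illustrative sketch)."""
    n = len(W)
    # Assumption: degrees use absolute weights, so negative edges
    # still contribute to the normalization.
    deg = [sum(abs(w) for w in row) for row in W]
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}; edge signs are preserved.
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
          for j in range(n)] for i in range(n)]
    f = list(y)
    for _ in range(iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f

# Hypothetical toy graph: genes 0 and 1 are positively correlated;
# gene 2 is negatively correlated with both.
W = [[0.0, 1.0, -1.0],
     [1.0, 0.0, -1.0],
     [-1.0, -1.0, 0.0]]
y = [1.0, 0.0, 0.0]  # only gene 0 starts with a differential-expression score
scores = signed_propagate(W, y)
```

In this toy case gene 1 receives a positive score through its positive edge to gene 0, while gene 2 receives a negative score through its negative edges, which is the behavior a positively weighted graph cannot express.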



Funding: NSF-III1117153: Small: Network Learning for Integrative Cancer Genomics; NIH grants GM083345, CA134848.

Citation: Wei Zhang, Nicholas Johnson, Baolin Wu, Rui Kuang. Signed Network Propagation for Detecting Differential Gene Expressions and DNA Copy Number Variations. Proc. of ACM Conference on Bioinformatics, Computational Biology and Biomedicine (ACM BCB), Oct 2012.

Source Code [Matlab code]

Datasets [5 breast cancer datasets]