

Kathryn Dempsey, Vladimir Ufimtsev, Sanjukta Bhowmick, Hesham Ali


High-throughput biological experiments play a critical role in systems biology: the ability to survey the state of cellular mechanisms on a broad scale lets researchers understand how multiple components come together, and what goes wrong in disease states. However, the data returned from these experiments are massive and heterogeneous, and analyzing them requires intuitive and clever computational algorithms. The correlation network model has been proposed as a tool for modeling and analyzing this high-throughput data; structures within the model identified by graph theory have been found to represent key players in major cellular pathways. Previous work has found that network filtering based on graph-theoretic structural concepts can reduce noise and strengthen biological signals in these networks. However, filtering biological networks with such filters is computationally intensive, and the filtered networks remain large. In this research, we develop a parallel template for these network filters to improve runtime, and use this high-performance environment to show that parallelization does not affect network structure or the biological function of that structure.
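To make the correlation network model mentioned above concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that genes are nodes, that an edge joins two genes whose expression profiles have an absolute Pearson correlation at or above a cutoff, and that the cutoff is 0.9. The function names, the cutoff, and the toy expression values are all hypothetical and chosen only for illustration; the graph-theoretic filters and the parallel template described in the abstract are not shown.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treat as uncorrelated (assumption)
    return cov / sqrt(var_x * var_y)


def correlation_network(expression, threshold=0.9):
    """Return edges {(gene_a, gene_b): r} for pairs with |r| >= threshold.

    `expression` maps a gene name to its list of expression values.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    edges = {}
    for a, b in combinations(sorted(expression), 2):
        r = pearson(expression[a], expression[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


# Toy expression profiles (hypothetical values, one list per gene).
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],  # rises with g1
    "g3": [4.0, 3.0, 2.0, 1.0],  # falls as g1 rises
    "g4": [1.0, 3.0, 2.0, 2.0],  # only weakly related to the others
}
network = correlation_network(expression, threshold=0.9)
```

In this toy input, g1, g2, and g3 end up pairwise connected while g4 stays isolated, because its correlation with every other gene falls below the cutoff. Every gene pair is scored independently of the others, which is one reason correlation networks of this kind lend themselves to parallel processing.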


High performance computing; correlation networks; parallel computing; network filters; graph algorithms; noise; biological signal.





