CoGA: An R Package to Identify Differentially Co-Expressed Gene Sets by Analyzing the Graph Spectra

Gene set analysis aims to identify predefined sets of functionally related genes that are differentially expressed between two conditions. Although gene set analysis has been very successful, because it incorporates biological knowledge about the gene sets and has greater statistical power than gene-by-gene analyses, it does not take into account the correlation (association) structure among the genes. In this work, we present CoGA (Co-expression Graph Analyzer), an R package for identifying groups of genes that are differentially associated between two phenotypes. The analysis is based on concepts from information theory applied to the spectral distributions of the gene co-expression graphs: the spectral entropy measures the randomness of a graph structure, and the Jensen-Shannon divergence discriminates between classes of graphs. The package also includes common measures for comparing gene co-expression networks in terms of their structural properties, such as centrality, degree distribution, shortest path length, and clustering coefficient. Besides the structural analyses, CoGA includes graphical interfaces for visual inspection of the networks, ranking of genes according to their "importance" in the network, and standard differential expression analysis. We show, through both simulation experiments and analyses of real data, that the statistical tests performed by CoGA control the rate of false positives and are able to identify differentially co-expressed genes that other methods failed to detect.
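The two spectral quantities named in the abstract can be illustrated with a minimal sketch. It is written in Python rather than R and does not use or reproduce CoGA's API. All function names, the toy matrices, and the histogram-based density estimate are assumptions made for illustration; the package's own density estimator and test procedure may differ.

```python
import math

def jacobi_eigenvalues(matrix, tol=1e-18, max_sweeps=50):
    """Eigenvalues of a small symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        # Stop once the off-diagonal mass is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def spectral_density(eigenvalues, lo, hi, bins):
    """Histogram estimate of the spectral distribution on a shared grid (assumption)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for e in eigenvalues:
        idx = max(0, min(int((e - lo) / width), bins - 1))
        counts[idx] += 1
    return [c / len(eigenvalues) for c in counts]

def spectral_entropy(density):
    """Shannon entropy of the discretized spectral distribution."""
    return -sum(p * math.log(p) for p in density if p > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discretized distributions."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    def kl(a, b):
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def complete_graph(n, weight):
    """Toy co-expression graph: every gene pair has the same |correlation|."""
    return [[0.0 if i == j else weight for j in range(n)] for i in range(n)]

# Hypothetical example: the same 4-gene set, strongly vs. weakly co-expressed.
graph_a = complete_graph(4, 0.9)
graph_b = complete_graph(4, 0.1)
density_a = spectral_density(jacobi_eigenvalues(graph_a), -4.0, 4.0, 16)
density_b = spectral_density(jacobi_eigenvalues(graph_b), -4.0, 4.0, 16)

print("entropy A:", spectral_entropy(density_a))
print("entropy B:", spectral_entropy(density_b))
print("JS divergence:", js_divergence(density_a, density_b))
```

In this sketch a large divergence only indicates that the two spectra differ. Deciding whether the difference is statistically significant would need a test on top of it, which is left out here.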

URI

https://observatorio.fm.usp.br/handle/OPI/11705

Collections

Articles and Materials from Scientific Journals - FM/MNE
Articles and Materials from Scientific Journals - LIM/15