In [1]:
#libraries
library(GO.db)
library(topGO)
library(org.Sc.sgd.db)
library(GOSemSim)
library(gridExtra)


Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, cbind, colnames, do.call,
    duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect,
    is.unsorted, lapply, lengths, Map, mapply, match, mget, order,
    paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind,
    Reduce, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit, which, which.max, which.min

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:base’:

    colMeans, colSums, expand.grid, rowMeans, rowSums


Loading required package: graph
Loading required package: SparseM

Attaching package: ‘SparseM’

The following object is masked from ‘package:base’:

    backsolve


groupGOTerms: 	GOBPTerm, GOMFTerm, GOCCTerm environments built.

Attaching package: ‘topGO’

The following object is masked from ‘package:IRanges’:

    members


GOSemSim v2.0.4  For help: https://guangchuangyu.github.io/GOSemSim

If you use GOSemSim in published research, please cite:
Guangchuang Yu, Fei Li, Yide Qin, Xiaochen Bo, Yibo Wu, Shengqi Wang. GOSemSim: an R package for measuring semantic similarity among GO terms and gene products Bioinformatics 2010, 26(7):976-978

Attaching package: ‘gridExtra’

The following object is masked from ‘package:Biobase’:

    combine

The following object is masked from ‘package:BiocGenerics’:

    combine


In [7]:
file <- "yeast_uetz"

ont <- "BP"
init <- 1

db <- org.Sc.sgd.db
mapping <- "org.Sc.sgd.db"
ID <- "ENSEMBL"

scGO <- godata(mapping, ont=ont, keytype=ID)


[1] "preparing gene to GO mapping data..."
[1] "preparing IC data..."

In [12]:
ps <- seq(from = 0.1, to = 0.9, by = 0.1)

In [13]:
ps


  1. 0.1
  2. 0.2
  3. 0.3
  4. 0.4
  5. 0.5
  6. 0.6
  7. 0.7
  8. 0.8
  9. 0.9

In [14]:
getMeanSimilarity <- function(p) {
    ##load all community gene lists
    setwd(sprintf("/home/david/Documents/ghsom/%s_communities_%s_%s", file, p, init))

    #background gene list
    backgroundFilename <- "all_genes.txt"
    allGenes <- scan(backgroundFilename, character())

    #load communities from file
    g <- list()
    numCom <- 0
    filename <- sprintf("community_%s.txt", numCom)
    while (file.exists(filename)) {
        numCom <- numCom + 1
        g[[numCom]] <- scan(filename, character())
        filename <- sprintf("community_%s.txt", numCom)
    }
    
    clusterSims <- sapply(g, function(i) 
        mean(mgeneSim(genes = i, semData = scGO, measure = "Wang", combine = "BMA", verbose = FALSE)))
    
    return(list(numCom, mean(clusterSims)))
}

In [15]:
results <- sapply(ps, getMeanSimilarity)

In [16]:
results


5121138 6 6 6 6 6
0.57426470.38043920.31421260.28385310.26884750.26884750.26884750.26884750.2688475

In [18]:
results[1,]


  1. 51
  2. 21
  3. 13
  4. 8
  5. 6
  6. 6
  7. 6
  8. 6
  9. 6

In [22]:
results[2,]


  1. 0.574264669390298
  2. 0.380439191058087
  3. 0.314212595394822
  4. 0.283853057080131
  5. 0.268847496391923
  6. 0.268847496391923
  7. 0.268847496391923
  8. 0.268847496391923
  9. 0.268847496391923

In [23]:
plot(ps,results[1,],type="l",col="red")
# par(new=TRUE)
plot(ps,results[2,],col="green", add=TRUE)


Warning message in plot.window(...):
“"add" is not a graphical parameter”Warning message in plot.xy(xy, type, ...):
“"add" is not a graphical parameter”Warning message in axis(side = side, at = at, labels = labels, ...):
“"add" is not a graphical parameter”Warning message in axis(side = side, at = at, labels = labels, ...):
“"add" is not a graphical parameter”Warning message in box(...):
“"add" is not a graphical parameter”Warning message in title(...):
“"add" is not a graphical parameter”

In [27]:
par(mar = c(5,5,2,5))
plot(ps, results[1,], type="l", col="red3", xlab=expression(e[sg] setting), ylab = "Number of Communities Detected")

par(new = T)
plot(ps, results[2,], pch=16, axes=F, xlab=NA, ylab=NA, cex=1.2)
axis(side = 4)
mtext(side = 4, line = 3, 'Number genes selected')
legend("topright",
       legend=c("Num Communities", "Semantic Similarity"),
       lty=c(1,0), pch=c(NA, 16), col=c("red3", "black"))



In [ ]: