Class 15 - MCODE


In [ ]:
suppressPackageStartupMessages(library(sna))
library(readr)
suppressPackageStartupMessages(library(igraph))
suppressPackageStartupMessages(library(expm))
suppressPackageStartupMessages(library(ProNet))

In [104]:
test_adjmat <- read.dot("testr.dot")
test_adjmat <- test_adjmat + t(test_adjmat)
test_igraph <- igraph::graph_from_adjacency_matrix(test_adjmat, mode="undirected")
summary(test_igraph)


IGRAPH 59155dd UN-- 13 26 -- 
+ attr: name (v/c)

In [105]:
plot(test_igraph)


Run ProNet::mcode on your test graph with the default parameters; what cluster sizes do you get back?


In [106]:
ProNet::mcode(test_igraph)


$COMPLEX
    1. 1
    2. 2
    3. 3
    4. 4
    5. 5
    6. 6
    7. 7
    1. 9
    2. 10
    3. 11
    4. 12
    5. 13
$score
  1. 5
  2. 4.5

Load the Krogan et al. network edge-list as a dataframe


In [129]:
edge_list_df <- unique(readr::read_tsv("krogan.sif", 
                              col_names=c("protein1","protein2"),
                              col_types=cols()))

head(edge_list_df)


protein1protein2
AAC3 PMR1
AAP1 GUD1
AAP1 TRS31
AAP1 GCN3
AAP1 MDY2
AAR2 SMD1

Make an igraph graph out of your edge-list data frame (undirected); print a summary


In [130]:
big_graph <- igraph::graph_from_data_frame(edge_list_df, directed=FALSE)
summary(big_graph)


IGRAPH 2294070 UN-- 2674 7079 -- 
+ attr: name (v/c)

Plot the degree distribution of your graph, on log-log scale


In [131]:
suppressWarnings(plot(degree_distribution(big_graph), log="xy", xlab="k", ylab="Pk"))


Run ProNet::mcode on your graph, with vwp=0.2 and fdt=0.1; print out the cluster sizes that you get bakc


In [136]:
res <- ProNet::mcode(big_graph, haircut=TRUE, fluff=TRUE, vwp=0.2, fdt=0.1)
s_vals <- sapply(res$COMPLEX, length)
sort(s_vals, decreasing=TRUE)


  1. 1308
  2. 80
  3. 70
  4. 61
  5. 43
  6. 41
  7. 40
  8. 33
  9. 28
  10. 23
  11. 22
  12. 22
  13. 19
  14. 17
  15. 17
  16. 16
  17. 16
  18. 14
  19. 13
  20. 13
  21. 13
  22. 11
  23. 10
  24. 10
  25. 10
  26. 10
  27. 10
  28. 9
  29. 9
  30. 9
  31. 9
  32. 9
  33. 8
  34. 8
  35. 8
  36. 7
  37. 7
  38. 7
  39. 7
  40. 7
  41. 7
  42. 6
  43. 6
  44. 6
  45. 6
  46. 6
  47. 6
  48. 6
  49. 6
  50. 5
  51. 5
  52. 5
  53. 5
  54. 5
  55. 5
  56. 5
  57. 4
  58. 4
  59. 4
  60. 4
  61. 4
  62. 4
  63. 4
  64. 4
  65. 4
  66. 4
  67. 4
  68. 4
  69. 4
  70. 4
  71. 4
  72. 4
  73. 4
  74. 3
  75. 3
  76. 3
  77. 3
  78. 3
  79. 3
  80. 3
  81. 3
  82. 3
  83. 3
  84. 3
  85. 3
  86. 3
  87. 3
  88. 3
  89. 3

Print a histogram of log N(S) vs. log cluster size S.


In [161]:
s_bins <- (0:15)*80/15
hist_res <- hist(s_vals[-1],breaks=s_bins, plot=FALSE)
plot(hist_res$mids, hist_res$counts, log="y", xlab="numprot", ylab="Nclust(numprot)")


Warning message in xy.coords(x, y, xlabel, ylabel, log):
"3 y values <= 0 omitted from logarithmic plot"