In [1]:
suppressPackageStartupMessages(library(igraph))
Step 1: load the SIF file into a data frame named sif_data, using the read.table function.
In [2]:
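## the SIF file is tab-separated with no header row: source, interaction type, target
sif_data <- read.table("shared/pathway_commons.sif",
                       sep="\t",
                       header=FALSE,
                       stringsAsFactors=FALSE,
                       col.names=c("species1",
                                   "interaction_type",
                                   "species2"),
                       ## treat quote and comment characters as ordinary text
                       quote="",
                       comment.char="")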
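As a quick sanity check of the arguments above, here is a minimal sketch that parses a tiny SIF snippet supplied inline through the text argument of read.table. The three rows and the names toy_sif_text and toy_sif are made up for illustration and are not taken from the Pathway Commons file.

```r
## hypothetical three-line, tab-separated SIF snippet (illustration only)
toy_sif_text <- "A\tinteracts-with\tB\nB\tcontrols-expression-of\tC\nC\tin-complex-with\tD"

## same arguments as for the real file, but reading from a string
toy_sif <- read.table(text=toy_sif_text,
                      sep="\t",
                      header=FALSE,
                      stringsAsFactors=FALSE,
                      col.names=c("species1",
                                  "interaction_type",
                                  "species2"),
                      quote="",
                      comment.char="")

## inspect the result: one row per interaction, three character columns
str(toy_sif)
```

Running str (or head) on sif_data in the same way is a cheap way to confirm that the real file was split into three columns as intended.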
Step 2: restrict the interactions to the undirected protein-protein types ("in-complex-with", "interacts-with"), using the %in% operator for the row selection and bracket indexing ([ ]) on the data frame, and keep only the two species columns. Name the restricted data frame interac_ppi.
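The filter described in this step can be tried on a toy data frame first. This is a minimal sketch; the rows and the names toy_sif, keep_rows and toy_ppi are hypothetical. It selects the columns by name rather than by position, which is equivalent here and a little easier to read.

```r
## hypothetical toy interactions (illustration only)
toy_sif <- data.frame(species1=c("A", "B", "C", "A"),
                      interaction_type=c("interacts-with",
                                         "controls-expression-of",
                                         "in-complex-with",
                                         "in-complex-with"),
                      species2=c("B", "C", "D", "B"),
                      stringsAsFactors=FALSE)

## %in% returns one TRUE/FALSE per row: is this row's type in the allowed set?
keep_rows <- toy_sif$interaction_type %in% c("in-complex-with",
                                             "interacts-with")

## rows go before the comma, columns after it
toy_ppi <- toy_sif[keep_rows, c("species1", "species2")]
toy_ppi
```

Note that the pair A, B survives twice (once per interaction type), which is why a later deduplication step is needed.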
In [3]:
interac_ppi <- sif_data[sif_data$interaction_type %in% c("in-complex-with",
                                                         "interacts-with"),
                        c(1,3)]
Step 3: reduce the data frame to the unique pairs of interacting proteins (ignoring the interaction type), using the unique function. Then build an igraph graph object from the data frame, using graph_from_data_frame.
In [4]:
interac_ppi_unique <- unique(interac_ppi)
ppi_igraph <- graph_from_data_frame(interac_ppi_unique, directed=FALSE)
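One thing worth knowing about unique on a two-column data frame: it compares rows as written, so the pair (A, B) and the reversed pair (B, A) count as different rows. The sketch below uses a made-up toy data frame (all names here are hypothetical) and shows one possible way to put each pair into a canonical order before deduplicating. This does not change component sizes, since a repeated edge connects the same two vertices, but it matters if you later count edges or degrees.

```r
## hypothetical toy pairs: one exact duplicate and one reversed duplicate
toy_ppi <- data.frame(species1=c("A", "A", "B", "C"),
                      species2=c("B", "B", "A", "D"),
                      stringsAsFactors=FALSE)

## unique() drops the exact duplicate but keeps (B, A) alongside (A, B)
toy_unique <- unique(toy_ppi)
nrow(toy_unique)

## sort each pair alphabetically so that reversed pairs become identical rows
toy_canon <- data.frame(species1=pmin(toy_ppi$species1, toy_ppi$species2),
                        species2=pmax(toy_ppi$species1, toy_ppi$species2),
                        stringsAsFactors=FALSE)
toy_canon_unique <- unique(toy_canon)
nrow(toy_canon_unique)
```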
Step 4: find the connected components of the graph ppi_igraph using the igraph function components. It returns a list, which you should assign to the object name component_res_list. Extract the csize member of the list, which is a vector of the sizes of the components of the graph, and name it component_sizes_vec. Call max on that vector to get the size of the giant component of the PPI network.
In [5]:
## call the igraph function `components` on the `ppi_igraph` object; name
## resulting object `component_res_list`
component_res_list <- components(ppi_igraph)
In [7]:
## obtain the list item in the slot named `csize`, and name the
## resulting object `component_sizes_vec`
component_sizes_vec <- component_res_list$csize
In [9]:
## use the `max` function to find the size of the giant component
max(component_sizes_vec)
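To see what the pieces of the components result look like, here is a minimal sketch on a toy graph with two components, {A, B, C} and {D, E}. The edge list and the names toy_edges, toy_graph, toy_components, giant_id and giant_names are made up for illustration.

```r
suppressPackageStartupMessages(library(igraph))

## hypothetical toy edge list: A-B, B-C form one component, D-E another
toy_edges <- data.frame(species1=c("A", "B", "D"),
                        species2=c("B", "C", "E"),
                        stringsAsFactors=FALSE)
toy_graph <- graph_from_data_frame(toy_edges, directed=FALSE)

toy_components <- components(toy_graph)
toy_components$no          # number of components
toy_components$csize       # one size per component
max(toy_components$csize)  # size of the largest component

## which vertices belong to the largest component?
giant_id <- which.max(toy_components$csize)
giant_names <- V(toy_graph)$name[toy_components$membership == giant_id]
giant_names
```

The membership element assigns a component id to every vertex, so the same pattern can be used on ppi_igraph to list the proteins in the giant component.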
Advanced code-spelunking question: go to the GitHub repo for igraph (https://github.com/igraph), and find the source file components.c. When it computes the weakly connected components, does it use a breadth-first search (BFS) or a depth-first search (DFS)?
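When reading the C source, it may help to have a small reference traversal in mind. The sketch below labels components with a queue-based (breadth-first) traversal in plain R. It is only an illustration on a made-up adjacency list, with a hypothetical function name label_components_bfs, and it makes no claim about what components.c actually does. Taking vertices from the end of the vector instead of the front would turn it into a stack-based, depth-first style traversal; either way the component labels come out the same.

```r
## hypothetical undirected toy graph as an adjacency list
adj <- list(A=c("B"), B=c("A", "C"), C=c("B"), D=c("E"), E=c("D"))

label_components_bfs <- function(adj) {
  ## NA means "not visited yet"
  membership <- setNames(rep(NA_integer_, length(adj)), names(adj))
  current <- 0L
  for (start in names(adj)) {
    if (!is.na(membership[[start]])) next
    ## unvisited vertex: start a new component from here
    current <- current + 1L
    membership[[start]] <- current
    queue <- start
    while (length(queue) > 0) {
      v <- queue[1]        # take from the front: first in, first out
      queue <- queue[-1]
      for (w in adj[[v]]) {
        if (is.na(membership[[w]])) {
          membership[[w]] <- current
          queue <- c(queue, w)
        }
      }
    }
  }
  membership
}

toy_membership <- label_components_bfs(adj)
table(toy_membership)  # component sizes
```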