The previous version of this notebook attempted to run the CPAN module for InterologWalk, but there were problems installing it and getting it to run locally. It turned out that InterologWalk had already been run over a large set of proteins at Edinburgh and that the output file was available, which makes this task much easier.
We start by looking at this file and then loading it:
In [1]:
cd ../../InterologWalk/
In [2]:
ls
In [3]:
!head IW_entrez.csv
In [4]:
import csv
As was done with the STRING notebook, we will create an ocbio.ppipred.features
object to store the dictionary of interactions.
This object can then be pickled and loaded later when assembling feature vectors.
In [6]:
# Key each row by the frozenset of its fields; ['1'] flags the entry as present
featuredict = {}
with open("IW_entrez.csv") as f:
    for line in csv.reader(f, delimiter="\t"):
        featuredict[frozenset(line)] = ['1']
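One consequence of keying the dictionary on frozensets is that a lookup does not depend on the order in which the two identifiers are given. The following is only a small illustration of that property, using a plain dict and made-up Entrez IDs rather than the real file or the ocbio.ppipred.features object:

```python
# Hypothetical Entrez IDs, for illustration only.
pairs = {frozenset(["1234", "4321"]): ['1']}

# frozensets hash and compare by content, so element order is irrelevant.
forward = frozenset(["1234", "4321"])
reverse = frozenset(["4321", "1234"])
same_key = forward == reverse

# The entry is found even though the IDs are given in the opposite order.
value = pairs[reverse]

# A row that repeats the same ID collapses to a single-element key.
self_key = frozenset(["1234", "1234"])
```

Note that any row repeating the same identifier would therefore produce a key of length one rather than two.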
In [8]:
import sys
In [9]:
sys.path.append("../opencast-bio/")
In [10]:
import ocbio.ppipred
In [11]:
features = ocbio.ppipred.features(featuredict,1)
In [12]:
# Take one existing key; next(iter(...)) works on both Python 2 and Python 3
realkey = next(iter(featuredict))
fakekey = frozenset(["1234","4321"])
In [13]:
features[realkey]
Out[13]:
In [14]:
features[fakekey]
Out[14]:
In [15]:
import pickle
In [16]:
with open("human.interologwalk.features.pickle", "wb") as f:
    pickle.dump(features, f)
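Loading the pickle back when assembling feature vectors is the reverse operation. The sketch below shows the round trip under simplifying assumptions: it uses a plain dict with made-up IDs as a stand-in for the features object, and writes to a temporary file (the name example.features.pickle is hypothetical) so that the real output is not touched.

```python
import os
import pickle
import tempfile

# Stand-in for the interaction dictionary; the IDs are made up.
example = {frozenset(["1234", "4321"]): ['1']}

# Write to a temporary directory so no real file is overwritten.
path = os.path.join(tempfile.mkdtemp(), "example.features.pickle")
with open(path, "wb") as f:
    pickle.dump(example, f)

# Read it back, as would be done when assembling feature vectors.
with open(path, "rb") as f:
    loaded = pickle.load(f)

# frozenset keys survive the round trip and remain order-independent.
restored_value = loaded[frozenset(["4321", "1234"])]
```

When unpickling the real file, ocbio.ppipred has to be importable in the loading session (for example via the same sys.path.append call used above), because pickle stores a reference to the object's class rather than the class itself.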