In [2]:
import kinact

# Load the example dataset: log2 fold changes and the matching p-values
data_log2, data_p_value = kinact.get_example_data()
print(data_log2.head())
NetworKIN takes two input files: a FASTA file with the protein sequences and a site file listing the phosphosites.
The function prepare_networkin_files writes both files, in the required layout, to a specified directory. It takes a list of phosphosites in the format Uniprot-Accession-ID_ResiduePosition.
In [3]:
kinact.networkin.prepare_networkin_files(phospho_sites=data_log2.index.tolist(),
output_dir='./networkin_example_files/',
organism='human')
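Before generating the files, it can help to check that every identifier follows the expected layout. The following is a minimal sketch, not part of kinact. It assumes the residue part consists of a one-letter amino acid code followed by the position (for example `P12345_S123`); the names `SITE_PATTERN`, `parse_site`, and `split_valid_sites` are hypothetical.

```python
import re

# Assumed layout: UniProt accession, underscore, one-letter residue, position,
# e.g. "P12345_S123". Adjust the pattern if your identifiers differ.
SITE_PATTERN = re.compile(r"^([A-Za-z0-9-]+)_([A-Z])(\d+)$")


def parse_site(site_id):
    """Return (accession, residue, position) or None if the layout does not match."""
    match = SITE_PATTERN.match(site_id)
    if match is None:
        return None
    accession, residue, position = match.groups()
    return accession, residue, int(position)


def split_valid_sites(site_ids):
    """Separate identifiers that match the assumed layout from those that do not."""
    valid, invalid = [], []
    for site_id in site_ids:
        if parse_site(site_id) is None:
            invalid.append(site_id)
        else:
            valid.append(site_id)
    return valid, invalid
```

Calling `split_valid_sites(data_log2.index.tolist())` would then show which entries need attention before they are passed to prepare_networkin_files.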
NetworKIN can be used via the high-throughput version of its web interface. Select 'Human - UniProt' or 'Yeast - Uniprot' from the drop-down menu and paste the contents of the file 'site_file.txt' into the dedicated field. Some phosphosites may not be matched correctly because of differences between versions of the UniProt database; these have to be removed manually. After you click the 'Submit' button, NetworKIN tries to map the UniProt identifiers to STRING in order to integrate contextual information into the prediction. The next page lists possible problems with the matching and prompts you to select isoforms or homologs. After you click 'Next' at the bottom of the page, NetworKIN predicts likely upstream kinases.
On the page displaying the results, there is a 'Save' button. Select 'Full Dataset' and save the file as output.txt.
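If many phosphosites fail to match, removing them by hand becomes tedious. The sketch below shows one way to script that step. It assumes that each line of 'site_file.txt' starts with the protein accession as its first whitespace-separated field, which you should verify against your own file; the function name `drop_unmatched_sites` is hypothetical.

```python
def drop_unmatched_sites(lines, unmatched_accessions):
    """Keep only lines whose first field is not in the set of unmatched accessions.

    Assumption: the accession is the first whitespace-separated field of a line.
    """
    unmatched = set(unmatched_accessions)
    kept = []
    for line in lines:
        fields = line.split()
        if fields and fields[0] in unmatched:
            continue  # skip sites of proteins the web interface could not match
        kept.append(line)
    return kept


# Possible usage (paths are taken from the example above):
# with open('./networkin_example_files/site_file.txt') as handle:
#     lines = handle.read().splitlines()
# filtered = drop_unmatched_sites(lines, ['P12345'])
```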
NetworKIN can also be run locally on your machine, which may be easier depending on the number of phosphosites in your dataset. To do so, download the NetworKIN release, the NetPhorest release, and the BLAST algorithm from the dedicated websites (important: BLAST has to be version 2.2.17, which can be found here). NetPhorest then has to be compiled with a gcc compiler, version 3.x, like this:
cd "NetPhorest-directory"
cc -O3 -o netphorest netphorest.c -lm
The prediction can then be performed with the following command:
python "path to NetworKIN.py" -n "path to netphorest" -b "path to blast" "Taxon Identifier for organism of interest" fasta_file site_file
e.g.:
python ./NetworKIN.py -n ../netphorest/netphorest -b ../blast-2.2.17/bin/blastall 9606 ./fasta_file.txt ./site_file.txt > output.txt
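The same call can be launched from Python, which keeps the whole workflow in one notebook. This is a minimal sketch using the standard `subprocess` module. The helper names `build_networkin_command` and `run_networkin` are hypothetical, and it is an assumption that the `python` executable on your path is the interpreter NetworKIN expects.

```python
import subprocess


def build_networkin_command(networkin_script, netphorest_bin, blast_bin,
                            taxon_id, fasta_file, site_file):
    """Assemble the argument list in the order used by the command above."""
    return ["python", networkin_script,
            "-n", netphorest_bin,
            "-b", blast_bin,
            str(taxon_id), fasta_file, site_file]


def run_networkin(command, output_path):
    """Run the command and redirect its standard output to output_path."""
    with open(output_path, "w") as handle:
        subprocess.check_call(command, stdout=handle)


# Possible usage, mirroring the example command:
# cmd = build_networkin_command('./NetworKIN.py', '../netphorest/netphorest',
#                               '../blast-2.2.17/bin/blastall', 9606,
#                               './fasta_file.txt', './site_file.txt')
# run_networkin(cmd, 'output.txt')
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.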
The output file can then be used to create the adjacency matrix with a dedicated function in kinact.
In [4]:
adjacency_matrix = kinact.networkin.get_kinase_targets_from_networkin('./networkin_example_files/output.txt',
add_omnipath=False, score_cut_off=1)
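To illustrate what a score cut-off does when building an adjacency structure, here is a small stand-alone sketch. It is not kinact's implementation: the input is a hypothetical list of (kinase, site, score) tuples rather than the NetworKIN output file, and treating the cut-off as inclusive is an assumption.

```python
def build_adjacency(predictions, score_cut_off=1.0):
    """Map each site to the kinases predicted for it, keeping the scores as weights.

    predictions: iterable of (kinase, site, score) tuples (illustrative layout).
    Predictions scoring below score_cut_off are discarded.
    """
    adjacency = {}
    for kinase, site, score in predictions:
        if score < score_cut_off:
            continue  # prediction too weak to count as an interaction
        adjacency.setdefault(site, {})[kinase] = score
    return adjacency


example = [("AKT1", "P1_S10", 2.5),
           ("CDK1", "P1_S10", 0.4),
           ("CDK1", "P2_T20", 1.0)]
filtered = build_adjacency(example, score_cut_off=1)
```

A higher cut-off yields a sparser matrix with fewer but more confident kinase-substrate pairs.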
In [5]:
scores, p_values = kinact.networkin.weighted_mean(data_fc=data_log2['5min'],
interactions=adjacency_matrix,
mP=data_log2.values.mean(),
delta=data_log2.values.std())
print(scores.sort_values(ascending=False).head())
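For intuition about a weighted mean over kinase targets, the sketch below computes a generic weighted average of fold changes for one kinase, with the interaction scores as weights. It only illustrates the general idea; whether kinact normalizes in the same way, and how it derives p-values from mP and delta, is not shown here and should be checked in the kinact source. The function name `weighted_mean_score` is hypothetical.

```python
def weighted_mean_score(fold_changes, weights):
    """Weighted average of fold changes over the sites that carry a weight.

    fold_changes: dict mapping site -> log2 fold change
    weights: dict mapping site -> interaction score for one kinase
    Returns None if no weighted site has a measured fold change.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for site, weight in weights.items():
        if site not in fold_changes:
            continue  # site not quantified in this experiment
        weighted_sum += weight * fold_changes[site]
        total_weight += weight
    if total_weight == 0:
        return None
    return weighted_sum / total_weight
```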