=====================================
Parameters
----------
hub : "https://..." URL of the Xena hub where the data is located.
CNV_dataset : Xena data set with copy number data.
RNA_dataset : Xena data set with gene expression data.
Returns
----------
*_samples : List of patient IDs in data set.
*_probes : List of gene names in data set.
*_venn : Venn diagram figure.
In [4]:
%matplotlib inline
import xenaPython as xena
from matplotlib_venn import venn2, venn2_circles
In [5]:
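def accessXenaData(hub, data_set):
    """Return (samples, genes): the sample IDs and field (gene) names of a Xena data set."""
    # Encode each identifier as UTF-8 so both data sets use the same string type.
    samples = [x.encode('UTF8') for x in xena.xenaAPI.dataset_samples(hub, data_set)]
    genes = [x.encode('UTF8') for x in xena.xenaAPI.dataset_fields(hub, data_set)]
    return samples, genes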
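A note on the `.encode('UTF8')` step: on Python 3 it yields `bytes` objects rather than text. Because both data sets go through the same function, the later set comparisons still line up, but anything that mixes these values with ordinary strings would not match. The sketch below illustrates this with made-up sample IDs; the helper `as_text` and the IDs are hypothetical and not part of the original notebook or of xenaPython.

```python
def as_text(values):
    """Return a list of str, decoding any bytes items as UTF-8."""
    return [v.decode("utf-8") if isinstance(v, bytes) else v for v in values]

# Hypothetical sample IDs, not fetched from a Xena hub.
raw_ids = ["TCGA-AA-0001-01", "TCGA-AA-0002-01"]
encoded_ids = [x.encode("UTF8") for x in raw_ids]

# Show which type the encoded values have on the running interpreter.
print(type(encoded_ids[0]).__name__)

# Decoding restores values that compare equal to the original text IDs.
print(as_text(encoded_ids) == raw_ids)
```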
In [6]:
hub = "https://tcga.xenahubs.net"
CNV_dataset = "TCGA.OV.sampleMap/Gistic2_CopyNumber_Gistic2_all_data_by_genes"
RNA_dataset = "TCGA.OV.sampleMap/AgilentG4502A_07_3"
In [7]:
CNV_samples, CNV_probes = accessXenaData(hub, CNV_dataset)
RNA_samples, RNA_probes = accessXenaData(hub, RNA_dataset)
Convert the sample and probe lists to sets, the input type that the matplotlib_venn functions expect:
In [13]:
CNV_samples = set(CNV_samples)
CNV_probes = set(CNV_probes)
RNA_samples = set(RNA_samples)
RNA_probes = set(RNA_probes)
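The three regions of a two-set Venn diagram correspond to plain set operations, so the counts can be checked without plotting. This is a minimal sketch using made-up IDs; `venn_counts`, `cnv_toy`, and `rna_toy` are hypothetical names that stand in for the real sets above.

```python
# Hypothetical toy IDs standing in for CNV_samples and RNA_samples.
cnv_toy = {"TCGA-01", "TCGA-02", "TCGA-03"}
rna_toy = {"TCGA-02", "TCGA-03", "TCGA-04", "TCGA-05"}

def venn_counts(a, b):
    """Return the sizes of (only in a, only in b, in both a and b)."""
    return len(a - b), len(b - a), len(a & b)

only_cnv, only_rna, shared = venn_counts(cnv_toy, rna_toy)
print(only_cnv, only_rna, shared)  # prints: 1 2 2
```

The same function could be applied to `CNV_samples` and `RNA_samples` (or to the probe sets) to get the numbers that appear in each region of the figures.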
In [14]:
print("Shared Patients Between Data Sets")
patient_venn = venn2([CNV_samples, RNA_samples], ('Patients with copy\nnumber data', 'Patients with gene\nexpression data'), ('c', 'm'))
In [15]:
print("Shared Genes Between Data Sets")
# Labels follow the order of the sets: copy number first, gene expression second.
gene_venn = venn2([CNV_probes, RNA_probes], ('Whole-genome microarray\n(copy number)', 'Agilent array probes\n(gene expression)'), ('c', 'm'))