Venn Diagram Summary of OV TCGA Data

Access gene expression and gene copy number data using the UCSC Xena Python API and generate Venn diagrams showing the overlap between the two data sets, both in patients and in genes.

=====================================

Author: Thomas Silvers

Date: 20170917

Parameters
----------

hub         :   "https://..." URL of the Xena hub where the data are located

CNV_dataset :   Xena data set with copy number data

RNA_dataset :   Xena data set with gene expression data

Returns
----------

*_samples   :   List of patient IDs in data set.

*_probes    :   List of gene names in data set.

*_venn      :   Venn diagram figure.

In [4]:
%matplotlib inline

import xenaPython as xena
from matplotlib_venn import venn2

In [5]:
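def accessXenaData(hub, data_set):
    """Return the sample IDs and field (gene) names of a Xena data set."""
    # The API returns unicode strings; encode them as UTF-8 byte strings.
    samples = [x.encode('UTF8') for x in xena.xenaAPI.dataset_samples(hub, data_set)]
    genes = [x.encode('UTF8') for x in xena.xenaAPI.dataset_fields(hub, data_set)]
    return samples, genes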
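
A note on the `encode('UTF8')` calls: under Python 3 they would yield `bytes` objects rather than text, which still works for set comparisons but prints with a `b'...'` prefix. The following is a minimal sketch of a normalizing helper, assuming the identifiers arrive as either text or bytes; the name `to_text` and the sample values are hypothetical and not part of xenaPython.

```python
def to_text(values, encoding="utf-8"):
    """Return a list of text strings, decoding any bytes items.

    Hypothetical helper for illustration; not part of xenaPython.
    """
    return [v.decode(encoding) if isinstance(v, bytes) else v for v in values]


# Made-up identifiers, mixing bytes and text on purpose.
mixed = [b"sample_A", "sample_B"]
print(to_text(mixed))
```

Applying such a helper to both data sets before building the sets would keep the identifiers comparable regardless of how they were returned.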

In [6]:
hub = "https://tcga.xenahubs.net"
CNV_dataset = "TCGA.OV.sampleMap/Gistic2_CopyNumber_Gistic2_all_data_by_genes"
RNA_dataset = "TCGA.OV.sampleMap/AgilentG4502A_07_3"

In [7]:
CNV_samples, CNV_probes = accessXenaData(hub, CNV_dataset)
RNA_samples, RNA_probes = accessXenaData(hub, RNA_dataset)

Convert the lists to sets, the type that the matplotlib_venn functions expect:


In [13]:
CNV_samples = set(CNV_samples)
CNV_probes = set(CNV_probes)
RNA_samples = set(RNA_samples)
RNA_probes = set(RNA_probes)
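
With the data held as sets, the three region sizes of a two-set Venn diagram can also be computed directly, which is a handy check on the figures. The sketch below uses made-up toy sets rather than the TCGA data, and `overlap_counts` is a hypothetical helper name.

```python
def overlap_counts(a, b):
    """Return (only in a, only in b, in both) for two collections."""
    a, b = set(a), set(b)
    return len(a - b), len(b - a), len(a & b)


# Toy stand-ins for the sample sets; these are not real TCGA identifiers.
cnv_toy = {"s1", "s2", "s3"}
rna_toy = {"s2", "s3", "s4", "s5"}

only_cnv, only_rna, both = overlap_counts(cnv_toy, rna_toy)
print(only_cnv, only_rna, both)  # 1 2 2
```

The tuple follows the (left only, right only, both) order, which is also the order `venn2` accepts when it is given three sizes instead of two sets.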

In [14]:
print("Shared Patients Between Data Sets")
patient_venn = venn2([CNV_samples, RNA_samples], ('Patients with copy\nnumber data', 'Patients with gene\nexpression data'), ('c', 'm'))


Shared Patients Between Data Sets

In [15]:
print("Shared Genes Between Data Sets")
gene_venn = venn2([CNV_probes, RNA_probes], ('Whole-genome microarray\n(copy number)', 'Agilent array probes\n(gene expression)'), ('c', 'm'))


Shared Genes Between Data Sets