Competition assay analysis and thoughts

Here we will analyze two competition assay conducted as a rough beginning to understand how to best design competition assays to the fluorescent kinase inhibitors (bosutinib, bosutinib isomer, erlotinib, and gefitinib) used in other assays in this repository.

The first (1st) part of this will be looking at data collected trying to compete off bosutinib from Src kinase with imatinib (conducted on March 11, 2015). The second (2nd) part of this will be looking at data collected trying to compete off gefitinib from Src kinase with imatinib (conducted on October 30, 2015). The third (3rd) part will be some simple modeling to see if these experiments follow our expectations and how we can better design the experiments to get better results from the competition assay. Then in a fourth (4th) section we'll work a little on a PYMC model to get affinities from the competition assay.


In [1]:
#import needed libraries
import re
import os
from lxml import etree
import pandas as pd
import pymc
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline


Couldn't import dot_parser, loading of dot files will not be possible.
:0: FutureWarning: IPython widgets are experimental and may change in the future.

Bosutinib Assay

The first attempt at a Bosutinib-Imatinib competition assay was on March 11, 2015. The full description of the assay can be found here. In short (and very similar to as described in the lab-protocols repository) 100 uL of 0.5 $\mu$M Src with a titration of Bosutinib up to 20 $\mu$M in every other row was prepared. In one of two plates 10 $\mu$M Imatinib was added to the whole plate to try to compete off the Bosutinib.

importing and plotting data the clunky way for transparency, will change to use platereader.py once it is slightly nicer.


In [2]:
def get_wells_from_section(path):
    reads = path.xpath("*/Well")
    wellIDs = [read.attrib['Pos'] for read in reads]

    data = [(float(s.text), r.attrib['Pos'])
         for r in reads
         for s in r]

    datalist = {
      well : value
      for (value, well) in data
    }
    
    welllist = [
                [
                 datalist[chr(64 + row) + str(col)]          
                 if chr(64 + row) + str(col) in datalist else None
                 for row in range(1,9)
                ]
                for col in range(1,13)
                ]
                
    return welllist

In [3]:
file_BOS= "data/2015-03-11 18-35-16_plate_1.xml"
file_name = os.path.splitext(file_BOS)[0]

In [4]:
root = etree.parse(file_BOS)

In [5]:
Sections = root.xpath("/*/Section")
much = len(Sections)
print "****The xml file " + file_BOS + " has %s data sections:****" % much
for sect in Sections:
    print sect.attrib['Name']


****The xml file data/2015-03-11 18-35-16_plate_1.xml has 7 data sections:****
280_TopRead
280_BottomRead
Abs_280
350_TopRead
350_BottomRead
Abs_350
Abs_480

In [6]:
#Just going to work with topread for now
TopRead = root.xpath("/*/Section")[0]
welllist = get_wells_from_section(TopRead)
Bos_dataframe = pd.DataFrame(welllist, columns = ['A - Src','B - Buffer','C - Src','D - Buffer', 'E - Src','F - Buffer','G - Src','H - Buffer'])

In [7]:
sns.set_palette("Paired", 10)
sns.set_context("notebook", rc={"lines.linewidth": 2.5})

Bos_dataframe.plot(figsize=(6, 4), title=file_name)
plt.xlim(-0.5,11.5)


Out[7]:
(-0.5, 11.5)

In [8]:
Bos_dataframe


Out[8]:
A - Src B - Buffer C - Src D - Buffer E - Src F - Buffer G - Src H - Buffer
0 451.0 382.0 449.0 358.0 473.0 400.0 466.0 346.0
1 366.0 179.0 385.0 170.0 378.0 185.0 410.0 157.0
2 325.0 93.0 341.0 96.0 329.0 104.0 361.0 97.0
3 295.0 79.0 311.0 77.0 315.0 85.0 316.0 78.0
4 289.0 65.0 292.0 68.0 301.0 70.0 323.0 65.0
5 277.0 59.0 285.0 58.0 289.0 60.0 304.0 58.0
6 270.0 55.0 266.0 54.0 265.0 63.0 275.0 54.0
7 272.0 57.0 200.0 54.0 299.0 60.0 211.0 55.0
8 272.0 59.0 123.0 54.0 127.0 61.0 127.0 51.0
9 261.0 55.0 85.0 54.0 88.0 60.0 89.0 53.0
10 255.0 56.0 69.0 55.0 70.0 56.0 68.0 54.0
11 254.0 58.0 57.0 55.0 56.0 55.0 51.0 54.0

In [9]:
file_BOS_IMA= "data/Ima_WIP_SMH_SrcBos_Extend_013015_mdfx_20150311_18.xml"
file_name = os.path.splitext(file_BOS_IMA)[0]
root = etree.parse(file_BOS_IMA)
Sections = root.xpath("/*/Section")
much = len(Sections)
print "****The xml file " + file_BOS_IMA + " has %s data sections:****" % much
for sect in Sections:
    print sect.attrib['Name']


****The xml file data/Ima_WIP_SMH_SrcBos_Extend_013015_mdfx_20150311_18.xml has 7 data sections:****
280_TopRead
280_BottomRead
Abs_280
350_TopRead
350_BottomRead
Abs_350
Abs_480

In [10]:
#Just going to work with topread for now
TopRead = root.xpath("/*/Section")[0]
welllist = get_wells_from_section(TopRead)
BosIma_dataframe = pd.DataFrame(welllist, columns = ['A - Src','B - Buffer','C - Src','D - Buffer', 'E - Src','F - Buffer','G - Src','H - Buffer'])
sns.set_palette("Paired", 10)
sns.set_context("notebook", rc={"lines.linewidth": 2.5})

BosIma_dataframe.plot(figsize=(6, 4), title=file_name)
plt.xlim(-0.5,11.5)


Out[10]:
(-0.5, 11.5)

In [11]:
plt.plot(BosIma_dataframe[:].values, 'r');
plt.plot(Bos_dataframe[:].values, 'k');
plt.text(8,450,'Bosutinib',fontsize=15)
plt.text(8,420,'Imatinib + Bosutinib',fontsize=15,color='red')


Out[11]:
<matplotlib.text.Text at 0x120f284d0>

Gefitinib Assay

The first attempt at a Gefitinib-Imatinib competition assay was on October 30, 2015. The full description of the assay can be found here. In short (and very similar to as described in the lab-protocols repository) 100 uL of 0.5 $\mu$M Src with a titration of Gefitinib up to 20 $\mu$M in every other row was prepared. In one of two plates 10 $\mu$M Imatinib was added to the whole plate to try to compete off the Gefitinib. Note the documentation here could be better, which could be why this data doesn't look particularly great.


In [12]:
file_GEF = "data/Gef_2015-10-30 17-55-48_plate_1.xml"
file_name = os.path.splitext(file_GEF)[0]

In [13]:
root = etree.parse(file_GEF)

In [14]:
Sections = root.xpath("/*/Section")
much = len(Sections)
print "****The xml file " + file_GEF + " has %s data sections:****" % much
for sect in Sections:
    print sect.attrib['Name']


****The xml file data/Gef_2015-10-30 17-55-48_plate_1.xml has 7 data sections:****
280_TopRead
280_BottomRead
Abs_280
350_TopRead
350_BottomRead
Abs_350
Abs_480

In [15]:
#Just going to work with topread for now
TopRead = root.xpath("/*/Section")[0]
welllist = get_wells_from_section(TopRead)
Gef_dataframe = pd.DataFrame(welllist, columns = ['A - Src','B - Buffer','C - Src','D - Buffer', 'E - Src','F - Buffer','G - Src','H - Buffer'])
Gef_dataframe.plot(figsize=(6, 4), title=file_name)
plt.xlim(-0.5,11.5)


Out[15]:
(-0.5, 11.5)

In [16]:
file_GEF_IMA= "data/GefIma_2015-10-30 17-51-13_plate_1.xml"
file_name = os.path.splitext(file_GEF_IMA)[0]
root = etree.parse(file_GEF_IMA)
Sections = root.xpath("/*/Section")
much = len(Sections)
print "****The xml file " + file_GEF_IMA + " has %s data sections:****" % much
for sect in Sections:
    print sect.attrib['Name']


****The xml file data/GefIma_2015-10-30 17-51-13_plate_1.xml has 7 data sections:****
280_TopRead
280_BottomRead
Abs_280
350_TopRead
350_BottomRead
Abs_350
Abs_480

In [17]:
#Just going to work with topread for now
TopRead = root.xpath("/*/Section")[0]
welllist = get_wells_from_section(TopRead)
GefIma_dataframe = pd.DataFrame(welllist, columns = ['A - Src','B - Buffer','C - Src','D - Buffer', 'E - Src','F - Buffer','G - Src','H - Buffer'])
sns.set_palette("Paired", 10)
sns.set_context("notebook", rc={"lines.linewidth": 2.5})

GefIma_dataframe.plot(figsize=(6, 4), title=file_name)
plt.xlim(-0.5,11.5)


Out[17]:
(-0.5, 11.5)

In [18]:
plt.plot(GefIma_dataframe[:].values, 'r');
plt.plot(Gef_dataframe[:].values, 'k');
plt.text(8,230,'Gefitinib',fontsize=15)
plt.text(8,210,'Imatinib + Gefitinib',fontsize=15,color='red')


Out[18]:
<matplotlib.text.Text at 0x12121e210>

Modeled Data

So now let's look at what our expected data might look like. Here we are looking at inhibitor affinities for Src. Some initial placeholder data from here again: http://www.guidetopharmacology.org/GRAC/LigandScreenDisplayForward?ligandId=5710&screenId=2


In [19]:
Kd_Bos = 1.0e-9 # M
Kd_Gef = 3800e-9 # M
Kd_Ima = 3000e-9 # M

From our assay setup we know the Src concentration is 0.5 $\mu$M.


In [20]:
Ptot = 0.5e-6 # M
Ltot = 20.0e-6 / np.array([10**(float(i)/2.0) for i in range(12)]) # M

First let's just plot our two component binding for Bosutinib and Gefitinib.


In [21]:
# Now we can use this to define a function that gives us PL from Kd, Ptot, and Ltot.
def two_component_binding(Kd, Ptot, Ltot):
    """
    Parameters
    ----------
    Kd : float
        Dissociation constant
    Ptot : float
        Total protein concentration
    Ltot : float
        Total ligand concentration
        
    Returns
    -------
    P : float
        Free protein concentration
    L : float
        Free ligand concentration
    PL : float
        Complex concentration
    """
                                    
    PL = 0.5 * ((Ptot + Ltot + Kd) - np.sqrt((Ptot + Ltot + Kd)**2 - 4*Ptot*Ltot))  # complex concentration (uM)
    P = Ptot - PL; # free protein concentration in sample cell after n injections (uM)                                                                                                                                                                                                                          
    L = Ltot - PL; # free ligand concentration in sample cell after n injections (uM)                                                                                                                                                                                                                           
    return [P, L, PL]

In [22]:
[Lb, Pb, PLb] = two_component_binding(Kd_Bos, Ptot, Ltot)
[Lg, Pg, PLg] = two_component_binding(Kd_Gef, Ptot, Ltot)

In [23]:
# y will be complex concentration
# x will be total ligand concentration
Bos, = plt.semilogx(Ltot,PLb,'green', label='Bosutinib')
Gef, = plt.semilogx(Ltot,PLg,'violet', label = 'Gefitinib')
plt.xlabel('$[L]_{tot}$')
plt.ylabel('$[PL]$')
plt.ylim(0,6e-7)
plt.legend(loc=3);


Now let's see how we would expect Imatinib to effect this.

From our assay setup we know the Imatinib concentration is


In [24]:
Lima =  10e-6 # M

In [25]:
#Competitive binding function
def three_component_competitive_binding(Ptot, Ltot, Kd_L, Atot, Kd_A):
    """
    Parameters
    ----------
    Ptot : float
        Total protein concentration
    Ltot : float
        Total tracer(fluorescent) ligand concentration
    Kd_L : float
        Dissociation constant
    Atot : float
        Total competitive ligand concentration
    Kd_A : float
        Dissociation constant
        
    Returns
    -------
    P : float
        Free protein concentration
    L : float
        Free ligand concentration
    A : float
        Free ligand concentration
    PL : float
        Complex concentration
    Kd_L_app : float
        Apparent dissociation constant of L in the presence of A
        
    Usage
    -----
    [P, L, A, PL, Kd_L_app] = three_component_competitive_binding(Ptot, Ltot, Kd_L, Atot, Kd_A)
    """
    Kd_L_app = Kd_L*(1+Atot/Kd_A)                                
    PL = 0.5 * ((Ptot + Ltot + Kd_L_app) - np.sqrt((Ptot + Ltot + Kd_L_app)**2 - 4*Ptot*Ltot))  # complex concentration (uM)
    P = Ptot - PL; # free protein concentration in sample cell after n injections (uM)                                                                                                                                                                                                                          
    L = Ltot - PL; # free tracer ligand concentration in sample cell after n injections (uM)
    A = Atot - PL; # free competitive ligand concentration in sample cell after n injections (uM)
    return [P, L, A, PL, Kd_L_app]

In [26]:
[Pbi, Lbi, Abi, PLbi, Kd_bima] = three_component_competitive_binding(Ptot, Ltot, Kd_Bos, Lima, Kd_Ima)
[Pgi, Lgi, Agi, PLgi, Kd_gima] = three_component_competitive_binding(Ptot, Ltot, Kd_Gef, Lima, Kd_Ima)

In [27]:
# y will be complex concentration
# x will be total ligand concentration
plt.title('Src competition assay')
Bos, = plt.semilogx(Ltot,PLb,'green', label='Bosutinib')
Bos_Ima, = plt.semilogx(Ltot,PLbi,'cyan', label='Bosutinib + Ima')
Gef, = plt.semilogx(Ltot,PLg,'violet', label = 'Gefitinib')
Gef_Ima, = plt.semilogx(Ltot,PLgi,'pink', label = 'Gefitinib + Ima')
plt.xlabel('$[L]_{tot}$')
plt.ylabel('$[PL]$')
plt.ylim(0,6e-7)
plt.legend(loc=3);


HMMM.

So I was right to think that Gefitinib should work despite the fact that Bosutinib didn't, but...

Now let's try modeling new experiments with Abl before we actually do them!


In [28]:
#Using expected Kd's from same website as above
Kd_Bos_Abl = 0.1e-9 # M
Kd_Gef_Abl = 480e-9 # M
Kd_Ima_Abl = 21.0e-9 # M

In [29]:
[Lb_Abl, Pb_Abl, PLb_Abl] = two_component_binding(Kd_Bos_Abl, Ptot, Ltot)
[Lg_Abl, Pg_Abl, PLg_Abl] = two_component_binding(Kd_Gef_Abl, Ptot, Ltot)

In [30]:
[Pbi_Abl, Lbi_Abl, Abi_Abl, PLbi_Abl, Kd_bima_Abl] = three_component_competitive_binding(Ptot, Ltot, Kd_Bos_Abl, Lima, Kd_Ima_Abl)
[Pgi_Abl, Lgi_Abl, Agi_Abl, PLgi_Abl, Kd_gima_Abl] = three_component_competitive_binding(Ptot, Ltot, Kd_Gef_Abl, Lima, Kd_Ima_Abl)

In [31]:
# y will be complex concentration
# x will be total ligand concentration
Bos, = plt.semilogx(Ltot,PLb_Abl,'green', label='Bosutinib')
Bos_Ima, = plt.semilogx(Ltot,PLbi_Abl,'cyan', label='Bosutinib + Ima')
Gef, = plt.semilogx(Ltot,PLg_Abl,'violet', label = 'Gefitinib')
Gef_Ima, = plt.semilogx(Ltot,PLgi_Abl,'pink', label = 'Gefitinib + Ima')
plt.title('Abl competition assay')
plt.xlabel('$[L]_{tot}$')
plt.ylabel('$[PL]$')
plt.ylim(0,6e-7)
plt.legend(loc=3);


Looks promising!

Let's check out our new data set based on this.


In [32]:
def get_wells_from_section(path):
    reads = path.xpath("*/Well")
    wellIDs = [read.attrib['Pos'] for read in reads]

    data = [(s.text, r.attrib['Pos'])
         for r in reads
         for s in r]

    datalist = {
      well : value
      for (value, well) in data
    }
    
    welllist = [
                [
                 datalist[chr(64 + row) + str(col)]          
                 if chr(64 + row) + str(col) in datalist else None
                 for row in range(1,9)
                ]
                for col in range(1,13)
                ]
                
    return welllist

In [33]:
file_ABL_GEF= "data/Abl Gef gain 120 bw1020 2016-01-19 15-59-53_plate_1.xml"
file_name = os.path.splitext(file_ABL_GEF)[0]
root = etree.parse(file_ABL_GEF)
Sections = root.xpath("/*/Section")
much = len(Sections)
print "****The xml file " + file_ABL_GEF + " has %s data sections:****" % much
for sect in Sections:
    print sect.attrib['Name']


****The xml file data/Abl Gef gain 120 bw1020 2016-01-19 15-59-53_plate_1.xml has 7 data sections:****
280_TopRead
280_BottomRead
Abs_280
350_TopRead
350_BottomRead
Abs_350
Abs_480

In [34]:
#Just going to work with topread for now
TopRead = root.xpath("/*/Section")[0]
welllist = get_wells_from_section(TopRead)
AblGef_dataframe = pd.DataFrame(welllist, columns = ['A - Abl','B - Buffer','C - Abl','D - Buffer', 'E - Abl','F - Buffer','G - Abl','H - Buffer'])
#AN ERROR FOR 'OVERS' COMES UP UNLESS THE NEXT LINE IS HERE
#THE MAX VALUE IS TAKEN FROM THE MAX VALUE FOR THE ABL GEF IMA DATA
dataframe_rep = AblGef_dataframe.replace({'OVER':'64060.0'})
AblGef_dataframe


Out[34]:
A - Abl B - Buffer C - Abl D - Buffer E - Abl F - Buffer G - Abl H - Buffer
0 62213 19784 62890 19797 62821 20594 63317 17635
1 62497 17332 62936 18205 62927 18102 63343 14681
2 63194 12950 63056 13223 62794 13049 63251 10586
3 63017 10782 62743 11030 62841 10622 63045 9235
4 56558 10186 62107 10257 61824 9884 55471 8785
5 34259 9837 38003 9884 38786 9620 35771 8796
6 22106 9861 24868 9716 24637 9757 23488 8789
7 14589 9659 16880 10010 17193 9518 17755 8612
8 11475 9601 12311 10035 13272 9758 12312 8625
9 10505 9669 11388 9891 11702 9649 10925 8513
10 9771 9722 10457 10630 10585 9889 10062 8840
11 8652 9757 9712 9396 9653 9578 9429 8671

In [35]:
#dataframe_rep[['fluorescence']] = dataframe_rep[['fluorescence']].astype('float')
dataframe_rep = dataframe_rep.astype('float')
sns.set_palette("Paired", 10)
sns.set_context("notebook", rc={"lines.linewidth": 2.5})

dataframe_rep.plot(figsize=(6, 4), title=file_name)
plt.xlim(-0.5,11.5);



In [36]:
file_ABL_GEF_IMA= "data/Abl Gef Ima gain 120 bw1020 2016-01-19 16-22-45_plate_1.xml"
file_name = os.path.splitext(file_ABL_GEF_IMA)[0]
root = etree.parse(file_ABL_GEF_IMA)
Sections = root.xpath("/*/Section")
much = len(Sections)
print "****The xml file " + file_ABL_GEF_IMA + " has %s data sections:****" % much
for sect in Sections:
    print sect.attrib['Name']


****The xml file data/Abl Gef Ima gain 120 bw1020 2016-01-19 16-22-45_plate_1.xml has 7 data sections:****
280_TopRead
280_BottomRead
Abs_280
350_TopRead
350_BottomRead
Abs_350
Abs_480

In [37]:
#Just going to work with topread for now
TopRead = root.xpath("/*/Section")[0]
welllist = get_wells_from_section(TopRead)
AblGefIma_dataframe = pd.DataFrame(welllist, columns = ['A - Abl','B - Buffer','C - Abl','D - Buffer', 'E - Abl','F - Buffer','G - Abl','H - Buffer'])
sns.set_palette("Paired", 10)
sns.set_context("notebook", rc={"lines.linewidth": 2.5})
AblGefIma_dataframe = AblGefIma_dataframe.astype('float')

AblGefIma_dataframe.plot(figsize=(6, 4), title=file_name)
plt.xlim(-0.5,11.5)


Out[37]:
(-0.5, 11.5)

In [38]:
AblGefIma_dataframe.values.max()


Out[38]:
63432.0

In [39]:
plt.plot(AblGefIma_dataframe[:].values, 'r');
plt.plot(AblGef_dataframe[:].values, 'k');
plt.text(8,60000,'Gefitinib (ABL)',fontsize=15)
plt.text(8,55000,'Imatinib + Gefitinib (ABL)',fontsize=15,color='red')
plt.savefig('Abl_Gef_Ima_Jan2016_repeat.png')



In [ ]:


In [ ]:


In [ ]: