FLUORESCENCE BINDING ASSAY ANALYSIS

Experiment date: 2015/09/22

Protein: HSA

Fluorescent ligand : dansylamide (lig1)

Xml parsing parts adopted from Sonya's assaytools/examples/fluorescence-binding-assay/Src-gefitinib fluorescence simple.ipynb


In [1]:
import numpy as np
import matplotlib.pyplot as plt
from lxml import etree
import pandas as pd
import os
import matplotlib.cm as cm 
import seaborn as sns
%pylab inline


Populating the interactive namespace from numpy and matplotlib

In [2]:
# Get read and position data of each fluorescence reading section
def get_wells_from_section(path):
    reads = path.xpath("*/Well")
    wellIDs = [read.attrib['Pos'] for read in reads]

    data = [(float(s.text), r.attrib['Pos'])
         for r in reads
         for s in r]

    datalist = {
      well : value
      for (value, well) in data
    }
    
    welllist = [
                [
                 datalist[chr(64 + row) + str(col)]          
                 if chr(64 + row) + str(col) in datalist else None
                 for row in range(1,9)
                ]
                for col in range(1,13)
                ]
                
    return welllist

In [6]:
file_lig1="MI_FLU_hsa_lig1_20150922_150518.xml"
file_name = os.path.splitext(file_lig1)[0]
label = file_name[0:25]
print label


MI_FLU_hsa_lig1_20150922_

In [8]:
root = etree.parse(file_lig1)

#find data sections
Sections = root.xpath("/*/Section")
much = len(Sections)
print "****The xml file " + file_lig1 + " has %s data sections:****" % much
for sect in Sections:
    print sect.attrib['Name']


****The xml file MI_FLU_hsa_lig1_20150922_150518.xml has 3 data sections:****
ex340_em480_topRead
ex340_em480_bottomRead
Abs_600

In [9]:
#Work with topread
TopRead = root.xpath("/*/Section")[0]
welllist = get_wells_from_section(TopRead)

df_topread = pd.DataFrame(welllist, columns = ['A - HSA','B - Buffer','C - HSA','D - Buffer', 'E - HSA','F - Buffer','G - HSA','H - Buffer'])
df_topread.transpose()


Out[9]:
0 1 2 3 4 5 6 7 8 9 10 11
A - HSA 947 636 434 296 212 152 114 92 78 70 70 67
B - Buffer 540 306 184 122 95 96 84 78 83 80 81 84
C - HSA 934 646 435 314 221 150 114 91 75 71 71 70
D - Buffer 572 311 182 130 98 86 80 82 85 78 81 80
E - HSA 935 652 457 307 267 157 108 86 78 71 70 71
F - Buffer 557 310 183 121 99 87 82 82 79 80 80 81
G - HSA 946 631 486 336 245 174 130 102 92 86 197 78
H - Buffer 651 364 210 139 113 107 96 98 93 89 97 93

In [10]:
# To generate cvs file
# df_topread.transpose().to_csv(label + Sections[0].attrib['Name']+ ".csv")

Calculating Molar Fluorescence (MF) of Free Ligand

1. Maximum likelihood curve-fitting

Find the maximum likelihood estimate, $\theta^*$, i.e. the curve that minimizes the squared error $\theta^* = \text{argmin} \sum_i |y_i - f_\theta(x_i)|^2$ (assuming i.i.d. Gaussian noise)

Y = MF*L + BKG

Y: Fluorescence read (Flu unit)

L: Total ligand concentration (uM)

BKG: background fluorescence without ligand (Flu unit)

MF: molar fluorescence of free ligand (Flu unit/ uM)


In [16]:
import numpy as np
from scipy import optimize
import matplotlib.pyplot as plt
%matplotlib inline

def model(x,slope,intercept):
    ''' 1D linear model in the format scipy.optimize.curve_fit expects: '''
    return x*slope + intercept

# generate some data
#X = np.random.rand(1000)
#true_slope=1.0
#true_intercept=0.0
#noise = np.random.randn(len(X))*0.1
#Y = model(X,slope=true_slope,intercept=true_intercept) + noise

#ligand titration
lig1=np.array([200.0000,86.6000,37.5000,16.2000,7.0200, 3.0400, 1.3200, 0.5700, 0.2470, 0.1070, 0.0462, 0.0200])
lig1


Out[16]:
array([  2.00000000e+02,   8.66000000e+01,   3.75000000e+01,
         1.62000000e+01,   7.02000000e+00,   3.04000000e+00,
         1.32000000e+00,   5.70000000e-01,   2.47000000e-01,
         1.07000000e-01,   4.62000000e-02,   2.00000000e-02])

In [19]:
# Since I have 4 replicates
L=np.concatenate((lig1, lig1, lig1, lig1))
len(L)


Out[19]:
48

In [36]:
# Fluorescence read
df_topread.loc[:,("B - Buffer", "D - Buffer", "F - Buffer", "H - Buffer")]


Out[36]:
B - Buffer D - Buffer F - Buffer H - Buffer
0 540 572 557 651
1 306 311 310 364
2 184 182 183 210
3 122 130 121 139
4 95 98 99 113
5 96 86 87 107
6 84 80 82 96
7 78 82 82 98
8 83 85 79 93
9 80 78 80 89
10 81 81 80 97
11 84 80 81 93

In [41]:
B=df_topread.loc[:,("B - Buffer")]
D=df_topread.loc[:,("D - Buffer")]
F=df_topread.loc[:,("F - Buffer")]
H=df_topread.loc[:,("H - Buffer")]

Y = np.concatenate((B.as_matrix(),D.as_matrix(),F.as_matrix(),H.as_matrix()))

In [44]:
(MF,BKG),_ = optimize.curve_fit(model,L,Y)
print('MF: {0:.3f}, BKG: {1:.3f}'.format(MF,BKG))
print('y = {0:.3f} * L + {1:.3f}'.format(MF, BKG))


MF: 2.517, BKG: 86.201
y = 2.517 * L + 86.201

Curve-fitting to binding saturation curve

Fluorescence intensity vs added ligand

LR= ((X+Rtot+KD)-SQRT((X+Rtot+KD)^2-4XRtot))/2

L= X - LR

Y= BKG + MFL + FRMF*LR

Constants

Rtot: receptor concentration (uM)

BKG: background fluorescence without ligand (Flu unit)

MF: molar fluorescence of free ligand (Flu unit/ uM)

Parameters to fit

Kd: dissociation constant (uM)

FR: Molar fluorescence ratio of complex to free ligand (unitless) complex flurescence = FRMFLR

Experimental data

Y: fluorescence measurement X: total ligand concentration L: free ligand concentration


In [82]:
def model2(x,kd,fr):
    ''' 1D linear model in the format scipy.optimize.curve_fit expects: '''
    # lr =((x+rtot+kd)-((x+rtot+kd)**2-4*x*rtot)**(1/2))/2
    # y = bkg + mf*(x - lr) + fr*mf*lr
    bkg = 86.2
    mf = 2.517
    rtot = 0.5
    return bkg + mf*(x - ((x+rtot+kd)-((x+rtot+kd)**2-4*x*rtot)**(1/2))/2) + fr*mf*(((x+rtot+kd)-((x+rtot+kd)**2-4*x*rtot)**(1/2))/2)

In [78]:
# Total HSA concentration (uM)
Rtot = 0.5
#Total ligand titration
X = L
len(X)


Out[78]:
48

In [79]:
# Fluorescence read
df_topread.loc[:,("A - HSA", "C - HSA", "E - HSA", "G - HSA")]


Out[79]:
A - HSA C - HSA E - HSA G - HSA
0 947 934 935 946
1 636 646 652 631
2 434 435 457 486
3 296 314 307 336
4 212 221 267 245
5 152 150 157 174
6 114 114 108 130
7 92 91 86 102
8 78 75 78 92
9 70 71 71 86
10 70 71 70 197
11 67 70 71 78

In [80]:
A=df_topread.loc[:,("A - HSA")]
C=df_topread.loc[:,("C - HSA")]
E=df_topread.loc[:,("E - HSA")]
G=df_topread.loc[:,("G - HSA")]

Y = np.concatenate((A.as_matrix(),C.as_matrix(),E.as_matrix(),G.as_matrix()))
len(Y)


Out[80]:
48

In [81]:
(Kd,FR),_ = optimize.curve_fit(model2, X, Y, p0=(5,1))

print('Kd: {0:.3f}, Fr: {1:.3f}'.format(Kd,FR))


Kd: 30.691, Fr: 2.510

In [ ]: