Procedural programming in Python

Topics

  • Flow control, part 2
    • Functions
    • In class exercise:
      • Functionalize this!
    • From nothing to something:
      • Pairwise correlation between rows in a pandas dataframe
      • Sketch of the process
      • In class exercise:
        • Write the code!
      • Rejoining, sharing ideas, problems, thoughts


Flow control

Flow control figure

Flow control refers to how programs handle loops, conditional execution, and the order of functional operations.

If

If statements can be used to execute a line or block of code when a particular condition is satisfied. E.g., let's print something based on the entries in the list.


In [ ]:
instructors = ['Dave', 'Jim', 'Dorkus the Clown']

if 'Dorkus the Clown' in instructors:
    print('#fakeinstructor')

There is a special do-nothing keyword, pass, that fills an arm of a conditional where no action is needed, e.g.


In [ ]:
if 'Jim' in instructors:
    print("Congratulations!  Jim is teaching, your class won't stink!")
else:
    pass
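A conditional can also have more than two arms. The following is a minimal sketch of if...elif...else that reuses the instructors list from above; the messages are made up for illustration.

```python
instructors = ['Dave', 'Jim', 'Dorkus the Clown']

# Only the first arm whose condition is true runs; the others are skipped.
if len(instructors) == 0:
    message = 'No instructors today.'
elif len(instructors) == 1:
    message = 'One instructor today.'
else:
    message = 'Several instructors today.'
print(message)
```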

For

For loops are the standard loop, though while is also common. For has the general form:

for items in list:
    do stuff

For loops and collections like tuples, lists and dictionaries are natural friends.


In [ ]:
for instructor in instructors:
    print(instructor)
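The same pattern works for the other collections. Here is a short sketch, assuming a hypothetical offices dictionary invented only for this example, that loops over a dictionary and uses enumerate() to get positions in a list.

```python
# A hypothetical mapping from instructor to office number, for illustration only.
offices = {'Dave': 101, 'Jim': 102}

# Looping over a dictionary yields its keys.
for name in offices:
    print(name, offices[name])

# .items() yields (key, value) tuples that can be unpacked in the for statement.
for name, room in offices.items():
    print(name + ' is in room ' + str(room))

# enumerate() pairs each item in a list with its position.
instructors = ['Dave', 'Jim', 'Dorkus the Clown']
positions = []
for index, instructor in enumerate(instructors):
    positions.append((index, instructor))
print(positions)
```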

You can combine loops and conditionals:


In [ ]:
for instructor in instructors:
    if instructor.endswith('Clown'):
        print(instructor + " doesn't sound like a real instructor name!")
    else:
        print(instructor + " is so smart... all those gooey brains!")

range()

Since for operates over lists, it is common to want to do something like:

NOTE: C-like
for (i = 0; i < 3; ++i) {
    print(i);
}

The Python equivalent is:

for i in [0, 1, 2]:
    do something with i

What happens when the range you want to sample is big, e.g.

NOTE: C-like
for (i = 0; i < 1000000000; ++i) {
    print(i);
}

It would be a real pain in the rear to write out the entire list from 0 to 999999999.

Enter the range() function. E.g. range(3) produces the values 0, 1, 2. In Python 3 it is a lazy sequence rather than a list; list(range(3)) gives [0, 1, 2].


In [1]:
total = 0
for i in range(10):
    total += i
print(total)


45
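A few more things range() can do, as a short sketch: it accepts a start, stop, and step, and it never builds the full list in memory, which is why the billion-iteration case is not a problem.

```python
# range() produces numbers on demand rather than storing a list.
numbers = range(3)
print(list(numbers))          # convert to a list to see the values

# range(start, stop, step): the stop value is excluded.
evens = list(range(0, 10, 2))
print(evens)

# A huge range costs almost no memory; its length and items are computed.
big = range(1000000000)
print(len(big), big[-1])
```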

In [ ]:
data.head()

Now, use your code from above for the following URLs and filenames

URL filename csv_filename
http://faculty.washington.edu/dacb/HCEPDB_moldata_set1.zip HCEPDB_moldata_set1.zip HCEPDB_moldata_set1.csv
http://faculty.washington.edu/dacb/HCEPDB_moldata_set2.zip HCEPDB_moldata_set2.zip HCEPDB_moldata_set2.csv
http://faculty.washington.edu/dacb/HCEPDB_moldata_set3.zip HCEPDB_moldata_set3.zip HCEPDB_moldata_set3.csv

What pieces of the data structures and flow control that we talked about earlier can you use?


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:

How did you solve this problem?
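One possible approach, sketched here without the actual download step, is to build a list of tuples with a for loop and then unpack each tuple in a second loop. The names base_url, stem, and datasets are assumptions made for this example.

```python
base_url = 'http://faculty.washington.edu/dacb/'

# Build one (url, filename, csv_filename) tuple per data set.
datasets = []
for i in range(1, 4):
    stem = 'HCEPDB_moldata_set' + str(i)
    datasets.append((base_url + stem + '.zip', stem + '.zip', stem + '.csv'))

# Unpack each tuple; the download and read code would go in this loop body.
for url, filename, csv_filename in datasets:
    print(url, filename, csv_filename)
```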


Functions

For loops let you repeat some code for every item in a list. Functions are similar in that they run the same lines of code for new values of some variable. They are different in that functions are not limited to looping over items.

Functions are a critical part of writing easy to read, reusable code.

Create a function like:

def function_name(parameters):
    """
    docstring
    """
    function expressions
    return [variable]

Note: Sometimes I use the word argument in place of parameter.

Here is a simple example. It prints the string that was passed in, then prints its characters one at a time up to and including the first 'r', prints "done", and returns nothing.


In [20]:
def print_string(str):
    """This prints out a string passed as the parameter."""
    print(str)
    for c in str:
        print(c)
        if c == 'r':
            break
    print("done")
    return

In [21]:
print_string("string")


string
s
t
r
done

To call the function, use:

print_string("Dave is awesome!")

Note: The function has to be defined before you can call it!


In [ ]:
print_string("Dave is awesome!")

If you provide too few or too many arguments, you get an error.


In [22]:
print_string()


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-ad26026057f7> in <module>()
----> 1 print_string()

TypeError: print_string() missing 1 required positional argument: 'str'

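Parameters (or arguments) in Python are passed by object reference: the function receives a reference to the same object the caller holds. This means that if you modify a mutable parameter (such as a list) in place inside the function, the change is visible outside of the function. Assigning a new value to the parameter name inside the function only rebinds the local name and does not affect the caller.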

See the following example:

In [23]:
def change_list(my_list):
    """This changes a list passed into this function"""
    my_list.append('four')
    print('list inside the function: ', my_list)
    return

my_list = [1, 2, 3]
print('list before the function: ', my_list)
change_list(my_list)
print('list after the function: ', my_list)


list before the function:  [1, 2, 3]
list inside the function:  [1, 2, 3, 'four']
list after the function:  [1, 2, 3, 'four']
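Changing an object in place and reassigning a name behave differently, and it helps to see them side by side. This is a minimal sketch; append_item and rebind_name are hypothetical names used only for illustration.

```python
def append_item(a_list):
    """Mutate the list object the caller passed in."""
    a_list.append('four')

def rebind_name(a_list):
    """Rebind the local name to a new list; the caller's list is untouched."""
    a_list = ['something', 'else']
    return a_list

numbers = [1, 2, 3]
append_item(numbers)
print(numbers)    # the appended item is visible outside the function

rebind_name(numbers)
print(numbers)    # unchanged by the reassignment inside rebind_name
```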

Variables have scope: global and local

In a function, new variables that you create are not saved when the function returns; these are local variables. Variables defined outside of the function can be read inside it but, by default, not reassigned; these are global variables. Assigning to a global's name inside a function creates a new local variable instead, unless you first declare the name with the global keyword. Generally, the use of global variables is discouraged; use parameters instead.

my_global_1 = 'bad idea'
my_global_2 = 'another bad one'
my_global_3 = 'better idea'

def my_function():
    print(my_global_1)
    my_global_2 = 'broke your global, man!'
    global my_global_3
    my_global_3 = 'still a better idea'
    return

my_function()
print(my_global_2)
print(my_global_3)

In [25]:
my_global_1 = 'bad idea'
my_global_2 = 'another bad one'
my_global_3 = 'better idea'

def my_function():
    print(my_global_1)
    my_global_2 = 'broke your global, man!'
    print(my_global_2)
    global my_global_3
    my_global_3 = 'still a better idea'
    return

my_function()
print(my_global_2)
print(my_global_3)


bad idea
broke your global, man!
another bad one
still a better idea

In general, you want to use parameters to provide data to a function and return a result with the return statement. E.g.

def add(x, y):
    my_sum = x + y
    return my_sum

If you are going to return multiple objects, what data structure that we talked about can be used? Give an example below.


In [30]:
def a_function(parameter):
    return None

In [31]:
foo = a_function('bar')
print(foo)


None
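One possible answer to the question above is a tuple. The sketch below uses a hypothetical min_and_max function to show a function returning two values and the caller unpacking them.

```python
def min_and_max(values):
    """Return the smallest and largest entries as a tuple."""
    return min(values), max(values)

# The result is a single tuple object...
result = min_and_max([3, 1, 4, 1, 5])
print(result)

# ...which can be unpacked into separate names at the call site.
lowest, highest = min_and_max([3, 1, 4, 1, 5])
print(lowest, highest)
```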

Parameters have three different types:

type      behavior
required  positional, must be present or error, e.g. my_func(first_name, last_name)
keyword   position independent, e.g. my_func(first_name, last_name) can be called my_func(first_name='Dave', last_name='Beck') or my_func(last_name='Beck', first_name='Dave')
default   keyword params that default to a value if not provided

In [32]:
def print_name(first, last='the Clown'):
    print('Your name is %s %s' % (first, last))
    return

Take a minute and play around with the above function. Which are required? Keyword? Default?
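A few calls worth trying, as a sketch: the function is repeated here so the block runs on its own, and the last call deliberately leaves out the required parameter to show the error.

```python
def print_name(first, last='the Clown'):
    print('Your name is %s %s' % (first, last))
    return

print_name('Dorkus')                      # positional; last falls back to its default
print_name('Dave', 'Beck')                # both positional
print_name(last='Beck', first='Dave')     # keywords, in any order

# first has no default, so leaving it out raises a TypeError.
try:
    print_name(last='Beck')
except TypeError as error:
    print('TypeError:', error)
```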


In [34]:
def massive_correlation_analysis(data, method='pearson'):
    pass
    return

Functions can contain any code that you put anywhere else including:

  • if...elif...else
  • for...else
  • while
  • other function calls

In [39]:
def print_name_age(first, last, age):
    print_name(first, last)
    print('Your age is %d' % (age))
    print('Your age is ' + str(age))
    if age > 35:
        print('You are really old.')
    return

In [40]:
print_name_age(age=40, last='Beck', first='Dave')


Your name is Dave Beck
Your age is 40
Your age is 40
You are really old.
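The other constructs in the list above work inside functions too. This is a small sketch of for...else and while; first_clown and countdown are hypothetical names made up for this example.

```python
def first_clown(names):
    """Return the first name ending in 'Clown', or None if there is none."""
    for name in names:
        if name.endswith('Clown'):
            found = name
            break
    else:
        # The else arm of a for loop runs only when the loop was not ended by break.
        found = None
    return found

def countdown(start):
    """Collect start, start - 1, ..., 1 using a while loop."""
    values = []
    while start > 0:
        values.append(start)
        start -= 1
    return values

print(first_clown(['Dave', 'Jim', 'Dorkus the Clown']))
print(first_clown(['Dave', 'Jim']))
print(countdown(3))
```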

Once you have some code that is functionalized and not going to change, you can move it to a file that ends in .py, check it into version control, import it into your notebook and use it!

Let's do this now for the above two functions.
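The mechanics can be sketched as follows. The module name name_utils is an assumption for this example, and the file is written from Python to a scratch directory only so the sketch is self-contained; in practice you would save the .py file next to your notebook with an editor.

```python
import importlib
import os
import sys
import tempfile

# Source for a small module holding the two functions from above.
module_source = '''
def print_name(first, last='the Clown'):
    print('Your name is %s %s' % (first, last))
    return

def print_name_age(first, last, age):
    print_name(first, last)
    print('Your age is %d' % (age))
    return
'''

# Write the module file into a scratch directory.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'name_utils.py'), 'w') as f:
    f.write(module_source)

# Python looks for modules in the directories listed in sys.path.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import name_utils

name_utils.print_name_age('Dave', 'Beck', 40)
```

When the .py file sits in the same directory as the notebook, the sys.path step is usually unnecessary and a plain import statement is enough.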

...

See you after the break!

Import the function...


In [ ]:

Call them!


In [ ]:


Hacky Hack Time with Functions!

Notes from last class:

  • The os package has tools for checking if a file exists: os.path.exists
    import os
    filename = 'HCEPDB_moldata.zip'
    if os.path.exists(filename):
      print("wahoo!")
  • Use the requests package to get the file given a url (got this from the requests docs)
    import requests
    url = 'http://faculty.washington.edu/dacb/HCEPDB_moldata.zip'
    req = requests.get(url)
    assert req.status_code == 200 # if the download failed, this line will generate an error
    with open(filename, 'wb') as f:
      f.write(req.content)
  • Use the zipfile package to decompress the file while reading it into pandas
    import pandas as pd
    import zipfile
    csv_filename = 'HCEPDB_moldata.csv'
    zf = zipfile.ZipFile(filename)
    data = pd.read_csv(zf.open(csv_filename))

Here was my solution

import os
import requests
import pandas as pd
import zipfile

filename = 'HCEPDB_moldata.zip'
url = 'http://faculty.washington.edu/dacb/HCEPDB_moldata.zip'
csv_filename = 'HCEPDB_moldata.csv'

if os.path.exists(filename):
    pass
else:
    req = requests.get(url)
    assert req.status_code == 200 # if the download failed, this line will generate an error
    with open(filename, 'wb') as f:
        f.write(req.content)

zf = zipfile.ZipFile(filename)
data = pd.read_csv(zf.open(csv_filename))

In class exercise

5-10 minutes

Objective: How would you functionalize the code for downloading, unzipping, and making a dataframe?

Bonus! Add the code to a file HCEPDB_utils.py and import it!


In [ ]:
def download_if_not_exists(url, filename):
    """Download url to filename unless the file already exists."""
    if os.path.exists(filename):
        pass
    else:
        req = requests.get(url)
        assert req.status_code == 200 # if the download failed, this line will generate an error
        with open(filename, 'wb') as f:
            f.write(req.content)

In [ ]:


In [ ]:


In [ ]:


In [ ]:

How many functions did you use?

Why did you choose to use functions for these pieces?
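One piece that can be split out is the unzip-and-read step. The sketch below is one possible way to do it: load_zipped_csv is a hypothetical name, and a tiny archive built on the spot stands in for the real download so the block runs without a network connection.

```python
import os
import tempfile
import zipfile

import pandas as pd

def load_zipped_csv(filename, csv_filename):
    """Read one CSV member of a zip archive into a pandas dataframe."""
    with zipfile.ZipFile(filename) as zf:
        with zf.open(csv_filename) as csv_file:
            return pd.read_csv(csv_file)

# Build a tiny stand-in archive so the function can be tried without a download.
zip_path = os.path.join(tempfile.mkdtemp(), 'example.zip')
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('example.csv', 'id,mass\n1,394.3\n2,400.4\n')

data = load_zipped_csv(zip_path, 'example.csv')
print(data.shape)
```

Keeping the download and the read in separate functions means each one can be tested on its own, which is one reason to choose functions for these pieces.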


From nothing to something

Task: Compute the pairwise Pearson correlation between rows in a dataframe.

Let's say we have three molecules (A, B, C) with three measurements each (v1, v2, v3). So for each molecule we have a vector of measurements:

$$X=\begin{bmatrix} X_{v_{1}} \\ X_{v_{2}} \\ X_{v_{3}} \\ \end{bmatrix} $$

Where X is a molecule and the components are the values for each of the measurements. These make up the rows in our matrix.

Often, we want to compare molecules to determine how similar or different they are. One measure is the Pearson correlation.

Pearson correlation:
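For two molecules X and Y with n measurements each, the standard definition of the Pearson correlation coefficient, written in the notation used above, is:

$$r_{XY} = \frac{\sum_{i=1}^{n} (X_{v_{i}} - \bar{X})(Y_{v_{i}} - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_{v_{i}} - \bar{X})^{2}}\,\sqrt{\sum_{i=1}^{n} (Y_{v_{i}} - \bar{Y})^{2}}}$$

Here $\bar{X}$ and $\bar{Y}$ are the means of the measurements for each molecule. The value ranges from -1 (perfectly negatively correlated) to 1 (perfectly positively correlated).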

Expressed graphically, when you plot the paired measurements for two samples (in this case molecules) against each other, you can see whether they are positively correlated, uncorrelated, or negatively correlated.

Simple input dataframe (note: when you are writing code, it is always a good idea to have a simple test case whose output you can readily compute by hand or already know):

index  v1  v2  v3
A      -1   0   1
B       1   0  -1
C      .5   0  .5

  • If the above is a dataframe, what shape and size is the output?
  • What are some unique features of the output?

For our test case, what will the output be?

A B C
A 1 -1 0
B -1 1 0
C 0 0 1
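To see where these entries come from, the definition of the Pearson correlation can be applied by hand to two pairs of rows. This is a worked check using the row means $\bar{A} = \bar{B} = 0$ and $\bar{C} = 1/3$:

$$r_{AB} = \frac{(-1)(1) + (0)(0) + (1)(-1)}{\sqrt{(-1)^{2} + 0^{2} + 1^{2}}\,\sqrt{1^{2} + 0^{2} + (-1)^{2}}} = \frac{-2}{\sqrt{2}\,\sqrt{2}} = -1$$

For A and C, the deviations of C from its mean are $(1/6, -1/3, 1/6)$, so the numerator is:

$$(-1)\left(\tfrac{1}{6}\right) + (0)\left(-\tfrac{1}{3}\right) + (1)\left(\tfrac{1}{6}\right) = 0$$

A zero numerator gives $r_{AC} = 0$. Every row correlates perfectly with itself, which is why the diagonal is all ones.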

Let's sketch the idea...


In [ ]:


In [ ]:

In class exercise

20-30 minutes

Objectives:

  1. Write code using functions to compute the pairwise Pearson correlation between rows in a pandas dataframe. You will have to use for and possibly if.
  2. Use a cell to test each function with an input that yields an expected output. Think about the shape and values of the outputs.
  3. Put the code in a .py file in the directory with the Jupyter notebook, import and run!
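One way the objectives above could be approached is sketched here. It is not the only solution: the function names pearson and pairwise_row_correlation are assumptions, and rows with no variation (which would divide by zero) are not handled.

```python
import math

import pandas as pd

def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    numerator = 0.0
    sum_sq_x = 0.0
    sum_sq_y = 0.0
    for a, b in zip(x, y):
        numerator += (a - mean_x) * (b - mean_y)
        sum_sq_x += (a - mean_x) ** 2
        sum_sq_y += (b - mean_y) ** 2
    return numerator / (math.sqrt(sum_sq_x) * math.sqrt(sum_sq_y))

def pairwise_row_correlation(df):
    """Return a square dataframe of Pearson correlations between the rows of df."""
    result = pd.DataFrame(index=df.index, columns=df.index, dtype=float)
    for i in df.index:
        for j in df.index:
            result.loc[i, j] = pearson(list(df.loc[i]), list(df.loc[j]))
    return result

# The simple test case from above, where the expected output is known.
test_df = pd.DataFrame([[-1, 0, 1], [1, 0, -1], [.5, 0, .5]],
                       index=['A', 'B', 'C'], columns=['v1', 'v2', 'v3'])
print(pairwise_row_correlation(test_df))
```

Because the output is symmetric, an if inside the inner loop could skip pairs that were already computed. Floating-point rounding means values may differ from the hand-computed ones in the last decimal places.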

In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [1]:
import pandas as pd
import math

In [7]:
df = pd.read_csv('HCEPDB_moldata.csv')


/Users/Mandy/anaconda/lib/python3.6/site-packages/IPython/core/interactiveshell.py:2717: DtypeWarning: Columns (0,3,4,5,6,7,8,9) have mixed types. Specify dtype option on import or set low_memory=False.
  interactivity=interactivity, compiler=compiler, result=result)

In [8]:
df


Out[8]:
0 1 2 3 4 5 6 7 8 9 10
0 id SMILES_str stoich_str mass pce voc jsc e_homo_alpha e_gap_alpha e_lumo_alpha tmp_smiles_str
1 655365 C1C=CC=C1c1cc2[se]c3c4occc4c4nsnc4c3c2cn1 C18H9N3OSSe 394.3151 5.16195320211971 0.86760078740294 91.5675749599 -5.46760078740294 2.02294443593306 -3.44465635146988 C1=CC=C(C1)c1cc2[se]c3c4occc4c4nsnc4c3c2cn1
2 1245190 C1C=CC=C1c1cc2[se]c3c(ncc4ccccc34)c2c2=C[SiH2]... C22H15NSeSi 400.4135 5.2613977233692 0.50482419467609 160.40154923845 -5.10482419467609 1.63075003826037 -3.47407415641572 C1=CC=C(C1)c1cc2[se]c3c(ncc4ccccc34)c2c2=C[SiH...
3 21847 C1C=c2ccc3c4c[nH]cc4c4c5[SiH2]C(=Cc5oc4c3c2=C1... C24H17NOSi 363.4903 0 0 197.47477990435 -4.53952567287262 1.46215815756611 -3.07736751530651 C1=CC=C(C1)C1=Cc2oc3c(c2[SiH2]1)c1c[nH]cc1c1cc...
4 65553 [SiH2]1C=CC2=C1C=C([SiH2]2)C1=Cc2[se]ccc2[SiH2]1 C12H12SeSi3 319.4448 6.13829369542661 0.63027445338351 149.88754514825 -5.23027445338351 1.6822495770534 -3.54802487633011 C1=CC2=C([SiH2]1)C=C([SiH2]2)C1=Cc2[se]ccc2[Si...
5 720918 C1C=c2c3ccsc3c3[se]c4cc(oc4c3c2=C1)C1=CC=CC1 C20H12OSSe 379.3398 1.99136566470237 0.242119009470801 126.58134716045 -4.8421190094708 1.80943882203271 -3.03268018743809 C1=CC=C(C1)c1cc2[se]c3c4sccc4c4=CCC=c4c3c2o1
6 1310744 C1C=CC=C1c1cc2[se]c3c(c4nsnc4c4ccncc34)c2c2ccc... C24H13N3SSe 454.4137 5.60513478857347 0.95191087183926 90.62277586765 -5.55191087183926 2.02971670891245 -3.52219416292681 C1=CC=C(C1)c1cc2[se]c3c(c4nsnc4c4ccncc34)c2c2c...
7 196637 C1C=CC=C1c1cc2[se]c3cc4ccsc4cc3c2[se]1 C17H10SSe2 404.252 2.64443641930939 0.587932414406801 69.2234614721 -5.1879324144068 2.20110577558483 -2.98682663882197 C1=CC=C(C1)c1cc2[se]c3cc4ccsc4cc3c2[se]1
8 262174 C1C=CC=C1c1cc2[se]c3c4occc4c4cscc4c3c2[se]1 C19H10OSSe2 444.273 2.52305655873057 0.39767026257405 97.64532544975 -4.99767026257405 1.98212181531837 -3.01554844725568 C1=CC=C(C1)c1cc2[se]c3c4occc4c4cscc4c3c2[se]1
9 393249 C1C=CC=C1c1cc2[se]c3cc4cccnc4cc3c2c2ccccc12 C24H15NSe 396.3495 3.1158951050846 0.86913959183236 55.174814587685 -5.46913959183236 2.33181477476568 -3.13732481706668 C1=CC=C(C1)c1cc2[se]c3cc4cccnc4cc3c2c2ccccc12
10 35 C1C2=C([SiH2]C=C2)C=C1c1cc2occc2c2cscc12 C17H12OSSi 292.4328 2.74321377891055 0.38710624740493 109.06290475405 -4.98710624740493 1.90996574187542 -3.07714050552951 C1=CC2=C([SiH2]1)C=C(C2)c1cc2occc2c2cscc12
11 1048612 C1C=CC=C1C1=Cc2sc3cc4C=C[SiH2]c4cc3c2C1 C18H14SSi 290.4606 2.40841131373757 0.43131491941631 85.9377076701 -5.03131491941631 2.06584966433715 -2.96546525507916 C1=CC=C(C1)C1=Cc2sc3cc4C=C[SiH2]c4cc3c2C1
12 917542 C1C=c2ccc3[se]c4c5[se]c(cc5[se]c4c3c2=C1)C1=CC... C20H12Se3 489.1948 2.84327790532769 0.3025906196108 144.6143656087 -4.9025906196108 1.70819762918304 -3.19439299042776 C1=CC=C(C1)c1cc2[se]c3c([se]c4ccc5=CCC=c5c34)c...
13 1441831 C1C=CC=C1C1=Cc2ncc3c4[se]ccc4cnc3c2C1 C18H12N2Se 335.2668 2.68724019638341 0.67549682028117 61.225277938305 -5.27549682028117 2.27095328753055 -3.00454353275062 C1=CC=C(C1)C1=Cc2ncc3c4[se]ccc4cnc3c2C1
14 1376296 C1C=CC=C1C1=Cc2c(C1)c1[se]c3ccc4cscc4c3c1c1=C[... C24H16SSeSi 443.5024 2.8446368983132 0.18920592502927 231.38739350415 -4.78920592502927 1.31233370868624 -3.47687221634303 C1=CC=C(C1)C1=Cc2c(C1)c1[se]c3ccc4cscc4c3c1c1=...
15 1638442 C1C=c2ccc3cnc4c5[SiH2]C(=Cc5c5nsnc5c4c3c2=C1)C... C23H15N3SSi 393.5445 6.46251246238048 0.60240460581576 165.1051792767 -5.20240460581576 1.60316496595707 -3.59923963985869 C1=CC=C(C1)C1=Cc2c([SiH2]1)c1ncc3ccc4=CCC=c4c3...
16 98350 C1C=CC=C1C1=Cc2ccc3c4CC=Cc4c4cscc4c3c2[SiH2]1 C22H16SSi 340.5204 2.63146328874209 0.410851163619401 98.57354638625 -5.0108511636194 1.97570703051256 -3.03514413310684 C1=CC=C(C1)C1=Cc2ccc3c4CC=Cc4c4cscc4c3c2[SiH2]1
17 2162747 C1C=CC=C1C1=Cc2c([SiH2]1)c1c3c[nH]cc3c3ccc4=C[... C27H19NOSi2 429.6251 2.03915811352424 0.14074406290405 222.981280483 -4.74074406290405 1.36113723091331 -3.37960683199074 C1=CC=C(C1)C1=Cc2c([SiH2]1)c1c3c[nH]cc3c3ccc4=...
18 557119 C1C=c2c3C=C(Cc3c3occc3c2=C1)C1=CC=CC1 C19H14O 258.3186 0.237204563447386 0.0249623237532005 146.24654523115 -4.6249623237532 1.70041519990913 -2.92454712384407 C1=CC=C(C1)C1=Cc2c(C1)c1occc1c1=CCC=c21
19 753728 C1C=CC=C1C1=Cc2c([SiH2]1)c1cc3ncccc3cc1c1c[nH]... C22H16N2Si 336.4684 3.10383123118601 0.409504148061471 116.65070843205 -5.00950414806147 1.8634156621733 -3.14608848588817 C1=CC=C(C1)C1=Cc2c([SiH2]1)c1cc3ncccc3cc1c1c[n...
20 819265 C1C=CC=C1C1=Cc2c([SiH2]1)c1c(c3cscc23)c2[se]cc... C23H16SSeSi2 459.5774 5.38525291629117 0.368606419249421 224.8489157226 -4.96860641924942 1.35230882836041 -3.61629759088901 C1=CC=C(C1)C1=Cc2c([SiH2]1)c1c(c3cscc23)c2[se]...
21 1278019 C1C=CC=C1C1=Cc2c([SiH2]1)c1c(c3[SiH2]C=Cc3c3=C... C23H18OSi3 394.6522 5.48948942078778 0.30124157478828 280.45593203485 -4.90124157478828 1.13561905617574 -3.76562251861254 C1=CC=C(C1)C1=Cc2c([SiH2]1)c1c(c3[SiH2]C=Cc3c3...
22 2096063 C1C=CC=C1c1cc2[se]c3c(c2c2cscc12)c1ccccc1c1ccc... C27H14N2S2Se 509.5136 6.20409348575883 0.570054683857091 167.49791375135 -5.17005468385709 1.59307772190982 -3.57697696194727 C1=CC=C(C1)c1cc2[se]c3c(c2c2cscc12)c1ccccc1c1c...
23 2752585 C1C=CC=C1C1=Cc2c(C1)c1c(c3c[nH]cc23)c2c3c[nH]c... C28H20N2Si 412.566 0 0 198.7499142499 -4.49944701123171 1.45720787683099 -3.04223913440072 C1=CC=C(C1)C1=Cc2c(C1)c1c(c3c[nH]cc23)c2c3c[nH...
24 1572945 C1C=CC=C1C1=Cc2[se]c3c4sccc4c4ccccc4c3c2C1 C22H14SSe 389.3786 2.16725162664095 0.33062319718781 100.8843043156 -4.93062319718781 1.96125330337018 -2.96936989381763 C1=CC=C(C1)C1=Cc2[se]c3c4sccc4c4ccccc4c3c2C1
25 2359381 C1C=CC=C1C1=Cc2c(C1)c1c3cscc3c3ccc4nsnc4c3c1c1... C26H14N2OS2 434.5416 4.11298236915351 0.29954882002972 211.3181606601 -4.89954882002972 1.40922933873152 -3.4903194812982 C1=CC=C(C1)C1=Cc2c(C1)c1c3cscc3c3ccc4nsnc4c3c1...
26 1540183 C1C=CC=C1c1cc2[se]c3c([se]c4ccc5cscc5c34)c2cn1 C20H11NSSe2 455.2999 3.21256529851421 0.68356751158616 72.32994545335 -5.28356751158616 2.17471153849587 -3.10885597309029 C1=CC=C(C1)c1cc2[se]c3c([se]c4ccc5cscc5c34)c2cn1
27 1638500 C1C=CC=C1c1cc2[se]c3ccc4ccccc4c3c2c2cocc12 C23H14OSe 385.3226 3.08884387902497 0.48226213429058 98.57354638625 -5.08226213429058 1.97723538425119 -3.10502675003939 C1=CC=C(C1)c1cc2[se]c3ccc4ccccc4c3c2c2cocc12
28 2621542 C1C=c2c3ccccc3c3c4ccccc4c4C=C(Cc4c3c2=C1)C1=CC... C29H20 368.477 2.55288566054067 0.34111463011233 115.18040593475 -4.94111463011233 1.87275914683688 -3.06835548327545 C1=CC=C(C1)C1=Cc2c(C1)c1c(c3ccccc23)c2ccccc2c2...
29 98411 C1C=CC=C1c1cc2[se]c3cc4cccnc4cc3c2c2cscc12 C22H13NSSe 402.3777 4.24735634211112 0.65395973081254 99.9574755777 -5.25395973081254 1.96724506127822 -3.28671466953432 C1=CC=C(C1)c1cc2[se]c3cc4cccnc4cc3c2c2cscc12
... ... ... ... ... ... ... ... ... ... ... ...
2322820 2705444 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1ccc(... C25H17NS3Si2 483.786 2.97681 0.892533 51.3304 -5.49253 2.37349 -3.11904 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322821 2925216 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(-c3scc4occc34)c... C24H12O2S5Si 520.773 3.68731 0.323482 175.432 -4.92348 1.55837 -3.36511 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(-c3scc4occc34)c...
2322822 2742210 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(-c3scc4ccoc34)c... C24H12O2S5Si 520.773 3.03641 0.280599 166.541 -4.8806 1.59642 -3.28418 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(-c3scc4ccoc34)c...
2322823 3092419 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1ccc(... C23H15N3S3Si2 485.762 5.76643 1.00011 88.7372 -5.60011 2.04536 -3.55475 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322824 1253317 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1ccc(... C23H17NS2Si2 427.698 2.56918 1.02184 38.6953 -5.62184 2.52339 -3.09845 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322825 1841096 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1ccc(... C25H17NOS2Si2 467.719 3.65147 0.838712 67.0043 -5.43871 2.22052 -3.21819 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322826 2770889 C1ccc2c1c(sc2-c1scc2cc[SiH2]c12)-c1ccc(-c2cccc... C26H17NS3Si 467.711 3.2944 0.667854 75.9176 -5.26785 2.14341 -3.12444 c1sc(c2[SiH2]ccc12)-c1sc(c2Cccc12)-c1ccc(-c2cc...
2322827 1816522 C1ccc2c1c(sc2-c1scc2cc[SiH2]c12)-c1sc(-c2ccccc... C25H16S4Si 472.751 3.29743 0.473489 107.18 -5.07349 1.92114 -3.15235 c1sc(c2[SiH2]ccc12)-c1sc(c2Cccc12)-c1sc(-c2ccc...
2322828 1810382 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1ccc(... C25H17NOS2Si2 467.719 3.58162 0.762095 72.3299 -5.3621 2.17184 -3.19025 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322829 1648591 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(-c3scc4ccoc34)c... C24H12O3S4Si 504.706 2.78056 0.264955 161.513 -4.86495 1.61888 -3.24608 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(-c3scc4ccoc34)c...
2322830 2705360 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(-c3scc4ccoc34)c... C24H13NO2S4Si 503.722 1.0633 0.0871941 187.68 -4.68719 1.50298 -3.18421 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(-c3scc4ccoc34)c...
2322831 2349009 C1ccc2csc(c12)-c1ccc(cn1)-c1sc(-c2scc3cc[SiH2]... C24H17NS3Si2 471.775 2.8029 0.911719 47.3144 -5.51172 2.42118 -3.09054 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322832 3091107 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(-c3scc4ccsc34)c... C24H14OS5Si2 534.876 3.77035 0.412894 140.537 -5.01289 1.73206 -3.28083 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(-c3scc4ccsc34)c...
2322833 8152 [SiH2]1ccc2csc(c12)-c1sc(-c2scc3cc[se]c23)c2[s... C18H10S3Se2Si 508.481 2.88742 0.549016 80.9417 -5.14902 2.10191 -3.0471 c1sc(c2[SiH2]ccc12)-c1sc(-c2scc3cc[se]c23)c2[s...
2322834 1781722 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1ccc(... C23H16N2S3Si2 472.763 2.81402 0.556938 77.7621 -5.15694 2.1271 -3.02984 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322835 2470223 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(-c3scc4sccc34)c... C24H13NS6Si 535.856 2.44574 0.20756 181.349 -4.80756 1.5331 -3.27446 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(-c3scc4sccc34)c...
2322836 2469856 C1ccc2c1c(sc2-c1sc(-c2scc3cc[SiH2]c23)c2ccoc12... C25H15NOS4Si 501.75 2.14342 0.22746 145.027 -4.82746 1.70726 -3.1202 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(c3Cccc23)-c2scc...
2322837 1912803 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(-c3scc4ccoc34)c... C24H12O3S4Si 504.706 2.6569 0.274521 148.952 -4.87452 1.68676 -3.18776 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(-c3scc4ccoc34)c...
2322838 1216485 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1cccc... C18H12N2S3Si2 408.677 7.59421 0.993521 117.64 -5.59352 1.85748 -3.73604 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1cccc...
2322839 2619366 C1cc2c(ccc(-c3ccccc3)c2c1)-c1sc(-c2scc3cc[SiH2... C28H20S2Si 448.684 3.74322 0.466049 123.612 -5.06605 1.824 -3.24204 c1sc(c2[SiH2]ccc12)-c1sc(c2Cccc12)-c1ccc(-c2cc...
2322840 1703911 C1cc2c(ccc(-c3cccnc3)c2c1)-c1sc(-c2scc3cc[SiH2... C26H19NS2Si2 465.747 4.88105 0.657693 114.219 -5.25769 1.87628 -3.38141 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322841 1814506 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(c3[SiH2]ccc23)-... C23H16N2S3Si2 472.763 3.35318 0.461167 111.904 -5.06117 1.892 -3.16917 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(c3[SiH2]ccc23)-...
2322842 2559314 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(c3[SiH2]ccc23)-... C23H15NOS3Si2 473.748 4.26338 0.688326 95.3251 -5.28833 1.99871 -3.28961 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(c3[SiH2]ccc23)-...
2322843 2351086 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1ccc(... C24H16N2S3Si2 484.774 6.66266 0.85006 120.627 -5.45006 1.83969 -3.61037 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322844 1712111 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(-c3scc4ccsc34)c... C24H12OS6Si 536.84 2.95171 0.279912 162.293 -4.87991 1.61514 -3.26477 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(-c3scc4ccsc34)c...
2322845 2543603 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1cnc(... C22H14N4S3Si2 486.751 0 0 0 -5.63251 1.45408 -4.17843 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1cnc(...
2322846 2304057 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1ccc(... C22H14N4S3Si2 486.751 9.33549 1.12074 128.197 -5.72074 1.7986 -3.92214 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322847 2007035 [SiH2]1ccc2csc(c12)-c1sc(c2[SiH2]ccc12)-c1ccc(... C26H18S3Si2 482.798 2.49821 0.834995 46.0461 -5.435 2.43316 -3.00184 c1sc(c2[SiH2]ccc12)-c1sc(c2[SiH2]ccc12)-c1ccc(...
2322848 1961981 C1ccc2c1c(sc2-c1scc2cc[SiH2]c12)-c1ccc(cc1)-c1... C25H16S3SeSi 519.645 2.67907 0.659243 62.544 -5.25924 2.25847 -3.00077 c1sc(c2[SiH2]ccc12)-c1sc(c2Cccc12)-c1ccc(cc1)-...
2322849 2754558 [SiH2]1ccc2csc(c12)-c1sc(-c2sc(-c3scc4ccsc34)c... C24H13NOS5Si 519.789 1.2724 0.102802 190.49 -4.7028 1.49095 -3.21185 c1sc(c2[SiH2]ccc12)-c1sc(-c2sc(-c3scc4ccsc34)c...

2322850 rows × 11 columns


In [ ]:


In [4]:
def typ(x,y):
    sol = x.mean() + y.mean()
    return sol

In [7]:
typ(df['mass'],df['pce'])


Out[7]:
419.48866167769404

In [ ]: