The print function prints output to the command-line interface (Terminal in macOS/Linux, PowerShell in Windows). In the following example, we print the text string "Hello, world!".
In [9]:
print("Hello, world!")
Take the following statement: x = 33. In this statement, x is a variable with a value of 33 (an integer literal). As the name suggests, a variable can be reassigned while a program is running.
Simple data types in Python include integers, floating point numbers, strings and Boolean (True/False) values. In the following examples, the variables Ag and Au are assigned integer values. The variables are then reassigned to floating point numbers and some simple math operations are demonstrated.
Integers are whole numbers, e.g., 1, 0, -20.
In [10]:
Ag = 107
In [11]:
Au = 197
In [12]:
type(Ag)
Out[12]:
In [13]:
type(Au)
Out[13]:
Floats use a decimal point or exponential notation. Note that values such as 20.0, 1.0 and 0.0 are floats, not integers. Also, a floating point value may not be an exact representation of the number, since computers use a binary (base-2) number system. This leads to small rounding errors when working with floating point values.
In [14]:
Ag = 106.9
In [15]:
Au = 197.0
In [16]:
type(Ag)
Out[16]:
In [17]:
type(Au)
Out[17]:
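The floating point error mentioned above is easy to reproduce. The following is a minimal sketch (the variable names are ours) showing why floats are best compared with a tolerance, here using math.isclose from the standard library, rather than with ==.

```python
import math

# 0.1 and 0.2 have no exact binary representation, so their sum is slightly off.
total = 0.1 + 0.2

exactly_equal = (total == 0.3)            # exact comparison fails
close_enough = math.isclose(total, 0.3)   # comparison within a relative tolerance

print(total)
print(exactly_equal, close_enough)
```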
In Python 3, math operations behave as you would expect: the / operator always performs true division and returns a float, even for two integers. In Python 2, / applied to two integers is the same as floor division!
In [64]:
Au + Ag # addition (Note: You can add comments to your code by using the # symbol)
Out[64]:
In [19]:
Au - Ag # subtraction
Out[19]:
In [20]:
Au * 5 # multiplication
Out[20]:
In [21]:
Ag ** 2 # exponentiation - mass of silver squared in this case
Out[21]:
In [22]:
Au / Ag # division
Out[22]:
In [23]:
Au // Ag # floor division
Out[23]:
We can convert a float to an integer or an integer to a float. Note that int() discards the fractional part rather than rounding. Converting string representations of numbers to actual numbers (integers and floats) is also common in Python programming.
In [24]:
integer_Ag = int(Ag)
In [25]:
integer_Ag
Out[25]:
In [26]:
type(integer_Ag)
Out[26]:
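Conversions from strings work the same way. In this short sketch the strings and variable names are hypothetical stand-ins for values read from a text file. It also contrasts int(), which truncates, with round().

```python
mass_text = "106.9"    # hypothetical value read from a text file
count_text = "107"

mass_value = float(mass_text)    # string -> float
count_value = int(count_text)    # string -> int
# Note: int("106.9") raises a ValueError; convert to float first.

truncated = int(mass_value)      # float -> int drops the fractional part
rounded = round(mass_value)      # round() goes to the nearest integer instead

print(mass_value, count_value, truncated, rounded)
```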
The modulo (%) operator gives the remainder from division. This is commonly used to check whether a number is odd or even. In mass spectrometry, we could use this to test whether an ion has an odd number of nitrogen atoms. For example, for an even-electron ion such as [M+H]+, an even nominal m/z indicates an odd number of nitrogens (the nitrogen rule): mz = 114, followed by if mz % 2 == 0: print("Ion has an odd number of nitrogens").
In [27]:
modulo = integer_Ag % 2
In [28]:
if modulo == 0:
    print("Ion is even")
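The nitrogen test described above can be sketched as a small function. The function name is hypothetical, and the sketch assumes a singly charged, even-electron ion (e.g., a protonated molecule) of a typical organic compound; for odd-electron molecular ions the parity is reversed.

```python
def has_odd_nitrogen_count(nominal_mz):
    """Nitrogen rule, assuming a singly charged even-electron ion such as [M+H]+."""
    return nominal_mz % 2 == 0

for mz in (114, 115):
    if has_odd_nitrogen_count(mz):
        print(mz, "-> odd number of nitrogens")
    else:
        print(mz, "-> zero or an even number of nitrogens")
```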
Strings are simply sequences of characters (letters, numbers, symbols etc.). Strings are indicated using either single quotes ('This is a string') or double quotes ("This is another string"). Multiline strings can be made using triple quotes (e.g., """Multiline string""").
In the following example, we have a file name represented as a string. From this string, we can use indexing to select certain characters or substrings.
In [29]:
MS2_spectrum = "Liver_MS2_406.raw"
In [30]:
MS2_spectrum
Out[30]:
Single characters are indexed using square brackets after the variable name. In Python, the first character has index 0. Characters may also be indexed from the end of the string (starting at -1).
In [31]:
MS2_spectrum[0]
Out[31]:
In [32]:
MS2_spectrum[-1]
Out[32]:
In [33]:
sample = MS2_spectrum[0:5]
In [34]:
sample # Note: The character at position 5 is not included in the substring.
Out[34]:
In [35]:
ion = MS2_spectrum[10:13]
In [36]:
ion
Out[36]:
In [37]:
file_format = MS2_spectrum[-3:]
In [38]:
file_format
Out[38]:
In [39]:
type(ion)
Out[39]:
In [40]:
float(ion)
Out[40]:
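Fixed index positions only work while every file name has the same length. As an alternative sketch, assuming file names always follow a sample_level_precursor.extension pattern (our assumption, based on the single example above), the string methods rsplit and split extract the same pieces.

```python
MS2_spectrum = "Liver_MS2_406.raw"

# Split off the extension at the last dot, then split the rest on underscores.
stem, file_format = MS2_spectrum.rsplit(".", 1)
sample, ms_level, precursor = stem.split("_")

precursor_mz = float(precursor)   # string -> float
print(sample, ms_level, precursor_mz, file_format)
```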
Lists are mutable (i.e., modifiable), ordered collections of items. Lists are created by enclosing a collection of items in square brackets. An empty list may also be created simply by assigning [] to a variable, e.g., empty_list = [].
In [41]:
MS_files = ["MS_spectrum", "MS2_405", "MS2_471", "MS2_495"]
In [42]:
MS_files
Out[42]:
In [43]:
MS_files[2]
Out[43]:
In [44]:
MS_files.remove("MS2_405")
In [45]:
MS_files
Out[45]:
In [46]:
MS_files.append("MS3_225")
In [47]:
MS_files
Out[47]:
Tuples are immutable (i.e., they can't be modified after their creation), ordered collections of items and are the simplest collection data type. Tuples are created by enclosing a collection of items in parentheses.
In [48]:
Fe_isotopes = (53.9, 55.9, 56.9, 57.9)
In [49]:
Fe_isotopes
Out[49]:
In [50]:
Fe_isotopes[0]
Out[50]:
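To see what immutability means in practice, the following sketch tries to change an item of the tuple. The replacement values are made up purely for illustration.

```python
Fe_isotopes = (53.9, 55.9, 56.9, 57.9)

try:
    Fe_isotopes[0] = 54.0            # tuples do not support item assignment
except TypeError as error:
    assignment_failed = True
    print("Cannot modify a tuple:", error)
else:
    assignment_failed = False

# Building a new tuple is fine; the original is left untouched.
extended = Fe_isotopes + (58.9,)     # 58.9 is a made-up value for illustration
print(len(Fe_isotopes), len(extended))
```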
Dictionaries are mutable collections of key: value pairs (since Python 3.7 they preserve insertion order, but items are looked up by key rather than by position). Dictionaries are created by enclosing key: value pairs in curly brackets. Importantly, keys must be hashable. This means, for example, that lists can't be used as keys since the items inside a list may be modified.
In [51]:
carbon_isotopes = {"12": 0.9893, "13": 0.0107}
In [52]:
carbon_isotopes["12"]
Out[52]:
In [53]:
carbon_isotopes.keys()
Out[53]:
In [54]:
carbon_isotopes.values()
Out[54]:
In [55]:
carbon_isotopes.items()
Out[55]:
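The sketch below shows two common dictionary patterns, iterating over items() and looking up a key with a default via get(), and then illustrates the hashability rule. The isotope_abundance dictionary with tuple keys is a hypothetical variation on the example above.

```python
carbon_isotopes = {"12": 0.9893, "13": 0.0107}

# Iterate over key: value pairs.
for isotope, abundance in carbon_isotopes.items():
    print("Carbon-" + isotope, "abundance:", abundance)

# get() returns a default instead of raising KeyError for a missing key.
missing = carbon_isotopes.get("14", 0.0)

# A tuple is hashable and can be a key; a list is not.
isotope_abundance = {("C", "12"): 0.9893}
list_key_rejected = False
try:
    isotope_abundance[["C", "13"]] = 0.0107
except TypeError:
    list_key_rejected = True

print(missing, list_key_rejected)
```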
Sets are another data type, similar to an unordered list with no duplicates. They are especially useful for finding all the unique items in a list, as shown below.
In [56]:
phospholipids = ["PA(16:0/18:1)", "PA(16:0/18:2)", "PC(14:0/16:0)", "PC(16:0/16:1)", "PC(16:1/16:2)"]
# Let's assume we apply a function that extracts the fatty acids from each phospholipid name to give:
phospholipid_fatty_acids = ["16:0", "18:1", "16:0", "18:2", "14:0", "16:0", "16:0", "16:1", "16:1", "16:2"]
In [57]:
unique_fatty_acids = set(phospholipid_fatty_acids)
In [58]:
unique_fatty_acids
Out[58]:
In [59]:
num_unique_fa = len(unique_fatty_acids)
In [60]:
num_unique_fa
Out[60]:
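Sets also support the usual set operations. As a sketch, the two sets below are written out by hand from the PA and PC species listed above; the variable names are ours.

```python
pa_fatty_acids = {"16:0", "18:1", "18:2"}            # from the PA species
pc_fatty_acids = {"14:0", "16:0", "16:1", "16:2"}    # from the PC species

shared = pa_fatty_acids & pc_fatty_acids      # intersection: in both sets
combined = pa_fatty_acids | pc_fatty_acids    # union: in either set
pa_only = pa_fatty_acids - pc_fatty_acids     # difference: in PA but not PC

print(sorted(shared), len(combined), sorted(pa_only))
```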
Comparison and Boolean operators assess the truth or falsity of a statement.
In [61]:
Ag > Au
Out[61]:
In [62]:
Ag < Au
Out[62]:
In [63]:
Ag == 106.9
Out[63]:
In [48]:
Au >= 100
Out[48]:
In [49]:
Ag <= Au and Ag > 200
Out[49]:
In [50]:
Ag <= Au or Ag > 200
Out[50]:
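Two further forms are worth knowing: comparisons can be chained, and not inverts a Boolean value. A minimal sketch using the same masses (the names in_range and not_heavier are ours):

```python
Ag = 106.9
Au = 197.0

in_range = 100 <= Ag <= 200     # same as (100 <= Ag) and (Ag <= 200)
not_heavier = not (Ag > Au)     # 'not' inverts a Boolean value

print(in_range, not_heavier)
```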
A code block is only executed if its conditional statement evaluates as True. In the following example, Ag has a value greater than 100 and therefore only the "Ag is greater than 100 Da." string is printed. A colon follows the conditional statement and the following code block is indented by 4 spaces (always use 4 spaces rather than tabs - errors will result when mixing tabs with spaces!). Note, the elif and else statements are optional.
In [51]:
if Ag < 100:
    print("Ag is less than 100 Da")
elif Ag > 100:
    print("Ag is greater than 100 Da.")
else:
    print("Ag is equal to 100 Da.")
While loops repeat the execution of a code block while a condition evaluates as True. When using while loops, be careful not to make an infinite loop where the conditional statement never evaluates as False. (Note: You could, however, use 'break' to break from an infinite loop.)
In [52]:
mass_spectrometers = 0
while mass_spectrometers < 5:
    print("Ask for money")
    mass_spectrometers = mass_spectrometers + 1
    # Comment: This can be written as mass_spectrometers += 1
    print("Number of mass spectrometers equals", mass_spectrometers)
print("\nNow we need more lab space")
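The break statement mentioned above can be sketched as follows. The budget variable is a hypothetical limit introduced only for this illustration.

```python
mass_spectrometers = 0
budget = 3    # hypothetical limit, for illustration only

while True:                           # would loop forever without the break below
    if mass_spectrometers >= budget:
        break                         # leave the loop immediately
    mass_spectrometers += 1

print("Number of mass spectrometers equals", mass_spectrometers)
```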
For loops iterate over each item of a collection data type (lists, tuples, dictionaries and sets). For loops can also be used to loop over the characters of a string. This will be used later to evaluate each amino acid residue of a peptide string.
In [36]:
lipid_masses = [674.5, 688.6, 690.6, 745.7]
In [37]:
Na = 23.0
lipid_Na_adducts = []
for mass in lipid_masses:
    lipid_Na_adducts.append(mass + Na)
In [38]:
lipid_Na_adducts
Out[38]:
The following is a list comprehension, which performs the same operation as the for loop above but in fewer lines of code.
In [39]:
adducts_comp = [mass + Na for mass in lipid_masses]
In [40]:
adducts_comp
Out[40]:
We could also add a predicate to a list comprehension. Here, we calculate the sodium adduct mass only for lipids with a mass of less than 700 Da.
In [43]:
adducts_comp = [mass + Na for mass in lipid_masses if mass < 700]
In [44]:
adducts_comp
Out[44]:
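The same comprehension syntax extends to dictionaries. As a minimal sketch, the following keeps each lipid mass paired with its adduct mass; the name adduct_lookup and the rounding to one decimal place are our assumptions.

```python
lipid_masses = [674.5, 688.6, 690.6, 745.7]
Na = 23.0

# Dictionary comprehension: key is the lipid mass, value is the adduct mass.
adduct_lookup = {mass: round(mass + Na, 1) for mass in lipid_masses if mass < 700}

for mass, adduct in adduct_lookup.items():
    print(mass, "->", adduct)
```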
Both while and for loops can be combined with conditional statements for greater control of flow within a program.
In [11]:
mass_spectrometers = 0
while mass_spectrometers < 5:
    mass_spectrometers += 1
    print("Number of mass spectrometers equals", mass_spectrometers)
    if mass_spectrometers == 1:
        print("Woohoo, the first of many!")
    elif mass_spectrometers == 5:
        print("That'll do for now.")
    else:
        print("More!!")
In [58]:
for MS_file in MS_files:
    if "spectrum" in MS_file:
        print("MS file:", MS_file)
    elif "MS2" in MS_file:
        print("MS2 file:", MS_file)
    else:
        print("MS3 file:", MS_file)
In the following example, we will calculate the mass of a peptide from a string containing one-letter amino acid residue codes. For example, peptide = "GASPV". To do this, we will first need a dictionary containing the one-letter codes as keys and the masses of the amino acid residues as values. We will then need to create a variable to store the mass of the peptide, starting from the mass of water (18.010565 Da) to account for the terminal H and OH, and use a for loop to iterate over each amino acid residue in the peptide.
In [6]:
amino_dict = {
'G': 57.02147,
'A': 71.03712,
'S': 87.03203,
'P': 97.05277,
'V': 99.06842,
'T': 101.04768,
'C': 103.00919,
'I': 113.08407,
'L': 113.08407,
'N': 114.04293,
'D': 115.02695,
'Q': 128.05858,
'K': 128.09497,
'E': 129.0426,
'M': 131.04049,
'H': 137.05891,
'F': 147.06842,
'R': 156.10112,
'Y': 163.06333,
'W': 186.07932,
}
# Data modified from http://www.its.caltech.edu/~ppmal/sample_prep/work3.html
In [7]:
peptide_name = "SCIENCE"
In [8]:
mass = 18.010565
for amino_acid in peptide_name:
    mass += amino_dict[amino_acid]
In [9]:
mass
Out[9]:
Functions perform a specified task when called during the execution of a program. Functions reduce the amount of code that needs to be written and greatly improve code readability. (Note: readability matters!) The for loop created above is better placed in a function so that it doesn't need to be rewritten every time we wish to calculate the mass of a peptide. Pay careful attention to the syntax below.
In [10]:
def peptide_mass(peptide):
    mass = 18.010565
    for amino_acid in peptide:
        mass += amino_dict[amino_acid]
    return mass
In [11]:
peptide_mass(peptide_name)
Out[11]:
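The function above raises a KeyError if the peptide contains a letter that is not in the dictionary. One possible variation, sketched here with a hypothetical name (peptide_mass_checked) and only a three-residue subset of the mass table so that the example is self-contained, reports all unknown codes up front.

```python
# Subset of the residue masses from amino_dict, repeated to keep the sketch self-contained.
residue_masses = {"G": 57.02147, "A": 71.03712, "S": 87.03203}
WATER = 18.010565

def peptide_mass_checked(peptide, masses=residue_masses):
    """Return the peptide mass, raising ValueError for unknown residue codes."""
    unknown = sorted(set(peptide) - set(masses))
    if unknown:
        raise ValueError("Unknown residue code(s): " + ", ".join(unknown))
    return WATER + sum(masses[residue] for residue in peptide)

print(peptide_mass_checked("GAS"))
```

Calling peptide_mass_checked("GXS") with this subset raises a ValueError naming X, rather than failing partway through the loop.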
A simple way to gather user input is to use the input function. This prompts the user to enter data, which is returned as a string and may be used within the program. In the example below, we prompt the user to enter a peptide name. The peptide name is then used in the function call to calculate the peptide's mass.
In [14]:
user_peptide = input("Enter peptide name: ")
In [15]:
peptide_mass(user_peptide)
Out[15]:
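Since input always returns a string exactly as typed, it can help to tidy it before use. The helper below is a hypothetical sketch that strips surrounding whitespace and converts the text to upper case so that it matches the dictionary keys.

```python
def clean_peptide(raw_text):
    """Strip surrounding whitespace and convert to upper case (hypothetical helper)."""
    return raw_text.strip().upper()

# With input(), this could be used as:
#     user_peptide = clean_peptide(input("Enter peptide name: "))
print(clean_peptide("  science \n"))
```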