Molecular weight

We want to calculate the molecular weight of a chemical formula. So given benzene, $\mathrm{C_6H_6}$, we want the result to be $12\times 6+1\times 6 = 78$. We can use integers as an approximation to the atomic weights to ease our mental calculations. Replacing them with more precise values at the end is trivial.

The most straightforward way to keep the atomic weights of elements is in a dictionary:


In [2]:
weightDict = {
    'C': 12,
    'H': 1,
    'O': 16,
    'Cl': 35,
    # add more if needed.
}

Parsing the molecular formula is not trivial, so we will do it later. For now, we assume that the formula has already been parsed and is stored as a dictionary. In a condensed formula, each element appears only once, and its subscript indicates the total number of atoms of that element in the molecule. Ethanol, for example, is $\mathrm{C_2H_6O}$.


In [12]:
ethanol = {'C':2, 'H':6, 'O':1}
water = {'H':2, 'O':1}
HCl = {'H':1, 'Cl':1}

From that, calculate the total weight:


In [1]:
#Finish...
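
One possible way to finish this cell is sketched below. It repeats `weightDict` and the example molecules so that it runs on its own. The function name `molecular_weight` is an illustrative choice, not something fixed by the exercise.

```python
# Integer atomic weights, repeated here so the sketch is self-contained.
weightDict = {'C': 12, 'H': 1, 'O': 16, 'Cl': 35}

def molecular_weight(formula, weights=weightDict):
    """Add up (atomic weight * number of atoms) for every element.

    `formula` is a dictionary such as {'C': 2, 'H': 6, 'O': 1}.
    An element missing from `weights` raises a KeyError.
    """
    return sum(weights[element] * count for element, count in formula.items())

ethanol = {'C': 2, 'H': 6, 'O': 1}
water = {'H': 2, 'O': 1}
HCl = {'H': 1, 'Cl': 1}

print(molecular_weight(ethanol))  # 12*2 + 1*6 + 16*1 = 46
print(molecular_weight(water))    # 1*2 + 16*1 = 18
print(molecular_weight(HCl))      # 1*1 + 35*1 = 36
```

An explicit `for` loop that accumulates a running total works just as well and may be easier to read at first.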

Now imagine we also accept formulas written in an extended form, for example ethanol as $\mathrm{CH_3CH_2OH}$. In that case the same element can appear several times, so it makes sense for our parsing procedure to return a list of tuples such as:


In [21]:
ethanol2 = [('C',1), ('H',3), ('C',1), ('H',2), ('O',1), ('H',1)]
acetic2 = [('C',1), ('H',3), ('C',1), ('O',1), ('O',1), ('H',1)]

From that, we could also create a dictionary such as the previous one, but we can also calculate the weight directly:


In [2]:
#Finish
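
A minimal sketch of both options follows: summing directly over the list of tuples, and collapsing the list into a dictionary like the earlier ones. The helper names `weight_from_pairs` and `pairs_to_dict` are hypothetical, and the integer weights are repeated so the block runs on its own.

```python
# Integer atomic weights, repeated here so the sketch is self-contained.
weightDict = {'C': 12, 'H': 1, 'O': 16, 'Cl': 35}

def weight_from_pairs(pairs, weights=weightDict):
    """Compute the weight directly from a list of (element, count) tuples."""
    return sum(weights[element] * count for element, count in pairs)

def pairs_to_dict(pairs):
    """Merge repeated elements into a single condensed-formula dictionary."""
    totals = {}
    for element, count in pairs:
        totals[element] = totals.get(element, 0) + count
    return totals

ethanol2 = [('C', 1), ('H', 3), ('C', 1), ('H', 2), ('O', 1), ('H', 1)]
acetic2 = [('C', 1), ('H', 3), ('C', 1), ('O', 1), ('O', 1), ('H', 1)]

print(weight_from_pairs(ethanol2))  # 46
print(weight_from_pairs(acetic2))   # 60
print(pairs_to_dict(ethanol2))      # {'C': 2, 'H': 6, 'O': 1}
```

Because addition does not care about order or grouping, the direct sum and the sum over the merged dictionary always give the same weight.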

Parsing

Parsing the formula is not a trivial task. You have to keep the following in mind:

  • Some elements have one-letter symbols, others have two. In the latter case the second letter is always lower-case.
  • Some numbers can be higher than 9, i.e. use two or more digits.
  • When the number is 1, it is usually not written.

When coding a complex situation such as this one, it makes sense to plan a strategy that covers all these scenarios, but to start by coding and testing the simple cases (one letter per element, etc.).

Python has a built-in module to work with regular expressions that could ease the parsing. But in this case the problem is simple enough that you can solve it without the use of regular expressions.
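
For comparison, here is a short sketch of what the regular-expression route could look like, using the standard `re` module. The function name `parse_with_regex` is an illustrative choice. The pattern assumes the three rules listed above: an upper-case letter, an optional lower-case letter, and an optional run of digits.

```python
import re

def parse_with_regex(formula):
    """Return a list of (element, count) tuples from a formula string.

    Pattern: one upper-case letter, an optional lower-case letter,
    then zero or more digits. A missing number counts as 1.
    Note: characters that do not fit the pattern are silently skipped.
    """
    pairs = re.findall(r'([A-Z][a-z]?)(\d*)', formula)
    return [(element, int(count) if count else 1) for element, count in pairs]

print(parse_with_regex('CH3CH2OH'))
print(parse_with_regex('C12H22O11'))
print(parse_with_regex('HCl'))
```

This version does no validation, so a typo such as a stray lower-case letter goes unnoticed. That is one reason writing the parser by hand is still a worthwhile exercise.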

Here is a possible solution, but different (and possibly better!) approaches are surely possible:


In [3]:
#Try to do it
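
The sketch below is one way to do it without regular expressions, reading the string from left to right with an index. It is not necessarily the solution the notebook had in mind, and the function name `parse_formula` is hypothetical. It handles two-letter symbols, multi-digit numbers, and omitted 1s, but not parentheses or charges.

```python
def parse_formula(formula):
    """Parse a formula string into a list of (element, count) tuples."""
    pairs = []
    i = 0
    while i < len(formula):
        # Every element symbol must start with an upper-case letter.
        if not formula[i].isupper():
            raise ValueError(f"Unexpected character {formula[i]!r} at position {i}")
        element = formula[i]
        i += 1
        # Optional second letter, always lower-case.
        if i < len(formula) and formula[i].islower():
            element += formula[i]
            i += 1
        # Collect all following digits; the number may exceed 9.
        digits = ''
        while i < len(formula) and formula[i].isdigit():
            digits += formula[i]
            i += 1
        # No digits written means a single atom.
        pairs.append((element, int(digits) if digits else 1))
    return pairs

print(parse_formula('C2H3'))       # [('C', 2), ('H', 3)]
print(parse_formula('CH3CH2OH'))
print(parse_formula('C12H22O11'))
```

The resulting list has the same shape as `ethanol2` above, so it can be fed to whatever weight calculation was written for lists of tuples.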

In [1]:
molecule = input("Write a molecule: ")


Write a molecule: C2H3

In [ ]: