"During the past century, science has developed a limited capability to design materials, but we are still too dependent on serendipity" - Eberhart and Clougherty, Looking for design in materials design (2004)
This practical explores how materials design can be approached by using the simplest of rules in order to narrow down the combinations to those that might be considered legitimate. It will demonstrate the scale of the problem, even after some chemical rules are applied.
In [*]
it is still running!(From the blog of Anubhav Jain: www.hackingmaterials.com)
The answer is about $10^{108}$, over a googol of compounds!
TASK: Use the cell below to arrive at the conclusion above. Hints for the formula required are below the cell.
In [ ]:
from math import factorial as factorial
grid_points = 1000.0
atoms = 30.0
elements = 50.0
##########
# A. Show that assigning each of the 30 atoms as one of 50 elements is ~ 9e50 (permutations)
element_assignment = 0
print(f'Number of possible element assignments is: {element_assignment}')
# B. Show that the number of possible arrangements of 30 atoms on a grid of 10x10x10 is ~2e57 (combinations)
atom_arrangements = 0
print(f'Number of atom arrangements is: {atom_arrangements}')
# C. Finally, show that the total number of potential "materials" is ~ 2e108
total_materials = 0
print(f'Total number of "materials" is: {total_materials}')
We will use well-known elemental properties along with the criterion that compounds must not have an overall charge in order to sequentially apply different levels of screening and count the possible combinations:
i. Setting up the search space - Defining which elements we want to include
ii. Element combination counting - Considering combinations of elements and ignore oxidation states
iii. Ion combination counting - Considering combinations of elements in their allowed oxidation states
iv. Charge neutrality - Discarding any combinations that would not make a charge neutral compound
v. Electronegativity - Discarding any combinations which exhibit a cation which is more electronegative than an anion
The code below imports the element data that we need in order to do our counting. The main variable in the cell below for this practical is the max_atomic_number
which dictates how many elements to consider.
For example, when max_atomic_number = 10
the elements from H to Ne are considered in the search.
max_atomic_number
so that it includes elements from H to Ar
In [ ]:
# Imports the SMACT toolkit for later on #
import smact
# Gets element data from file and puts into a list #
with open('Counting/element_data.txt','r') as f:
data = f.readlines()
list_of_elements = []
# Specify the range of elements to include #
### EDIT BELOW ###
max_atomic_number = 10
##################
# Populates a list with the elements we are concerned with #
for line in data:
if not line.startswith('#'):
# Grab first three items from table row
symbol, name, Z = line.split()[:3]
if int(Z) > 0 and int(Z) < max_atomic_number + 1:
list_of_elements.append(symbol)
print(f'--- Considering the {len(list_of_elements)} elements '
f'from {list_of_elements[0]} to {list_of_elements[-1]} ---')
This first procedure simply counts how many binary combinations are possible for a given set of elements. This is a numerical (combinations) problem, as we are not considering element properties in any way for the time being.
for k, ele_c...
) to make the cell count up ternary combinations. It is advisable to change the number of elements to include back to 10 first! Hint: The next exercise is set up for ternary counting so you could come back and do this after looking at that.
In [ ]:
# Counts up possibilities and prints the output #
element_count = 0
for i, ele_a in enumerate(list_of_elements):
for j, ele_b in enumerate(list_of_elements[i+1:]):
element_count += 1
print(f'{ele_a} {ele_b}')
# Prints the total number of combinations found
print(f'Number of combinations = {element_count}')
We now consider each known oxidation state of an element (so strictly speaking we are not dealing with 'ions'). The procedure incorporates a library of known oxidation states for each element and is this time already set up to search for ternary combinations. The code prints out the combination of elements including their oxidation states. There is also a timer so that you can see how long it takes to run the program.
max_atomic_number
again in the cell above and see how this affects the number of combinations. Hint: It is advisable to increase the search-space gradually and see how long the calculation takes. Big numbers mean you could be waiting a while for the calculation to run....
In [ ]:
# Sets up the timer to see how long the program takes to run #
import time
start_time = time.time()
ion_count = 0
for i, ele_a in enumerate(list_of_elements):
for ox_a in smact.Element(ele_a).oxidation_states:
for j, ele_b in enumerate(list_of_elements[i+1:]):
for ox_b in smact.Element(ele_b).oxidation_states:
for k, ele_c in enumerate(list_of_elements[i+j+2:]):
for ox_c in smact.Element(ele_c).oxidation_states:
ion_count += 1
print(f'{ele_a} {ox_a} \t {ele_b} {ox_b} \t {ele_c} {ox_c}')
# Prints the total number of combinations found and the time taken to run.
print(f'Number of combinations = {ion_count}')
print(f'--- {time.time() - start_time} seconds to run ---')
All we seem to have done is make matters worse!
We are introducing many more species by further splitting each element in our search-space into separate ions, one for each allowed oxidation state. When we get to max_atomic_number > 20, we are including the transition metals and their many oxidation states.
The previous step is necessary to incorporate our filter that viable compounds must be charge neutral overall. Scrolling through the output from above, it is easy to see that the vast majority of the combinations are not charge neutral overall. We can discard these combinations to start narrowing our search down to more 'sensible' (or at least not totally unreasonable) ones. In this cell, we will use the neutral_ratios
function in smact to do this.
max_atomic_number
in the cell above) and compare the output of i. and ii. with that of the below cell
In [ ]:
import time
from smact import neutral_ratios
start_time = time.time()
charge_neutral_count = 0
for i, ele_a in enumerate(list_of_elements):
for ox_a in smact.Element(ele_a).oxidation_states:
for j, ele_b in enumerate(list_of_elements[i+1:]):
for ox_b in smact.Element(ele_b).oxidation_states:
for k, ele_c in enumerate(list_of_elements[i+j+2:]):
for ox_c in smact.Element(ele_c).oxidation_states:
# Checks if the combination is charge neutral before printing it out! #
cn_e, cn_r = neutral_ratios([ox_a, ox_b, ox_c], threshold=1)
if cn_e:
charge_neutral_count += 1
print(f'{ele_a} \t {ele_b} \t {ele_c}')
print(f'Number of combinations = {charge_neutral_count}')
print(f'--- {time.time() - start_time} seconds to run ---')
This drastically reduces the number of combinations we get out and we can even begin to see some compounds that we recognise and know exist.
The last step is to incorporate the key chemical property of electronegativity, i.e. the propensity of an element to attract electron density to itself in a bond. This is a logical step as inspection of the output from above reveals that some combinations feature a species in a higher (more positive) oxidation state which is more elecronegative than other species present.
With this in mind, we now incorporate another filter which checks that the species with higher oxidation states have lower electronegativities. The library of values used is of the widely accepted electronegativity scale as developed by Linus Pauling. The scale is based on the dissociation energies of heteronuclear diatomic molecules and their corresponding homonuclear diatomic molecules:
In [ ]:
import time
from smact.screening import pauling_test
start_time = time.time()
pauling_count = 0
for i, ele_a in enumerate(list_of_elements):
paul_a = smact.Element(ele_a).pauling_eneg
for ox_a in smact.Element(ele_a).oxidation_states:
for j, ele_b in enumerate(list_of_elements[i+1:]):
paul_b = smact.Element(ele_b).pauling_eneg
for ox_b in smact.Element(ele_b).oxidation_states:
for k, ele_c in enumerate(list_of_elements[i+j+2:]):
paul_c = smact.Element(ele_c).pauling_eneg
for ox_c in smact.Element(ele_c).oxidation_states:
# Puts elements, oxidation states and electronegativites into lists for convenience #
elements = [ele_a, ele_b, ele_c]
oxidation_states = [ox_a, ox_b, ox_c]
pauling_electro = [paul_a, paul_b, paul_c]
# Checks if the electronegativity makes sense and if the combination is charge neutral #
electroneg_makes_sense = pauling_test(oxidation_states, pauling_electro, elements)
cn_e, cn_r = smact.neutral_ratios([ox_a, ox_b, ox_c], threshold=1)
if cn_e:
if electroneg_makes_sense:
pauling_count += 1
print(f'{ele_a}{ox_a} \t {ele_b}{ox_b} \t {ele_c}{ox_c}')
print(f'Number of combinations = {pauling_count}')
print(f'--- {time.time() - start_time} seconds to run ---')
For a given search-space of elements, the number of combinations in the output has decreased each time we've applied a filter. However, the time taken to carry out the calculation has increased. This highlights a fundamental trade-off, however there are some clever coding techniqes that can be used to considerably speed things up.
By employing multi-threading and reworking the above code to reduce the number of times it has to look up element properties from other files, the time taken to carry out the ternary count including checks for charge neutrality and electronegativity for 103 elements was reduced from ~ 1 hour to just 26 seconds (carried out on the same workstation)!
Furthermore, a quaternary count for 103 atoms took only 2 hours. This was reducecd to 40 minutes using a 16-core workstation. The code used to carry out these calculations is available in the examples folder of the main smact repository.
N.B. For the numbers quoted above, the stoichiometry of each site was also allowed to vary between 1 and 8 which significantly increases the number of combinations (see below).
A note on multiplicity: In this exercise, we have not considered higher multiplicies (stoichiometries) in our combinations at all, i.e. we have considered only AB, ABC (and ABCD for quaternary) type combinations. When extended to AxByCz... where x,y,z > 1, the numbers involved get considerably larger still. This can be adjusted by setting threshold
in the charge neutrality
function to > 1 in the cells above. The threshold defines the maximum values of x,y,z... . If this is changed, the sum of the oxidation states printed will not always sum to zero, however some multiples (between 1 and threshold
) of them always will.
Finally, some wisdom from our friend, Linus: