If a cell begins with DNC: do not change it and leave the markdown there so I can expect a basic level of organization that is common to all HW (will help me with grading). This also clearly delineates the sections for me

DNC: preamble leave any general comments here and, in keeping with good practice, I suggest you load all needed modules in the preamble

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

DNC: Begin Part 1

Part 1: Data Visualization

1-1: Plotting x-y data

  • Use the HCEPDB file to create a single 2x2 composite plot with four panels (not 4 separate figures). The plots should contain the following data

    • Upper-left: PCE vs VOC
    • Upper-right: PCE vs JSC
    • Lower-left: E_HOMO vs VOC
    • Lower-right: E_LUMO vs PCE
  • You should make the plots the highest quality possible and, in your judgement, ready for inclusion in a formal report or publication.

  • After you finish making the plot, add a markdown cell below it with the following information

    • There are five terms above from the HCEPDB that relate to photovoltaic materials - define them as they pertain to molecules that could be used for energy conversion applications

    • Briefly explain the changes you made from the default plot and why you made them

1-2: Contour plotting

  • Use the ALA2fes.dat file to create a contour plot of the alanine dipeptide $\Phi$ vs $\Psi$ free-energy surface. Guidelines and information:
    • The energy scale in the data input file is in kJ/mol and the free-energy surface (FES) was collected at a temperature of 300 K.
    • You should create a contour plot that draws contour lines spaced every kT in energy and stops drawing contours once all of the features can be clearly seen.
      • This is a slightly different visualization from what we drew in class, which used shaded coloring to draw the contours
    • Annotate the cell so I can follow all the steps you are doing. The final energy plot need not be in kJ/mol (you can convert it to other energy units or use units of kT if you prefer).

1-1 Plot

In [2]:
data = pd.read_csv('HCEPD_100K.csv')

In [3]:
data.head()

id SMILES_str stoich_str mass pce voc jsc e_homo_alpha e_gap_alpha e_lumo_alpha tmp_smiles_str
0 655365 C1C=CC=C1c1cc2[se]c3c4occc4c4nsnc4c3c2cn1 C18H9N3OSSe 394.3151 5.161953 0.867601 91.567575 -5.467601 2.022944 -3.444656 C1=CC=C(C1)c1cc2[se]c3c4occc4c4nsnc4c3c2cn1
1 1245190 C1C=CC=C1c1cc2[se]c3c(ncc4ccccc34)c2c2=C[SiH2]... C22H15NSeSi 400.4135 5.261398 0.504824 160.401549 -5.104824 1.630750 -3.474074 C1=CC=C(C1)c1cc2[se]c3c(ncc4ccccc34)c2c2=C[SiH...
2 21847 C1C=c2ccc3c4c[nH]cc4c4c5[SiH2]C(=Cc5oc4c3c2=C1... C24H17NOSi 363.4903 0.000000 0.000000 197.474780 -4.539526 1.462158 -3.077368 C1=CC=C(C1)C1=Cc2oc3c(c2[SiH2]1)c1c[nH]cc1c1cc...
3 65553 [SiH2]1C=CC2=C1C=C([SiH2]2)C1=Cc2[se]ccc2[SiH2]1 C12H12SeSi3 319.4448 6.138294 0.630274 149.887545 -5.230274 1.682250 -3.548025 C1=CC2=C([SiH2]1)C=C([SiH2]2)C1=Cc2[se]ccc2[Si...
4 720918 C1C=c2c3ccsc3c3[se]c4cc(oc4c3c2=C1)C1=CC=CC1 C20H12OSSe 379.3398 1.991366 0.242119 126.581347 -4.842119 1.809439 -3.032680 C1=CC=C(C1)c1cc2[se]c3c4sccc4c4=CCC=c4c3c2o1
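
As a quick sanity check on the columns shown above, the sketch below tests whether e_gap_alpha equals e_lumo_alpha minus e_homo_alpha for these five rows. Reading e_gap_alpha as the HOMO–LUMO gap is an assumption here, not something the file states; the numbers are copied from the rows above.

```python
# (e_homo_alpha, e_gap_alpha, e_lumo_alpha) copied from the five rows shown above.
rows = [
    (-5.467601, 2.022944, -3.444656),
    (-5.104824, 1.630750, -3.474074),
    (-4.539526, 1.462158, -3.077368),
    (-5.230274, 1.682250, -3.548025),
    (-4.842119, 1.809439, -3.032680),
]

# Assumption: e_gap_alpha is the HOMO-LUMO gap, i.e. e_lumo_alpha - e_homo_alpha.
max_error = max(abs((lumo - homo) - gap) for homo, gap, lumo in rows)

# The tolerance allows for rounding in the six printed decimals.
gap_is_consistent = max_error < 1e-5
print(gap_is_consistent)
```

If the check holds, the gap column carries no information beyond the two orbital energies, which is why only E_HOMO and E_LUMO are needed for the plots.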

In [4]:
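#Create a single 2x2 composite plot.
#ref: https://plot.ly/matplotlib/subplots/
fig = plt.figure()

#Upper-left: PCE vs VOC. The "," marker draws each point as a single pixel.
ax1 = fig.add_subplot(221)
ax1.plot(data.voc, data.pce, ',')
ax1.set_title('PCE vs VOC')

#Upper-right: PCE vs JSC.
ax2 = fig.add_subplot(222)
ax2.plot(data.jsc, data.pce, ',')
ax2.set_title('PCE vs JSC')

#Lower-left: E_HOMO vs VOC.
ax3 = fig.add_subplot(223)
ax3.plot(data.voc, data.e_homo_alpha, ',')
ax3.set_title('$E_{HOMO}$ vs VOC')

#Lower-right: E_LUMO vs PCE.
ax4 = fig.add_subplot(224)
ax4.plot(data.pce, data.e_lumo_alpha, ',')
ax4.set_title('$E_{LUMO}$ vs PCE')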

1-1 Information

  • Five Terms:

    • PCE: Power conversion efficiency. The fraction of the incident sunlight power that is converted to electrical power.
    • VOC: Open-circuit voltage. The output voltage of a photovoltaic (PV) cell under no load.
    • JSC: Short-circuit current density. The current per unit area through the solar cell when the voltage across the cell is zero.
    • E_HOMO: Energy of the highest occupied molecular orbital.
    • E_LUMO: Energy of the lowest unoccupied molecular orbital. (The HOMO–LUMO gap is the energy difference between the HOMO and the LUMO.)
  • Changes I made:

    • Changed the figure size so that all labels and titles display clearly.
    • Used subplots to arrange the four panels in a 2x2 figure.
    • Used the "," marker to display each point.
    • Gave every panel axis labels and a title.
    • Changed the x and y limits so that the points at 0 show clearly.
    • Added grid lines.
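
A minimal sketch of how the changes listed above could be applied to one panel. The data are synthetic stand-ins (hypothetical values, not taken from the HCEPDB), and the figure size and axis limits are illustrative assumptions rather than the values used for the homework figure.

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt

# Synthetic stand-in data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
voc = rng.uniform(0.0, 1.5, 1000)
pce = rng.uniform(0.0, 10.0, 1000)

# Larger figure so labels and titles of the four panels do not overlap.
fig, axes = plt.subplots(2, 2, figsize=(10, 10))

# Only the upper-left panel is filled in here; the other three follow the same pattern.
ax = axes[0, 0]
ax.plot(voc, pce, ',')      # "," draws each point as a single pixel
ax.set_title('PCE vs VOC')
ax.set_xlabel('VOC')
ax.set_ylabel('PCE')
ax.set_xlim(-0.1, 1.6)      # small margin so points at 0 are not hidden by the axis
ax.set_ylim(-0.5, 10.5)
ax.grid(True)
```

The pixel marker keeps a dense scatter readable, and the negative lower limits move points that sit exactly at 0 off the axis line.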

1-2 Contour plot

In [5]:
#Read the file first.
data2 = pd.read_csv('ALA2fes.dat', sep=r'\s+', comment='#', names=['phi','psi','file.free','der_phi','der_psi'])

In [6]:
#Take a look at the data.
data2.head()

phi psi file.free der_phi der_psi
0 -3.141593 -3.141593 4.838919 -20.089106 7.045216
1 -3.015929 -3.141593 2.726201 -14.287727 7.912178
2 -2.890265 -3.141593 1.471803 -6.391128 8.497256
3 -2.764602 -3.141593 1.242827 1.404448 9.011949
4 -2.638938 -3.141593 1.839741 6.653277 9.628476

In [7]:
#We should know how many rows there are before making the contour plot.
data2.shape

(2500, 5)
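
A row count of 2500 is consistent with a 50x50 grid of (phi, psi) points stored one point per row, with phi changing fastest as in the first rows of the file. The sketch below uses a hypothetical 3x4 grid (an assumption standing in for the real 50x50 one) to show that reshaping such flattened columns recovers the 2D grids that a contour plot needs.

```python
import numpy as np

# Hypothetical small grid standing in for the 50x50 one.
phi_axis = np.linspace(-np.pi, np.pi, 4)
psi_axis = np.linspace(-np.pi, np.pi, 3)

# Flattened columns ordered like the rows of the data file: phi varies fastest.
phi_col = np.tile(phi_axis, psi_axis.size)
psi_col = np.repeat(psi_axis, phi_axis.size)

# Reshape to (number of psi values, number of phi values).
PHI = phi_col.reshape(psi_axis.size, phi_axis.size)
PSI = psi_col.reshape(psi_axis.size, phi_axis.size)

# The result matches the grids that np.meshgrid builds directly.
PHI_ref, PSI_ref = np.meshgrid(phi_axis, psi_axis)
grid_matches = np.array_equal(PHI, PHI_ref) and np.array_equal(PSI, PSI_ref)
print(grid_matches)
```

With this ordering, each row of the reshaped arrays holds one psi value and each column holds one phi value.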

In [8]:
#Because it has 2500 rows, reshape the data into a 50x50 matrix.
N = 50
M = 50

#kT at 300 K is approximately 2.5 kJ/mol, so dividing by it converts the FES from kJ/mol to kT.
kT = 2.5

#Use .values.reshape to avoid the FutureWarning raised by np.reshape on a pandas Series.
X = data2.psi.values.reshape(N,M)
Y = data2.phi.values.reshape(N,M)
Z = ((data2['file.free']-data2['file.free'].min())/kT).values.reshape(N,M)

#Contour lines are spaced every 1 kT.
#Levels should cover the whole FES, so I pick lines=42.
spacer = 1
lines = 42

levels = np.linspace(0,lines*spacer,num=(lines+1),endpoint=True)

fig2 = plt.figure(figsize=(5,5))
axes = fig2.add_subplot(111)

#Draw the contour lines (no shading).
plt.contour(X, Y, Z, levels=levels)

#Give plot title and labels.
plt.title(r'$\Phi$ vs $\Psi$ on free energy surface')
plt.xlabel(r'$\Psi$ (rad)')
plt.ylabel(r'$\Phi$ (rad)')
plt.colorbar().ax.set_ylabel('FES (kT)')

<matplotlib.text.Text at 0x10ead7f28>
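
The value of about 2.5 kJ/mol for kT at 300 K used above can be checked directly. The sketch below works per mole, so it uses the gas constant R in place of the Boltzmann constant; the helper name kj_per_mol_to_kT is hypothetical and introduced only for illustration.

```python
# Gas constant in kJ/(mol K); using R instead of k_B gives kT per mole.
R_KJ_PER_MOL_K = 8.314462618e-3
T_KELVIN = 300.0  # temperature stated in the assignment

kT = R_KJ_PER_MOL_K * T_KELVIN  # kJ/mol, close to 2.5


def kj_per_mol_to_kT(energy_kj_per_mol, kT_value=kT):
    """Convert an energy from kJ/mol to units of kT (hypothetical helper)."""
    return energy_kj_per_mol / kT_value


print(round(kT, 3))
```

Rounding kT to 2.5 kJ/mol changes the contour spacing by well under one percent, which is not visible on the plot.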

DNC: Begin Part 2

Part 2: Molecular Visualization

2-1: Molecules

  • Use the program Avogadro to create a 3D visualization of the ALA2 molecule (hint: I showed it in one of my lecture slides and labeled the relevant angles)

  • Show the molecule from at least two orientations.

  • Embed the image in your notebook using markdown. Please use a local copy of the image so I can see the source file

2-2: Materials

  • Use the program Avogadro to create a 3D visualization of the unit cell of anatase titanium dioxide. If possible, create it in slab form showing several unit cells in the x/y plane

  • Embed the image in a markdown cell and write out a list of concise instructions in a numbered markdown list - it should be clear enough that an undergrad science major can follow the instructions and succeed.

2-1 ALA2_1

2-1 ALA2_2

2-2 Anatase Titanium Dioxide

Anatase titanium dioxide shown with 3x3x3 unit cells.


  1. Open Avogadro and click File>Crystal>Oxides>TiO2 Anatase. You will then see a single unit cell of TiO2.
  2. Click View>Crystal View Options. You will see a column named "Unit Cell Repeats". Change the values of A, B, and C to set the number of unit cells repeated along x, y, and z, respectively.
