Homework #2

This notebook is due on Friday, October 7th, 2016 at 11:59 p.m. Please make sure to get started early, and come by the instructors' office hours if you have any questions. Office hours and locations can be found in the course syllabus. IMPORTANT: While it's fine if you talk to other people in class about this homework - and in fact we encourage it! - you are responsible for creating the solutions for this homework on your own, and each student must submit their own homework assignment.

Your name

Put your name here!


Section 1: Radioactivity wrapup

In this part of the homework, we're going to finish what we started in class: modeling the radioactive count-rate data that you worked with, in order to estimate the strength of the roughly constant radioactive background seen in the count rates.

In class, we discussed that for radioactive material with an initial amount $N_0$ and a half life $t_{1/2}$, the amount left after time t is $N(t) = N_0 2^{-t/t_{1/2}}$. The expected radioactive decay rate is then:

$\mathrm{CR}(t) = - \frac{dN}{dt} = \frac{N_0 \ln 2}{t_{1/2}}2^{-t/t_{1/2}}$

However, the data doesn't agree well with this - there's something contaminating our count rate data that's causing a radioactive "background" that is approximately constant with time. A better estimate of the count rates is more like:

$\mathrm{CR}(t) = \mathrm{CR}_{\mathrm{S}}(t) + \mathrm{CR}_{\mathrm{BG}}$

where $\mathrm{CR}_{\mathrm{S}}(t)$ is the count rate from the sample, which has the shape expected above, and $\mathrm{CR}_{\mathrm{BG}}$ is the count rate from the radioactive background.

We're now going to try to figure out the values that go into the expressions for $\mathrm{CR}_{\mathrm{S}}(t)$ and $\mathrm{CR}_{\mathrm{BG}}$ by using the data. What you're going to do is:

  1. "Smooth" the decay rate data over N adjacent samples in time to get rid of some of the noise. Try writing a piece of code to loop over the array of data and average the sample you're interested in along with the N samples on either side (i.e., from element i-N to i+N, for an arbitrary number of cells). Store this smoothed data in a new array (perhaps using np.zeros_like() to create the new array?).
  2. Plot your smoothed data on top of the noisy data to check that the two agree.
  3. Create a new array using the analytic equation from above that describes the decay rate as a function of time, taking into account what you're seeing in point (2), and try to find the values of the various constants in the equation. Plot the new array on top of the raw data and smoothed values.

Note that code to load the file count_rates.txt has been added below, and puts the data into two numpy arrays as it did in the in-class assignment.


In [ ]:
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

'''
count_times = the time since the start of data-taking when the data was 
               taken (in seconds)
count_rates = the number of counts since the last time data was taken, at 
               the time in count_times
'''

# load the file once and unpack its two rows
data = np.loadtxt("count_rates.txt", dtype=int)
count_times = data[0]
count_rates = data[1]

In [ ]:
# Put your code here - add additional cells if necessary

# number of bins to smooth over
Nsmooth = 100

'''
Create arrays for the smoothed counts.  Given how we're going to
subsample, the new arrays need to be shorter than the originals by
2*Nsmooth elements.  Use numpy's slicing to take times that start
Nsmooth elements into the array and end Nsmooth elements before its
end.  Then make smooth_counts the same size, initialized to zero.
Both should have count_rates.size - 2*Nsmooth elements.
'''

smooth_times = count_times[Nsmooth:-Nsmooth]
smooth_counts = np.zeros_like(smooth_times,dtype='float64')

'''
loop over the count_rates arrays, but starting Nsmooth into 
count_rates and ending Nsmooth prior to the end.  Then, go from
i-Nsmooth to i+Nsmooth and sum those up.  After the loop, we're
going to divide by 2*Nsmooth+1 in order to normalize it (because 
each value of the smoothed array has 2*Nsmooth+1 cells in it).
'''
for i in range(Nsmooth,count_rates.size-Nsmooth):
    for j in range(i-Nsmooth,i+Nsmooth+1):  # the +1 is because it'll then end at i+Nsmooth
        smooth_counts[i-Nsmooth] += count_rates[j]

smooth_counts /= (2.0*Nsmooth+1.0)

# plot noisy counts, smoothed counts, each with their own line types
plt.plot(count_times,count_rates,'b.',smooth_times,smooth_counts,'r-',linewidth=5)

# some guesses for the various parameters in the model
# (which are basically lifted directly from the data)
N0 = 2000.0   #  counts per 5-second bin
half_life = 1712.0  # half life (in seconds)
Nbackground = 292.0   # background counts per 5-second bin

# calculate estimated count rate using the parameters listed above
count_rate_estimate = N0 * 2.0**(-count_times/half_life) + Nbackground

plt.plot(count_times,count_rate_estimate,'c--',linewidth=5)
plt.xlabel('time (seconds)')
plt.ylabel('counts per bin')
plt.title("Counts per 5-second bin")

Question: What are the constants that you came up with for the count rate equation? Do these values make sense given the experimental data? Why or why not?

The values that the students should get are approximately:

  • N0 = 2000 counts per bin (each bin is 5 seconds long)
  • half life = 1712 seconds
  • background count rate = 292 counts per bin (each bin 5 seconds)

Note that I do not expect students to get an answer that's that close - as long as the curve that is produced (dashed line above) gets reasonably close to the smoothed values, it's fine. Also, note that students may show the entire plot in counts/second instead of counts/bin (e.g., 2000 counts per 5-second bin corresponds to 400 counts/second) - both are fine.

The plots make sense given the experimental data - the half life should be somewhere around 2000 seconds from looking at the curve, the background count rate should be somewhere between 200 and 400 counts per bin, and the counts at t=0 are somewhere around 2000-2200 counts/bin after you subtract the background.


Section 2: Great Lakes water levels

The water level in the Great Lakes fluctuates over the course of a year, and also fluctuates in many-year cycles. About two and a half years ago (in Feb. 2014), there was an article in Scientific American describing the historically low levels of the Great Lakes - in particular, that of Lake Michigan and Lake Huron, which together make up the largest body of fresh water in the world. In this part of the homework assignment, we're going to look at water height data from the Great Lakes Environmental Research Laboratory - in particular, data from 1918 to the present day. In the cell below, we use Pandas to load four CSV ("Comma-Separated Value") files with data from Lake Erie, Lakes Michigan and Huron combined, Lake Ontario, and Lake Superior into data frames. Each dataset contains the annual average water level for every year from 1918 to the present. Use these datasets to answer the questions posed below.


In [ ]:
import pandas

erie = pandas.read_csv('erie1918Ann.csv', skiprows=2)
miHuron = pandas.read_csv('miHuron1918Ann.csv', skiprows=2)
ontario = pandas.read_csv('ontario1918Ann.csv', skiprows=2)
superior = pandas.read_csv('superior1918Ann.csv', skiprows=2)

Question 1: Calculate the mean water levels of all of the Great Lakes over the past century (treating Lakes Michigan and Huron as a single body of water). Are all of the values similar? Why does your answer make sense? (Hint: where is Niagara Falls, and what direction does the water flow?)

Answer: Three of the values (Erie, Michigan/Huron, Superior) are all pretty similar (within 9 or 10 meters of each other), but Lake Ontario is about 100 meters lower. The fact that Erie, Michigan/Huron, and Superior have similar mean heights makes sense because they're connected by waterways, so the water can level out. It also makes sense that Ontario is much lower: water flows from Lake Erie into Lake Ontario over Niagara Falls, which represents a very large drop in elevation.


In [ ]:
# Put your code here!

# expect to see students taking the mean value and printing it out!
erie_mean = erie['AnnAvg'].mean()
miHuron_mean = miHuron['AnnAvg'].mean()
ontario_mean = ontario['AnnAvg'].mean()
superior_mean = superior['AnnAvg'].mean()

print('Erie (meters):           ', erie_mean)
print('Michigan/Huron (meters): ', miHuron_mean)
print('Ontario (meters):        ', ontario_mean)
print('Superior (meters):       ', superior_mean)

Question 2: Make a plot where you show the fluctuations of each lake around the mean value from the last century (i.e., subtracting the mean value of the lake's water level from the data of water level over time). In general, do you see similar patterns of fluctuations in all of the lakes? What might this suggest to you about the source of the fluctuations?

Hint: you may want to use pyplot instead of the built-in Pandas plotting functions!

Answer: We do see similar patterns overall, though some of the lakes (Superior, for example) are more stable. This suggests that some regional driver (weather patterns, for example) is causing fluctuations in all of the lakes.


In [ ]:
# Put your code here

# make a plot of the lakes' heights minus the mean values.
lake_erie, = plt.plot(erie['year'],erie['AnnAvg']-erie['AnnAvg'].mean(),'r-')
lake_mi,   = plt.plot(miHuron['year'],miHuron['AnnAvg']-miHuron['AnnAvg'].mean(),'g-')
lake_ont,  = plt.plot(ontario['year'],ontario['AnnAvg']-ontario['AnnAvg'].mean(),'b-')
lake_sup,  = plt.plot(superior['year'],superior['AnnAvg']-superior['AnnAvg'].mean(),'k-')
plt.xlabel('year')
plt.ylabel('value minus historic mean')
plt.title('variation around historic mean for all lakes')
plt.legend( (lake_erie,lake_mi,lake_ont,lake_sup), ('Erie','MI/Huron','Ontario','Superior'),loc='upper left')

Question 3: Finally, let's look at the original issue - the water level of the Lake Michigan+Lake Huron system and how it changes over time. When you examine just the Michigan/Huron data, zooming in on only the last 20 years, does the decrease in water level continue, does it reverse itself, or does it stay the same? In other words, was the low level reported in 2014 something we should continue to be worried about, or was it a fluke?

Answer: The Lake Michigan/Huron water level has reversed itself in the last couple of years and has returned to historically reasonable values, so the low reported in 2014 appears to have been a fluke rather than a continuing trend.


In [ ]:
# Put your code here

# basically the plot from above, but with different x limits.
lake_erie, = plt.plot(erie['year'],erie['AnnAvg']-erie['AnnAvg'].mean(),'r-')
lake_mi,   = plt.plot(miHuron['year'],miHuron['AnnAvg']-miHuron['AnnAvg'].mean(),'g-')
lake_ont,  = plt.plot(ontario['year'],ontario['AnnAvg']-ontario['AnnAvg'].mean(),'b-')
lake_sup,  = plt.plot(superior['year'],superior['AnnAvg']-superior['AnnAvg'].mean(),'k-')
plt.xlabel('year')
plt.ylabel('value minus historic mean')
plt.title('variation around historic mean for all lakes')
plt.legend( (lake_erie,lake_mi,lake_ont,lake_sup), ('Erie','MI/Huron','Ontario','Superior'),loc='upper left')
plt.xlim(1996,2017)

Section 3: Modeling drug doses in the human body

Modeling the behavior of drugs in the human body is very important in medicine. One frequently-used model is called the "Single-Compartment Drug Model", which takes the complex human body and treats it as one homogeneous unit, where drug distribution is instantaneous, the concentration of the drug in the blood (i.e., the amount of drug per volume of blood) is proportional to the drug dosage, and the rate of elimination of the drug is proportional to the amount of drug in the system. Using this model allows the prediction of the range of therapeutic doses where the drug will be effective.

We'll first model the concentration in the body of aspirin, which is commonly used to treat headaches and reduce fever. For adults, it is typical to take one or two 325 mg tablets every four hours, up to a maximum of 12 tablets/day. This dose is assumed to be dissolved immediately into the blood plasma. (An average adult human has about 3 liters of blood plasma.) The concentration of drugs in the blood (represented with the symbol Q) is typically measured in $\mu$g/ml, where 1000 $\mu$g (micrograms) = 1 mg (milligram). For aspirin, the concentration that is effective for relieving headaches is typically between 150 and 300 $\mu$g/ml, and the half-life for removal of the drug from the system is about 3.2 hours (more on that later).
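For example, two 325 mg tablets dissolved in 3 liters (3000 ml) of blood plasma give an initial concentration of

$Q_0 = \frac{2 \times 325\ \mathrm{mg} \times 1000\ \mu\mathrm{g/mg}}{3000\ \mathrm{ml}} \approx 217\ \mu\mathrm{g/ml}$

which is comfortably inside the effective range.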

The rate of removal of aspirin from the body (elimination) is proportional to the amount present in the system:

$\frac{dQ}{dt} = -K Q$

where $Q$ is the concentration, and $K$ is a constant of proportionality related to the half-life of removal of the drug from the system: $K = \ln 2 / t_{1/2} \approx 0.693 / t_{1/2}$.
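For reference, this differential equation has the closed-form solution

$Q(t) = Q_0 e^{-Kt} = Q_0 \, 2^{-t/t_{1/2}}$

where $Q_0$ is the initial concentration; this is handy as a check on any numerical solution.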

Part 1: We're now going to make a simple model of the amount of aspirin in the human body. Assume that an adult human has a headache and takes two 325 mg aspirin tablets. If the drug immediately enters their system, for how long will their headache be relieved? Show a plot, with an appropriate title, x-axis label, and y-axis label, that shows the concentration of aspirin in the patient's blood over a 12-hour time span. In your model, make sure to resolve the time evolution well - make sure that your individual time steps are only a few minutes!

Put your answer immediately below, and the code you wrote (and plots you generated) to solve this immediately below that.

Answer: Their headache will be relieved for about an hour and a half, or perhaps an hour and 45 minutes (precise answer is 1.68 hours). You can tell because that's where the aspirin concentration dips below 150 micrograms/mL.


In [ ]:
#  put your code and plots here!

# concentration (Q) is in units of micrograms/milliliter
# 2 tablets * (325 mg/tablet) / 3000 mL * 1000 micrograms/mg
Q_start = 2.0 * 325./3000.0*1000.  
t_half = 3.2
K = 0.693/t_half

time=0
t_end = 12.0
dt = 0.01

Q = []
t = []

Q_old = Q_start

while time <= t_end:
    
    Q_new = Q_old - K*Q_old*dt
    
    Q.append(Q_new)
    t.append(time)
    
    Q_old = Q_new
    
    time += dt

plt.plot(t,Q,'r-')
plt.plot([0,12],[150,150],'b-')
plt.plot([0,12],[300,300],'b-')
plt.ylim(0,350)
plt.xlabel('time [hours]')
plt.ylabel('concentration [micrograms/mL]')
plt.title('concentration of aspirin over time')
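A quick sanity check on the numerical solution (a sketch, assuming the t and Q lists and the constants from the cell above): overplot the closed-form solution and solve $Q(t) = 150$ for the relief time analytically.

In [ ]:
# overplot the closed-form solution Q(t) = Q_start * 2**(-t/t_half)
t_arr = np.array(t)
plt.plot(t_arr, Q_start * 2.0**(-t_arr/t_half), 'k:', linewidth=2)

# analytic time at which the concentration drops below 150 micrograms/mL:
# solve Q_start * 2**(-t/t_half) = 150 for t
t_relief = t_half * np.log2(Q_start/150.0)
print("analytic relief time: %.2f hours" % t_relief)  # about 1.7 hours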

Part 2: We're now going to model the concentration of a drug that needs to be repeatedly administered - the drug Dilantin, which is used to treat epilepsy. The effective concentration of the drug in humans is 10-20 $\mu$g/ml, the half-life of Dilantin is approximately 22 hours, and the drug comes in 100 mg tablets which are effectively instantaneously released into your bloodstream. For this particular drug, only about 12% of the drug in each dose is actually available for absorption in the bloodstream, meaning that the effective amount added to the blood is 12 mg per dose.
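In concentration terms (assuming the same 3 liters of blood plasma as before), each dose therefore adds

$\Delta Q = \frac{0.12 \times 100\ \mathrm{mg} \times 1000\ \mu\mathrm{g/mg}}{3000\ \mathrm{ml}} = 4\ \mu\mathrm{g/ml}$

to the blood.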

Assuming that the drug is administered every 8 hours to a patient that starts out having none of the drug in their body, make a plot of the drug concentration over a ten day period and use it to answer the following two questions:

  1. How long does it take to reach an effective concentration of the drug in the patient's blood?
  2. By roughly when does the drug concentration reach a steady state in the patient's blood? (In other words, after how long is the concentration neither rising nor falling on average?)

Answer:

Assuming 100 mg per pill and 12% absorption (the corrected version of the homework): (1) we first reach the therapeutic concentration at about 24 hours, and we stay completely above it (i.e., never dip below it again) from around 40 hours onward. (2) We reach steady state somewhere around 100-120 hours (roughly 4-5 days).

Assuming 100 mg per pill and 100% absorption (the original version of the homework): (1) we're always above the effective concentration. (2) We reach steady state somewhere around 120 hours (around 5 days).
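As a back-of-the-envelope check on the steady state (not something students are required to derive): with a dose of $\Delta Q = 4\ \mu$g/ml every $T = 8$ hours and $K = 0.693/22\ \mathrm{hr}^{-1} \approx 0.0315\ \mathrm{hr}^{-1}$, the repeated doses form a geometric series, so the peak concentration approaches

$Q_{\mathrm{peak}} = \frac{\Delta Q}{1 - e^{-KT}} \approx \frac{4}{1 - e^{-0.252}} \approx 18\ \mu\mathrm{g/ml}$

with troughs around $Q_{\mathrm{peak}} e^{-KT} \approx 14\ \mu$g/ml - both inside the 10-20 $\mu$g/ml therapeutic range. The approach to steady state takes a few half-lives ($5 \times 22 \approx 110$ hours), consistent with the 100-120 hour estimate above.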


In [ ]:
#  put your code and plots here!

# concentration (Q) is in units of micrograms/milliliter
# 1 tablet * (100 mg/tablet) / 3000 mL * 1000 micrograms/mg
Q_dosage = 1.0 * 100./3000.0*1000.  
t_half = 22.0
absorption_fraction = 0.12
K = 0.693/t_half

time=0
t_end = 10.0*24.0  # 10 days
dt = 0.01

Q = []
t = []

Q_old = absorption_fraction*Q_dosage

t_dosage = 0.0

dt_dosage = 8.0

while time <= t_end:
    
    if time - t_dosage >= dt_dosage:
        Q_old += absorption_fraction*Q_dosage
        t_dosage = time
    
    Q_new = Q_old - K*Q_old*dt
    
    Q.append(Q_new)
    t.append(time)
    
    Q_old = Q_new
    
    time += dt

plt.plot(t,Q,'r-')

plt.plot([0,250],[10,10],'b-')
plt.plot([0,250],[20,20],'b-')
plt.ylim(0,25)
#plt.xlim(0,50)
plt.xlabel('time [hours]')
plt.ylabel('concentration [micrograms/mL]')
plt.title('concentration of Dilantin over time')

Section 4: Feedback (required!)


In [ ]:
from IPython.display import HTML
HTML(
"""
<iframe
    src="https://goo.gl/forms/Px7wk9DcldfyCqMt2?embedded=true" 
    width="80%" 
    height="1200px" 
    frameborder="0" 
    marginheight="0" 
    marginwidth="0">
    Loading...
</iframe>
"""
)

Congratulations, you're done!

How to submit this assignment

Log into the course Desire2Learn website (d2l.msu.edu) and go to the "Homework assignments" folder. There will be a dropbox labeled "Homework 2". Upload this notebook there.