Supplemental information for Assignment 5 (Strawberry Creek Discharge assignment)

Here we will look at two different ways to code the discharge calculation for the cross-sectional area method. Keep in mind that these techniques also apply to the dilution gaging exercise, where, instead of calculating and adding incremental discharge values, you will be calculating and adding incremental fluxes.

Example setup

Let's say that we have divided a stream cross-section into thirds, and that we have velocity and depth measurements at two points across the stream (i.e., the interior edges of the three sections), as shown on the diagram in the assignment text and reproduced below.

The data table in the assignment is reproduced in example.csv. Make sure you specify the correct directory when the file is read in below.

First, as always, let us load in the libraries that we need...


In [1]:
# Import numerical tools
import numpy as np

#Import pandas for reading in and managing data
import pandas as pd

# Import pyplot for plotting
import matplotlib.pyplot as plt

#Import seaborn (useful for plotting)
import seaborn as sns

# Magic function to make matplotlib inline; other style specs must come AFTER
%matplotlib inline

%config InlineBackend.figure_formats = {'svg',}
#%config InlineBackend.figure_formats = {'png', 'retina'}

Now let's read in and examine the datafile...


In [2]:
# Use pd.read_csv() to read in the data and store in a DataFrame
fname = 'example.csv' #If you have not saved the file in your "current" directory, you will
#need to specify the full path here
df = pd.read_csv(fname)
df.head()


Out[2]:
   Position on tape (m)  Station number on current meter  Depth (cm)  Velocity (cm/s)  Notes
0                   0.0                              NaN           0              0.0  LEW
1                   0.2                              1.0           5              0.7  BND poor, measurement repeated
2                   0.4                              2.0          10              1.1  NaN
3                   0.6                              NaN           0              0.0  REW

Solving for discharge

We will now solve for discharge in two ways: 1) using a for loop, and 2) using array operations. Both give the same answer, but the second is much more computationally efficient because NumPy carries out the arithmetic on whole arrays in compiled code instead of stepping through a Python loop. You will not notice a difference in run time for this small dataset, but it would be substantial for a much larger one.

But first, let's read in the columns that we need from the data table above, as this is a step that needs to be taken for both methods:


In [4]:
y = df['Position on tape (m)'] #Define the y-coordinate as the distance across the stream from LEW.
d = df['Depth (cm)']
v = df['Velocity (cm/s)']
#Now we have three pandas Series (one-dimensional labeled arrays): position, depth, and velocity

Using FOR loops

Method 1A

Here we will calculate incremental discharge, section by section, and add it to our running total discharge, Q_run.


In [6]:
#Initialize the running discharge.
Q_run = 0
for i in range(len(y)-1): #Remember this index starts at 0 and will end at 2. We added the
    #'-1' because each polygon for which we are calculating incremental discharge has two edges,
    #so there is one fewer polygon than there are rows in our table.
    
    #Below, we calculate incremental discharge as average depth*average velocity*width of increment
    #(converted to cm from m)
    Q_incremental = (d[i]+d[i+1])/2*(v[i]+v[i+1])/2*(y[i+1]-y[i])*100 # [cm^3/s]
    Q_run = Q_run + Q_incremental #Add the incremental discharge to the running total
    
Q = Q_run #Final discharge = running total discharge after moving through every polygon in the 
#cross-section
print(Q)


207.5

Here is a common coding mistake to avoid. Notice where the parentheses close in the for statement above. A common mistake is to write for i in range(len(y-1)): instead. This is wrong because Python first evaluates y-1, which subtracts 1 from every position and returns [-1.0, -0.8, -0.6, -0.4], and then takes the length of that result, which is still 4. range(4) then gives the values 0 through 3, one more than we intended to loop through, and on the last pass d[i+1] would ask for a fifth row that does not exist. With the code written correctly, as in the cell above, len(y) is evaluated first (returning 4) and then 1 is subtracted to give range(3), which yields the values 0 through 2.
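The difference between the two expressions can be checked directly. The short sketch below types the positions from the example table into a NumPy array by hand (rather than reading example.csv), and the names n_correct and n_wrong are made up for this illustration.

```python
import numpy as np

# Positions across the stream (m), copied from the example table
y = np.array([0.0, 0.2, 0.4, 0.6])

n_correct = len(y) - 1   # len(y) is 4, and 4 - 1 gives 3 polygons
n_wrong = len(y - 1)     # y - 1 is still a 4-element array, so this is 4

print(list(range(n_correct)))  # loop indices with the correct parentheses
print(list(range(n_wrong)))    # loop indices with the misplaced parenthesis
```

The second list contains one extra index, which is the one that would push d[i+1] past the end of the table.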

Method 1B

Here we will save each incremental discharge in an array, and then add all of the incremental discharges together at the end (rather than within the for loop) to get total discharge.


In [11]:
# Initialize the array of incremental discharges.
Q_incremental = np.zeros(len(y)-1) #Again, this will be one element shorter than your data table,
#since each row in the data table represents an edge of a polygon, while each incremental
#discharge applies to a polygon itself.
for i in range(len(y)-1): 
    Q_incremental[i] = (d[i]+d[i+1])/2*(v[i]+v[i+1])/2*(y[i+1]-y[i])*100 # [cm^3/s]
    
#Now, outside of the for loop, add all of the incremental discharges together.
Q = np.sum(Q_incremental)
print(Q)


207.5
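One advantage of keeping the incremental values is that each polygon's contribution can be checked by hand. The sketch below repeats the calculation with plain Python lists typed in from the example table, so it runs without the CSV file. The names Q_parts, mean_depth, mean_velocity, and width_cm are made up for this illustration.

```python
# Values copied from the example table (one entry per row)
y = [0.0, 0.2, 0.4, 0.6]   # position on tape (m)
d = [0, 5, 10, 0]          # depth (cm)
v = [0.0, 0.7, 1.1, 0.0]   # velocity (cm/s)

Q_parts = []  # one incremental discharge per polygon
for i in range(len(y) - 1):
    mean_depth = (d[i] + d[i + 1]) / 2       # cm
    mean_velocity = (v[i] + v[i + 1]) / 2    # cm/s
    width_cm = (y[i + 1] - y[i]) * 100       # m converted to cm
    Q_parts.append(mean_depth * mean_velocity * width_cm)  # cm^3/s
    print(f"polygon {i}: {Q_parts[-1]:.1f} cm^3/s")

print(f"total: {sum(Q_parts):.1f} cm^3/s")
```

Working the same numbers on paper gives 2.5 x 0.35 x 20 = 17.5, 7.5 x 0.9 x 20 = 135, and 5 x 0.55 x 20 = 55 cm^3/s, which add up to the 207.5 cm^3/s printed above.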

Using array operations

Finally, we can avoid using for loops altogether by operating on whole arrays at once. First, we want to convert our data frame columns to numpy arrays. This is needed because a slice of a data frame column keeps its original row labels, and pandas matches values by label, not by position, when two slices are combined.
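To see why the conversion matters, here is a small sketch using only the depth column, typed in by hand from the example table. It uses .iloc to make the position-based slicing explicit, and the names aligned and positional are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Depths copied from the example table (cm); row labels are 0, 1, 2, 3
d = pd.Series([0, 5, 10, 0])

left = d.iloc[:-1]    # rows 0-2, which keep the labels 0, 1, 2
right = d.iloc[1:]    # rows 1-3, which keep the labels 1, 2, 3

# Series arithmetic matches rows by label: labels 0 and 3 appear in only
# one slice and become NaN, while labels 1 and 2 are added to themselves.
aligned = left + right
print(aligned.tolist())

# NumPy arrays carry no labels, so the addition is purely positional:
# the left edge of each polygon plus its right edge.
positional = np.array(left) + np.array(right)
print(positional.tolist())
```

Only the second result pairs each polygon's left edge with its right edge, which is what the discharge calculation needs.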


In [33]:
#First, create arrays that represent the velocity, depth, and position on the right and left
#edges of each polygon
d_left = np.array(d[0:len(d)-1])
d_right = np.array(d[1:len(d)])
v_left = np.array(v[0:len(v)-1])
v_right = np.array(v[1:len(v)])
y_left = np.array(y[0:len(y)-1])
y_right = np.array(y[1:len(y)])

#Now create a new array of incremental discharge using mathematical operations that act on
#all rows of the above arrays at once. With array operations, the new variable that you are
#creating does not need to be initialized first, unlike in the for loop of Method 1B.
Q_incremental = (d_right+d_left)/2*(v_right+v_left)/2*(y_right-y_left)*100

#Now add the incremental discharges together.
Q = np.sum(Q_incremental)
print(Q)


207.5
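To get a feel for the run-time difference on a larger dataset, the sketch below times both approaches on a synthetic cross-section. The values are random numbers invented purely for timing (they are not field data), the size n = 100,000 is an arbitrary choice, and the measured times will vary from machine to machine.

```python
import time
import numpy as np

# Synthetic cross-section (made-up values, for timing only)
rng = np.random.default_rng(0)
n = 100_000
y = np.linspace(0.0, 50.0, n)     # position (m)
d = rng.uniform(0.0, 100.0, n)    # depth (cm)
v = rng.uniform(0.0, 50.0, n)     # velocity (cm/s)

# Loop version, as in Method 1A
t0 = time.perf_counter()
Q_loop = 0.0
for i in range(n - 1):
    Q_loop += (d[i] + d[i+1]) / 2 * (v[i] + v[i+1]) / 2 * (y[i+1] - y[i]) * 100
t_loop = time.perf_counter() - t0

# Array version: the slices [:-1] and [1:] give the left and right edges
t0 = time.perf_counter()
Q_array = np.sum((d[:-1] + d[1:]) / 2 * (v[:-1] + v[1:]) / 2 * (y[1:] - y[:-1]) * 100)
t_array = time.perf_counter() - t0

print(f"loop:  {Q_loop:.6g} cm^3/s in {t_loop:.4f} s")
print(f"array: {Q_array:.6g} cm^3/s in {t_array:.4f} s")
```

The slices d[:-1] and d[1:] used here are a shorter way of writing d[0:len(d)-1] and d[1:len(d)]. The two totals should agree to within floating-point rounding.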

Voila! You should see that all total discharge values are the same! Now go forth and apply your new learning to Strawberry Creek!