In [39]:
import numpy as np
import pandas as pd
from numpy import array
from scipy.signal import argrelextrema
In [40]:
# read in dataset
xl = pd.ExcelFile("data/130N_Cycles_1-47.xlsx")
df = xl.parse("Specimen_RawData_1")
df
Out[40]:
This is what the dataset currently looks like: it has 170,101 rows and two columns.
The dataset contains data from 47 cycles of an experiment. The output of those cycles forms the two columns: Time and Load.
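A quick sanity check on the shape and column names (a small sketch of my own, using the df loaded above):
In [ ]:
# confirm the row/column counts and the column names described above
print(df.shape)             # expected: (170101, 2)
print(df.columns.tolist())  # expected: ['Time', 'Load']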
In [41]:
# append data from the Time column to a list
time = []
for item in df.index:
    time.append(df["Time"][item])

# append data from the Load column to a list
load = []
for item in df.index:
    load.append(df["Load"][item])
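For reference, the same two lists can be built more directly with pandas (a sketch, assuming the Time and Load column names used above):
In [ ]:
# pandas converts a column to a Python list in a single call
time = df["Time"].tolist()
load = df["Load"].tolist()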
In [42]:
# convert time array to np array for further processing
np_time = array(time)
In [43]:
# for local maxima (strict comparison)
time_maxima = argrelextrema(np_time, np.greater)
print("local maxima array for time is:", time_maxima, "\n")

# for local minima (strict comparison)
time_minima = argrelextrema(np_time, np.less)
print("local minima array for time is:", time_minima)
The arrays above are actually empty...
After further research into SciPy's argrelextrema (https://github.com/scipy/scipy/issues/3749), I realized that np.greater and np.less do NOT treat repeated values as relative extrema.
A strict inequality must hold on both sides of a point, so a value that merely ties its neighbours is never flagged.
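A small toy example (my own illustration, not taken from the dataset) shows the effect: with a flat peak, the strict comparator returns nothing, while the _equal variant flags the plateau.
In [ ]:
# toy array with a flat peak (plateau) at the value 3
toy = np.array([1, 2, 3, 3, 2, 1])
print(argrelextrema(toy, np.greater))        # empty: the plateau fails the strict test
print(argrelextrema(toy, np.greater_equal))  # flags both plateau indices (2 and 3)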
In [44]:
# for local maxima
max_ = argrelextrema(np_time, np.greater_equal)
print("local maxima array for time is:", max_, "\n")
# for local minima
min_ = argrelextrema(np_time, np.less_equal)
print("local minima array for time is:", min_)
I switched to the _equal comparators (np.greater_equal and np.less_equal) in argrelextrema and noticed that no duplicate indices occurred, which is a good sign so far.
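A quick check of that claim (a sketch, assuming argrelextrema's usual return format of a tuple holding one index array):
In [ ]:
# True if the returned index arrays contain no repeated indices
print(np.size(max_[0]) == np.unique(max_[0]).size)
print(np.size(min_[0]) == np.unique(min_[0]).size)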
In [45]:
print("The length of the max array for time is:",np.size(max_), "\n")
print("The length of the min array for time is:",np.size(min_), "\n")
However, it's odd that each array contains only 15 indices, considering there are 47 cycles in the dataset. So I will try another method instead (https://docs.scipy.org/doc/numpy/reference/generated/numpy.r_.html).
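np.r_ simply concatenates along the first axis, which is how the recipe below pads the element-wise comparisons back to the full array length. A toy example of my own:
In [ ]:
# np.r_ concatenates its arguments row-wise (a scalar True is prepended here)
print(np.r_[True, np.array([1, 2]) < np.array([2, 1])])  # -> [ True  True False]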
In [46]:
# boolean mask: True where an element is strictly smaller than both of its
# neighbours (the endpoints are padded with True), i.e. a strict local minimum
row_wise_merging = np.r_[True, np_time[1:] < np_time[:-1]] & np.r_[np_time[:-1] < np_time[1:], True]
In [47]:
# print the size of row_wise_merging
size = np.size(row_wise_merging)
print(size)
print(row_wise_merging)
In [48]:
# count how many entries of the mask are True
count = 0
for i in np.nditer(row_wise_merging):
    if i:
        count += 1
print(count)
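For reference, the same count (and the matching indices) can be obtained with vectorized NumPy calls, using the row_wise_merging mask defined above:
In [ ]:
# vectorized equivalents: count the True entries and list their positions
print(np.count_nonzero(row_wise_merging))
print(np.flatnonzero(row_wise_merging))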
The output of this np.r_-based mask does not prove to be helpful either.
Since I am not yet familiar with computing local maxima and minima manually, I will need to do further research on that (a first sketch follows below).
For now, I plan to proceed with other calculations based on the array returned from argrelextrema. Once I retrieve the right array for time, I believe I can modify the algorithm presented below.
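As a starting point for that research, here is a minimal sketch of a manual scan (my own, unvalidated attempt): it flags an index as a maximum or minimum when the value is at least as large or as small as both of its neighbours, mirroring the _equal comparators above.
In [ ]:
def local_extrema(a):
    """Return (maxima, minima) index lists for a 1-D sequence a."""
    maxima, minima = [], []
    for i in range(1, len(a) - 1):
        if a[i] >= a[i - 1] and a[i] >= a[i + 1]:
            maxima.append(i)
        if a[i] <= a[i - 1] and a[i] <= a[i + 1]:
            minima.append(i)
    return maxima, minima

manual_max, manual_min = local_extrema(np_time)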
In [49]:
# place the indices returned from the local maxima search into a list
local_max_indices = []
for idx in np.nditer(max_):
    local_max_indices.append(idx)
print(local_max_indices)
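A more direct way to obtain the same list, assuming argrelextrema's usual return format (a tuple holding one index array):
In [ ]:
# pull the index array out of the tuple and convert it to a plain list
local_max_indices = max_[0].tolist()
print(local_max_indices)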
In [50]:
# create a list of the sums of time and load up to
# each index in the local_max_indices list
concat_data = []
for idx, (t, l) in enumerate(zip(time, load)):
    # print(idx, t, l)
    for item in local_max_indices:
        if idx == item:
            concat_data.append((sum(time[:idx]), sum(load[:idx])))

for item in range(len(concat_data)):
    print("Cycle", item)
    print("Time:", concat_data[item][0])
    print("Load:", concat_data[item][1])
    print("\n")
As mentioned before, the results above are unrealistic: we know there are 47 cycles, not the 15 that were output. Once the local maxima function returns the correct values, I can proceed with modifying my code for that array.
My next step is to implement an algorithm that makes the actual predictions.