Build a neural network to predict the magnitude of an earthquake given the date, time, Latitude, and Longitude as features. This is the dataset. Optimize at least one hyperparameter using Random Search. See this example for more information.
You can use any library you like; bonus points are given if you do this using only numpy.
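Before touching the real data, here is a minimal numpy-only sketch of what Random Search over one hyperparameter looks like: sample candidate values at random, score each with a quick training run, and keep the best. The data, the linear model, and the learning-rate range below are all stand-ins, not the actual earthquake setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data standing in for the earthquake features/target (hypothetical).
X = rng.random((100, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + 0.05 * rng.standard_normal(100)

def train_and_score(lr, epochs=200):
    """Fit a linear model with gradient descent; return the final MSE."""
    w = np.zeros(3)
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return np.mean((X @ w - y) ** 2)

# Random Search: sample learning rates log-uniformly and keep the best.
candidates = 10 ** rng.uniform(-4, -0.5, size=20)
best_lr = min(candidates, key=train_and_score)
```

Sampling on a log scale is the usual choice for learning rates, since plausible values span several orders of magnitude.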
In [235]:
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
In [236]:
df = pd.read_csv("data/earthquake-database.csv")
print(df.shape)
df.head()
Out[236]:
We're using date, time, Latitude, and Longitude to predict the magnitude.
In [347]:
#prediction_cols = ["Date", "Time", "Latitude", "Longitude"]
# ignoring time for now
prediction_cols = ["Date", "Latitude", "Longitude"]
x = df[prediction_cols]
x.head()
Out[347]:
y is the target for the prediction:
In [348]:
y = df["Magnitude"]
y.head()
Out[348]:
We need to convert the input data into something better suited for prediction. The date and time are strings, which won't work at all, and latitude and longitude could be normalized.
But first, check to see if the input data has any missing values:
In [350]:
x.info()
There is a value in every row, so we can move ahead. First, convert the date string into a pandas datetime:
In [351]:
x = x.copy()  # work on a copy so we don't get a SettingWithCopyWarning from the view of df
x['Date'] = pd.to_datetime(x['Date'])
x.head()
Out[351]:
In [362]:
x.info()
x['Date'].items()
Out[362]:
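A datetime column still isn't numeric, so the network can't consume it directly. One option (a sketch, not necessarily the approach taken later) is to map each timestamp to its ordinal day count, which can then be normalized like any other column. The two-row frame below is a hypothetical stand-in for `x`:

```python
import pandas as pd

# Hypothetical mini-frame standing in for x after the datetime conversion.
x_demo = pd.DataFrame({"Date": pd.to_datetime(["01/02/1965", "01/04/1965"])})

# Convert each Timestamp to its proleptic-Gregorian ordinal day number,
# turning the column into plain integers.
x_demo["Date"] = x_demo["Date"].map(pd.Timestamp.toordinal)
```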
In [359]:
# normalize the target y
y = (y - y.min()) / (y.max() - y.min())
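The same min-max scaling could be applied to the Latitude and Longitude columns so all inputs share the [0, 1] range. A small sketch with made-up coordinate values standing in for the real columns:

```python
import pandas as pd

def min_max(col):
    """Scale a column to the [0, 1] range."""
    return (col - col.min()) / (col.max() - col.min())

# Hypothetical values standing in for the Latitude/Longitude columns.
demo = pd.DataFrame({"Latitude": [-90.0, 0.0, 90.0],
                     "Longitude": [-180.0, 90.0, 180.0]})
scaled = demo.apply(min_max)
```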
In [304]:
x_train = x[:20000]
y_train = y[:20000]
x_test = x[20000:]
y_test = y[20000:]
len(x_train), len(y_train), len(x_test), len(y_test)
Out[304]:
In [298]:
input_features = 3
output_features = 1
data_length = len(x_train)
In [295]:
# one weight per (input, output) pair -- the shape doesn't depend on the number of rows
weights = np.random.random([input_features, output_features])
weights.shape
Out[295]:
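To show how those weights would be used, here is a minimal numpy sketch of a forward pass and a single gradient-descent step on the MSE loss for a one-layer (linear) network. The batch, targets, and learning rate are hypothetical stand-ins for the real training data:

```python
import numpy as np

rng = np.random.default_rng(0)

# One weight per (input, output) pair -- the weight shape does not
# depend on the number of training rows.
input_features, output_features = 3, 1
W = rng.random((input_features, output_features))
b = np.zeros(output_features)

# Hypothetical batch of 5 normalized rows standing in for x_train/y_train.
X_batch = rng.random((5, input_features))
y_batch = rng.random((5, 1))

loss_before = np.mean((X_batch @ W + b - y_batch) ** 2)

# Forward pass, then one gradient-descent step on the MSE loss.
pred = X_batch @ W + b
grad_W = 2 * X_batch.T @ (pred - y_batch) / len(X_batch)
grad_b = 2 * np.mean(pred - y_batch, axis=0)
W -= 0.1 * grad_W
b -= 0.1 * grad_b

loss_after = np.mean((X_batch @ W + b - y_batch) ** 2)
```

One such step should lower the loss on the batch; repeating it over epochs is the whole training loop.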
In [323]:
# testing how to loop through the data
t = x[:10]
for _, row in t.iterrows():
    print(row.iloc[0], '|', row.iloc[1], '|', row.iloc[2])
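For actual training, iterating row by row with `iterrows()` is slow; handing the whole frame to numpy at once is the more idiomatic route. A sketch with a hypothetical mini-frame standing in for the first rows of `x`:

```python
import pandas as pd

# Hypothetical mini-frame standing in for the first rows of x.
t_demo = pd.DataFrame({"Date": [1.0, 2.0],
                       "Latitude": [19.2, 1.8],
                       "Longitude": [145.6, 127.3]})

# .to_numpy() converts the frame to a single 2-D array in one call,
# avoiding the per-row Series construction that iterrows() incurs.
rows = t_demo.to_numpy()
```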
In [ ]: