Vanilla rain prediction

Currently this is a quick-and-dirty first pass that uses only some of the available columns.


In [2]:
import pandas as pd
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt



In [11]:
historic = pd.read_csv("2016_2.csv", header=0) #2016_2.csv is the heavily cut version of 2016.csv
rain = pd.read_csv("2016_labels.csv", header=0)
rain2 = pd.get_dummies(rain)

In [12]:
rain


Out[12]:
0.8
0 0.0
1 0.0
2 0.0
3 0.0
4 0.0
5 0.0
6 2.8
7 2.0
8 18.4
9 0.2
10 4.4
11 0.4
12 0.2
13 1.8
14 0.8
15 0.0
16 2.2
17 0.0
18 0.0
19 0.4
20 0.0
21 0.0
22 0.0
23 0.0
24 0.2
25 0.2
26 1.0
27 0.0
28 0.0
29 2.6
... ...
333 0.0
334 0.0
335 0.6
336 3.0
337 4.6
338 0.0
339 0.6
340 0.0
341 0.2
342 14.6
343 3.4
344 0.0
345 0.4
346 4.0
347 3.0
348 7.4
349 0.8
350 0.0
351 0.0
352 1.4
353 0.0
354 0.0
355 5.4
356 0.0
357 19.8
358 0.0
359 0.0
360 6.0
361 0.4
362 1.8

363 rows × 1 columns
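
Two things worth flagging in this output: the column is named "0.8", which suggests 2016_labels.csv has no header row (header=0 promoted the first value to a column name), and rain has 363 rows while historic below has 366, so the two need aligning before training. Also, pd.get_dummies leaves numeric columns untouched by default, so rain2 above is not the [None, 2] wet/dry matrix the model below expects. A minimal sketch of one way to build it (the column name rain_mm is made up for illustration):

rain = pd.read_csv("2016_labels.csv", header=None, names=["rain_mm"])
wet = (rain["rain_mm"] > 0).astype(np.float32)   #1.0 on any day with precipitation
labels = np.column_stack([wet, 1.0 - wet])       #one [wet, dry] one-hot row per day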


In [5]:
#as before, x contains the data of the training set (need to decide which columns should be kept; one possible pick is sketched after this cell)
x = tf.placeholder(tf.float32, [None, 6])

n = 2    #number of days back the algo can see (not currently implemented)

#weights for the two states, precipitation vs dry (could easily add an additional state for rain vs snow)
W = tf.Variable(tf.zeros([6, 2]))
b = tf.Variable(tf.zeros([2]))

#predicted precipitation result
y = tf.nn.softmax(tf.matmul(x, W) + b)
#actual result
y_ = tf.placeholder(tf.float32, [None, 2])

#note: tf.log(y) can produce NaNs if y ever hits exactly 0; tf.nn.softmax_cross_entropy_with_logits is the numerically safer form
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
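
The six feature columns haven't actually been chosen yet. A minimal sketch of one plausible pick, using column names visible in the Out[6] display further down (the gust column contains entries like "<31", hence the coercion; this is an assumption, not the final feature set):

feature_cols = ["Max Temp (°C)", "Min Temp (°C)", "Mean Temp (°C)",
                "Heat Deg Days (°C)", "Cool Deg Days (°C)", "Spd of Max Gust (km/h)"]
features = historic[feature_cols].apply(pd.to_numeric, errors="coerce")  #"<31" etc. become NaN
features = features.fillna(features.mean()).values.astype(np.float32)    #fill gaps with column means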

In [ ]:
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
#The actual training is defined here. The algorithm being used is gradient descent, although this may be changed 
#depending on user preference/the task at hand. 

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
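
Note that nothing above actually runs the optimizer on the rain data yet. A minimal sketch of the missing loop, assuming the features and labels arrays from the earlier sketches (the row counts differ, 366 vs 363, so they're crudely truncated to match; the dataset is small enough to feed whole rather than in batches, and evaluating on the training set only measures fit, not generalization):

n_rows = min(len(features), len(labels))   #crude row alignment
feats, labs = features[:n_rows], labels[:n_rows]

for i in range(1000):
    sess.run(train_step, feed_dict={x: feats, y_: labs})

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: feats, y_: labs}))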

In [6]:
historic


Out[6]:
Date/Time Month Day Max Temp (°C) Min Temp (°C) Mean Temp (°C) Heat Deg Days (°C) Cool Deg Days (°C) Total Rain (mm) Total Rain Flag Total Snow (cm) Total Snow Flag Total Precip (mm) Total Precip Flag Snow on Grnd (cm) Dir of Max Gust (10s deg) Spd of Max Gust (km/h)
0 2016-01-01 1 1 -0.3 -4.2 -2.3 20.3 0.0 0.0 NaN 0.8 NaN 0.8 NaN 0.0 25.0 46
1 2016-01-02 1 2 0.3 -4.3 -2.0 20.0 0.0 0.0 NaN 0.0 T 0.0 T 0.0 23.0 44
2 2016-01-03 1 3 1.6 -11.6 -5.0 23.0 0.0 0.0 NaN 0.0 T 0.0 T 0.0 34.0 54
3 2016-01-04 1 4 -11.2 -15.4 -13.3 31.3 0.0 0.0 NaN 0.0 T 0.0 T 0.0 2.0 37
4 2016-01-05 1 5 -2.6 -15.2 -8.9 26.9 0.0 0.0 NaN 0.0 NaN 0.0 NaN 0.0 23.0 32
5 2016-01-06 1 6 2.4 -6.8 -2.2 20.2 0.0 0.0 NaN 0.0 NaN 0.0 NaN 0.0 23.0 32
6 2016-01-07 1 7 4.4 -5.3 -0.5 18.5 0.0 0.0 NaN 0.0 NaN 0.0 NaN NaN NaN <31
7 2016-01-08 1 8 4.1 -5.6 -0.8 18.8 0.0 2.8 NaN 0.0 NaN 2.8 NaN NaN NaN <31
8 2016-01-09 1 9 8.6 2.8 5.7 12.3 0.0 2.0 NaN 0.0 NaN 2.0 NaN 0.0 NaN <31
9 2016-01-10 1 10 6.8 -6.5 0.2 17.8 0.0 18.2 NaN 0.2 NaN 18.4 NaN 0.0 28.0 85
10 2016-01-11 1 11 -5.9 -10.5 -8.2 26.2 0.0 0.2 NaN 0.0 T 0.2 NaN 0.0 27.0 65
11 2016-01-12 1 12 -1.9 -9.7 -5.8 23.8 0.0 0.0 NaN 5.0 NaN 4.4 NaN 2.0 28.0 65
12 2016-01-13 1 13 -7.0 -10.5 -8.8 26.8 0.0 0.0 NaN 0.4 NaN 0.4 NaN 5.0 28.0 50
13 2016-01-14 1 14 -2.1 -8.5 -5.3 23.3 0.0 0.0 NaN 0.2 NaN 0.2 NaN 5.0 25.0 41
14 2016-01-15 1 15 4.2 -3.0 0.6 17.4 0.0 1.8 NaN 0.0 NaN 1.8 NaN 2.0 10.0 32
15 2016-01-16 1 16 3.5 -4.4 -0.5 18.5 0.0 0.8 NaN 0.0 T 0.8 NaN 0.0 30.0 63
16 2016-01-17 1 17 0.1 -10.6 -5.3 23.3 0.0 0.0 NaN 0.0 T 0.0 T 0.0 26.0 46
17 2016-01-18 1 18 -7.8 -11.8 -9.8 27.8 0.0 0.0 NaN 2.4 NaN 2.2 NaN 2.0 30.0 52
18 2016-01-19 1 19 -3.9 -14.5 -9.2 27.2 0.0 0.0 NaN 0.0 T 0.0 T 1.0 31.0 67
19 2016-01-20 1 20 -3.9 -8.1 -6.0 24.0 0.0 0.0 NaN 0.0 T 0.0 T 1.0 29.0 35
20 2016-01-21 1 21 -3.8 -10.0 -6.9 24.9 0.0 0.0 NaN 0.6 NaN 0.4 NaN 1.0 NaN <31
21 2016-01-22 1 22 -2.2 -12.4 -7.3 25.3 0.0 0.0 NaN 0.0 NaN 0.0 NaN 1.0 NaN <31
22 2016-01-23 1 23 -3.7 -11.2 -7.5 25.5 0.0 0.0 NaN 0.0 NaN 0.0 NaN 0.0 36.0 32
23 2016-01-24 1 24 -0.8 -11.0 -5.9 23.9 0.0 0.0 NaN 0.0 NaN 0.0 NaN 0.0 NaN <31
24 2016-01-25 1 25 4.0 -1.3 1.4 16.6 0.0 0.0 T 0.0 NaN 0.0 T 0.0 20.0 35
25 2016-01-26 1 26 6.6 0.4 3.5 14.5 0.0 0.2 NaN 0.0 T 0.2 NaN 0.0 24.0 70
26 2016-01-27 1 27 1.0 -3.2 -1.1 19.1 0.0 0.0 NaN 0.2 NaN 0.2 NaN 0.0 22.0 57
27 2016-01-28 1 28 1.6 -1.8 -0.1 18.1 0.0 0.0 NaN 0.8 NaN 1.0 NaN 0.0 21.0 54
28 2016-01-29 1 29 -0.3 -10.7 -5.5 23.5 0.0 0.0 NaN 0.0 T 0.0 T 1.0 28.0 50
29 2016-01-30 1 30 5.9 -8.4 -1.3 19.3 0.0 0.0 NaN 0.0 NaN 0.0 NaN NaN 24.0 46
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
336 2016-12-02 12 2 5.7 1.9 3.8 14.2 0.0 0.0 T 0.0 NaN 0.0 T NaN 26.0 61
337 2016-12-03 12 3 6.0 0.1 3.1 14.9 0.0 0.0 NaN 0.0 NaN 0.0 NaN NaN 30.0 52
338 2016-12-04 12 4 2.8 0.3 1.6 16.4 0.0 NaN M 1.2 NaN 0.6 NaN 1.0 11.0 39
339 2016-12-05 12 5 5.8 0.3 3.1 14.9 0.0 0.0 NaN 0.1 NaN 3.0 NaN 0.0 26.0 54
340 2016-12-06 12 6 5.4 -1.1 2.2 15.8 0.0 4.5 NaN 0.1 NaN 4.6 NaN NaN 11.0 48
341 2016-12-07 12 7 3.3 -1.1 1.1 16.9 0.0 0.0 T 0.0 NaN 0.0 T NaN 26.0 50
342 2016-12-08 12 8 0.0 -4.4 -2.2 20.2 0.0 0.0 NaN 0.8 NaN 0.6 NaN 0.0 25.0 59
343 2016-12-09 12 9 -0.9 -7.2 -4.1 22.1 0.0 0.0 T 0.0 NaN 0.0 T 0.0 28.0 48
344 2016-12-10 12 10 -1.4 -7.1 -4.3 22.3 0.0 0.0 NaN 0.2 NaN 0.2 NaN 0.0 27.0 54
345 2016-12-11 12 11 -0.1 -7.2 -3.7 21.7 0.0 0.0 NaN 13.8 NaN 14.6 NaN 0.0 14.0 35
346 2016-12-12 12 12 2.4 -2.7 -0.2 18.2 0.0 0.0 NaN 2.8 NaN 3.4 NaN 15.0 28.0 63
347 2016-12-13 12 13 -1.3 -6.6 -4.0 22.0 0.0 0.0 T 0.0 NaN 0.0 T 8.0 27.0 54
348 2016-12-14 12 14 -5.1 -10.6 -7.9 25.9 0.0 0.0 NaN 0.4 NaN 0.4 NaN 8.0 25.0 67
349 2016-12-15 12 15 -7.9 -12.9 -10.4 28.4 0.0 0.0 NaN 4.0 NaN 4.0 NaN 8.0 27.0 82
350 2016-12-16 12 16 -5.7 -11.3 -8.5 26.5 0.0 0.0 NaN 3.0 NaN 3.0 NaN 10.0 26.0 43
351 2016-12-17 12 17 -0.3 -6.3 -3.3 21.3 0.0 1.6 NaN 6.0 NaN 7.4 NaN 18.0 16.0 32
352 2016-12-18 12 18 -1.1 -13.8 -7.5 25.5 0.0 0.6 NaN 0.2 NaN 0.8 NaN 19.0 30.0 59
353 2016-12-19 12 19 -7.6 -12.9 -10.3 28.3 0.0 0.0 NaN 0.0 NaN 0.0 NaN 19.0 22.0 33
354 2016-12-20 12 20 0.3 -10.1 -4.9 22.9 0.0 0.0 NaN 0.0 NaN 0.0 NaN 19.0 22.0 67
355 2016-12-21 12 21 -0.6 -8.1 -4.4 22.4 0.0 0.0 NaN 1.4 NaN 1.4 NaN 19.0 22.0 32
356 2016-12-22 12 22 3.3 -0.7 1.3 16.7 0.0 0.0 T 0.0 NaN 0.0 T 19.0 31.0 43
357 2016-12-23 12 23 2.6 -2.6 0.0 18.0 0.0 0.0 NaN 0.0 NaN 0.0 NaN 17.0 23.0 43
358 2016-12-24 12 24 3.3 0.9 2.1 15.9 0.0 5.3 NaN 0.1 NaN 5.4 NaN 12.0 28.0 37
359 2016-12-25 12 25 2.6 -2.9 -0.2 18.2 0.0 0.0 NaN 0.0 NaN 0.0 NaN 8.0 11.0 43
360 2016-12-26 12 26 9.7 -3.6 3.1 14.9 0.0 19.8 NaN 0.0 NaN 19.8 NaN 7.0 25.0 63
361 2016-12-27 12 27 6.7 -3.1 1.8 16.2 0.0 0.0 T 0.0 NaN 0.0 T 1.0 26.0 70
362 2016-12-28 12 28 0.6 -3.7 -1.6 19.6 0.0 0.0 T 0.0 NaN 0.0 T 1.0 26.0 43
363 2016-12-29 12 29 1.9 -1.6 0.2 17.8 0.0 0.0 NaN 6.0 NaN 6.0 NaN 5.0 28.0 44
364 2016-12-30 12 30 -0.1 -3.6 -1.9 19.9 0.0 0.0 NaN 0.4 NaN 0.4 NaN 5.0 29.0 46
365 2016-12-31 12 31 3.7 -3.5 0.1 17.9 0.0 0.8 NaN 1.0 NaN 1.8 NaN 4.0 23.0 56

366 rows × 17 columns

Code from past tutorial, take from it as needed

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])

there are 784 pixels in each image; note that the image matrix has been flattened into a vector.

'None' simply means that the number of rows in this 2d array can be anything

W = tf.Variable(tf.zeros([784, 10]))

For each of the ten digits, W contains the weights that we expect for each pixel

b = tf.Variable(tf.zeros([10]))

The bias term required to build the softmax model

y = tf.nn.softmax(tf.matmul(x, W) + b)

this is a [None,10] matrix, which gives the probability our model assigns to each digit

y_ = tf.placeholder(tf.float32, [None, 10])

this is the actual digit shown, used for training

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

calculating the cross-entropy of the model. The reduce_mean and reduce_sum functions do NOT perform optimizations; they simply reduce the dimensionality of the input according to certain rules.

More about these functions can be read here:

https://www.tensorflow.org/api_docs/python/math_ops/reduction#reduce_sum
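
As a tiny concrete example (made-up values): summing along reduction_indices=[1] collapses each row to a single number, which is exactly how the per-example cross-entropy terms are combined above.

row_sums = tf.reduce_sum([[1., 2.], [3., 4.]], reduction_indices=[1])
#sess.run(row_sums) -> [3., 7.]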

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

The actual training is defined here. The algorithm being used is gradient descent, although this may be changed depending on user preference/the task at hand.

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

this section initializes the environment, and is required to run the previously defined operations

for i in range(100000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

Here we're running the training over 100,000 iterations, feeding a batch of 100 examples per step

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

these last few lines are simply to test the accuracy of our model


In [15]:
tf.reduce_sum(x)


Out[15]:
<tf.Tensor 'Sum_3:0' shape=() dtype=int32>


