Sketchbook to experiment with RNN models of the dynamics. Once tested, the code in this notebook will be incorporated into functions/classes/Python modules.
We model the dynamics of the system, p(next observation | history of observations, action).
The history of observations consists of the exercises a student has done and whether the student solved each of them.
The action is the next exercise chosen.
The next observation is whether the student gets the chosen exercise correct.
We want to use an RNN to model the dynamics. The input data represents the history of observations and has shape (n_students, n_timesteps, observation_vec_size).
The output represents the probability of getting the next exercise correct and has shape (n_students, n_timesteps, n_exercises).
So at each timestep, we make a prediction for all actions.
For each action, the output vector specifies the predicted probability of the student getting that exercise correct.
The target output contains only binary values.
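To make these shapes concrete, here is a minimal sketch of such a dynamics RNN in plain TensorFlow 1.x ops (sonnet and tflearn are imported below but are not needed for the illustration). The hidden size, placeholder names, and the choice of a basic LSTM cell are assumptions for this sketch, not the final module code.
import tensorflow as tf

n_timesteps = 100   # assumed sequence length
n_inputdim = 20     # observation_vec_size
n_hidden = 32       # assumed RNN state size
n_outputdim = 10    # n_exercises

# history of observations: (n_students, n_timesteps, observation_vec_size)
inputs = tf.placeholder(tf.float32, [None, n_timesteps, n_inputdim])

cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
# rnn_out: (n_students, n_timesteps, n_hidden)
rnn_out, _ = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)

# per-timestep probability of answering each exercise correctly:
# (n_students, n_timesteps, n_exercises)
logits = tf.layers.dense(rnn_out, n_outputdim)
probs = tf.sigmoid(logits)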
In [75]:
import sys
print(sys.executable)
%load_ext autoreload
%autoreload 2
%reload_ext autoreload
In [9]:
import sonnet as snt
import tensorflow as tf
import tflearn
import numpy as np
In [10]:
import dataset_utils
In [11]:
data = dataset_utils.load_data(filename="../synthetic_data/toy.pickle")
In [12]:
print ("number of students: {}".format(len(data)))
In [13]:
print ("sequence length for each student: {}".format(len(data[0])))
In [14]:
student_sample = data[0]
t = 25
student_at_t = student_sample[t]
In [15]:
exer, perf, knowl = student_at_t
In [16]:
print ("Exercise Concept: {} \nPerformance (1 means solved exercise): {} \nKnowledge (which concepts student knows): {}".format(np.argmax(exer), perf, knowl))
Input data shape: (n_students, n_timesteps, observation_vec_size). The next three cells try different encodings of a single observation (exercise one-hot plus performance) into one vector: concatenate, flip, and extend.
In [17]:
# concatenate
observ_concat = np.append(exer, perf)
print(observ_concat)
In [18]:
# flip
observ_flip = exer * (2*perf-1)
print(observ_flip)
In [19]:
# extend
observ_extend = np.zeros(2*len(exer))
if perf == 1:
    observ_extend[:len(exer)] = exer
else:
    observ_extend[len(exer):] = exer
In [20]:
print(observ_extend)
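The "extend" encoding is the one used in the preprocessing loop further down, so a small helper could wrap it. This is a hypothetical function for illustration; it is not part of dataset_utils.
import numpy as np

def encode_observation(exer, perf):
    # "extend" encoding: the exercise one-hot goes in the first half of the
    # vector if the student solved it, in the second half otherwise
    observ = np.zeros(2 * len(exer))
    if perf == 1:
        observ[:len(exer)] = exer
    else:
        observ[len(exer):] = exer
    return observ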
Note that the output of the RNN at timestep t is a vector of length n_exercises, where each element is the probability that the student will get that exercise correct. Targets shape: (n_students, n_timesteps, n_exercises).
For training, we calculate the loss only over the outputs corresponding to the observed exercises, i.e. the ones the student actually did.
Therefore, we need an output mask to mask out all the exercises the student did not do. The output mask is a one-hot vector for each timestep, corresponding to the exercise the student did at time t. (A small numpy illustration of the resulting masked loss follows the example below.)
In [21]:
next_ex, next_perf, next_knowl = student_sample[t+1]
In [22]:
print(next_perf)
The number of actions corresponds to the number of exercises. Right now, each exercise practices a single concept, so the number of exercises equals the number of concepts.
In [23]:
n_concepts = 10
n_exercises = n_concepts
In [24]:
target_vec = np.zeros(n_exercises)
output_mask = np.zeros(n_exercises)
In [25]:
exercise_ix = np.argmax(next_ex) # for current data set, this works. In the future, if exercise doesn't correspond to just a single concept, we would have to use exercise IDs.
output_mask[exercise_ix] = 1
target_vec[exercise_ix] = next_perf
In [26]:
print(output_mask)
print(target_vec)
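As a quick illustration of the masked loss described above (pred is a made-up vector standing in for the model output at this timestep):
import numpy as np

pred = np.full(n_exercises, 0.5)   # hypothetical predicted probabilities
# binary cross-entropy per exercise ...
bce = -(target_vec * np.log(pred) + (1 - target_vec) * np.log(1 - pred))
# ... but only the exercise the student actually did contributes to the loss
masked_loss = np.sum(output_mask * bce)
print(masked_loss)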
In [28]:
n_students = len(data)
n_timesteps = len(data[0])
exer = data[0][0][0]
n_concepts = len(exer)
n_inputdim = 2 * n_concepts
n_exercises = n_concepts
n_outputdim = n_exercises
In [29]:
print(n_students)
In [30]:
print(n_timesteps)
In [31]:
print(n_inputdim)
In [65]:
input_data_ = np.zeros((n_students, n_timesteps, n_inputdim))
output_mask_ = np.zeros((n_students, n_timesteps, n_outputdim))
target_data_ = np.zeros((n_students, n_timesteps, n_outputdim))
In [66]:
print(input_data_.shape)
In [67]:
for i in xrange(n_students):
    for t in xrange(n_timesteps-1):
        cur_sample = data[i][t]
        next_sample = data[i][t+1]
        exer, perf, knowl = cur_sample
        next_exer, next_perf, next_knowl = next_sample
        next_exer_ix = np.argmax(next_exer)
        # "extend" encoding of the current observation
        observ = np.zeros(2*len(exer))
        if perf == 1:
            observ[:len(exer)] = exer
        else:
            observ[len(exer):] = exer
        input_data_[i,t,:] = observ[:]
        # mask and target refer to the next exercise the student attempts
        output_mask_[i,t,next_exer_ix] = 1
        target_data_[i,t,next_exer_ix] = next_perf
In [78]:
data = dataset_utils.load_data(filename="../synthetic_data/toy.pickle")
input_data_, output_mask_, target_data_ = dataset_utils.preprocess_data_for_rnn(data)
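For reference, dataset_utils.preprocess_data_for_rnn presumably packages the loop from the previous cells into one function. A rough sketch (an assumption about the implementation, not the actual dataset_utils code) would look like this:
import numpy as np

def preprocess_data_for_rnn(data):
    n_students = len(data)
    n_timesteps = len(data[0])
    n_concepts = len(data[0][0][0])
    n_inputdim = 2 * n_concepts
    n_outputdim = n_concepts   # one exercise per concept for now

    input_data = np.zeros((n_students, n_timesteps, n_inputdim))
    output_mask = np.zeros((n_students, n_timesteps, n_outputdim))
    target_data = np.zeros((n_students, n_timesteps, n_outputdim))

    for i in range(n_students):
        for t in range(n_timesteps - 1):
            exer, perf, _ = data[i][t]
            next_exer, next_perf, _ = data[i][t + 1]
            next_exer_ix = np.argmax(next_exer)
            # "extend" encoding of the current observation
            if perf == 1:
                input_data[i, t, :n_concepts] = exer
            else:
                input_data[i, t, n_concepts:] = exer
            # mask and target refer to the next exercise the student attempts
            output_mask[i, t, next_exer_ix] = 1
            target_data[i, t, next_exer_ix] = next_perf

    return input_data, output_mask, target_data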