DIGITS = 2
one example
after encoding
where # is used to pad the strings to the same length
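The padding step can be sketched as follows. This is only an illustration under stated assumptions: the vocabulary order, the helper names (`pad_right`, `encode_indices`, `encode_one_hot`), and the choice of index-encoded inputs with one-hot outputs are hypothetical, chosen so that the per-sample shapes agree with the (5,) and (3, 12) dimensions reported for train_X and train_y.

```python
import numpy as np

DIGITS = 2
PAD = "#"
# Hypothetical vocabulary: ten digits, '+', and the pad character (12 symbols).
VOCAB = "0123456789+" + PAD
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(VOCAB)}

MAX_INPUT_LEN = 2 * DIGITS + 1   # longest input, e.g. "99+99"
MAX_OUTPUT_LEN = DIGITS + 1      # longest output, e.g. "198"


def pad_right(text, length):
    """Right-pad `text` with PAD until it is `length` characters long."""
    return text + PAD * (length - len(text))


def encode_indices(text, length):
    """Map each character of the padded string to its vocabulary index."""
    return np.array([CHAR_TO_INDEX[ch] for ch in pad_right(text, length)])


def encode_one_hot(text, length):
    """One-hot encode the padded string as a (length, len(VOCAB)) array."""
    indices = encode_indices(text, length)
    one_hot = np.zeros((length, len(VOCAB)), dtype=np.float32)
    one_hot[np.arange(length), indices] = 1.0
    return one_hot


x = encode_indices("12+2", MAX_INPUT_LEN)   # encodes "12+2#"
y = encode_one_hot("14", MAX_OUTPUT_LEN)    # encodes "14#"
print(x.shape, y.shape)
```

Every sample ends up with the same length, which is what allows the whole dataset to be stacked into fixed-shape arrays.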
>>> train_X.shape, train_y.shape
((5050, 5), (5050, 3, 12))
>>> reversed_X = train_X[:, ::-1]
>>> combined_X = np.concatenate([train_X, reversed_X], axis=0)
>>> combined_y = np.concatenate([train_y, train_y], axis=0)
>>> combined_X.shape, combined_y.shape
((10100, 5), (10100, 3, 12))
Does the model trained on more data actually give worse results?
After reversing, the encoded input/output above becomes
reversed input: #2+21, output: #51
Presumably, such reversing will confuse the model, since 2+21# --> 23#.
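A small pure-Python sketch of this effect, assuming the original sample is 12+2 padded on the right to five characters (the helper names `pad_right` and `add_from_string` are hypothetical): once the pad character is ignored, the reversed string reads as a different, equally valid addition problem.

```python
PAD = "#"


def pad_right(text, length):
    """Right-pad `text` with PAD until it is `length` characters long."""
    return text + PAD * (length - len(text))


def add_from_string(expression):
    """Evaluate an 'a+b' string, ignoring any pad characters."""
    left, right = expression.replace(PAD, "").split("+")
    return int(left) + int(right)


original = pad_right("12+2", 5)   # "12+2#"
flipped = original[::-1]          # "#2+21": the padding moves to the front

print(original, "->", add_from_string(original))   # 12+2# -> 14
print(flipped, "->", add_from_string(flipped))     # #2+21 -> 23
```

The flipped characters are indistinguishable from the ordinary sample 2+21, so the same visible digits can be paired with two different targets.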
Thus, we can drop the padding term and add an end-of-string term instead.
one example
new encoding scheme
second example
Thus, the lengths of the input/output will no longer be fixed.
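A minimal sketch of such an encoding, assuming a hypothetical end-of-string symbol `$` (the notes do not name the actual symbol) and hypothetical helpers `encode` and `decode`. Each string is terminated rather than padded, so the encoded sequences have different lengths.

```python
EOS = "$"  # hypothetical end-of-string marker; not taken from the notes
VOCAB = "0123456789+" + EOS
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(VOCAB)}
INDEX_TO_CHAR = {i: ch for ch, i in CHAR_TO_INDEX.items()}


def encode(text):
    """Append EOS and map every character to its vocabulary index."""
    return [CHAR_TO_INDEX[ch] for ch in text + EOS]


def decode(indices):
    """Map indices back to characters, stopping at the first EOS."""
    chars = []
    for index in indices:
        ch = INDEX_TO_CHAR[index]
        if ch == EOS:
            break
        chars.append(ch)
    return "".join(chars)


samples = ["12+2", "7+5", "99+99"]
encoded = [encode(s) for s in samples]
lengths = [len(e) for e in encoded]
print(lengths)  # [5, 4, 6]
```

Because the sequences are no longer the same length, they cannot be stacked directly into one rectangular array; batching would need either per-length grouping or a model that consumes one sequence at a time.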
In [22]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
In [35]:
# The logger_1.csv load is commented out, so df_1 currently holds logger.csv.
#csv_file_1 = '/home/huizhu/huizhu/feynman/addition/logs/logger_1.csv'
csv_file_2 = '/home/huizhu/huizhu/feynman/addition/logs/logger.csv'
#df_1 = pd.read_csv(csv_file_1)
df_1 = pd.read_csv(csv_file_2)
plt.plot(df_1.epoch, df_1.acc, label='training data')
plt.plot(df_1.epoch, df_1.val_acc, label='validation data')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()
In [32]:
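# df_2 is not defined in the cells shown; assumed here to be the log at csv_file_2.
df_2 = pd.read_csv(csv_file_2)
plt.plot(df_2.epoch, df_2.acc, label='training data')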
plt.plot(df_2.epoch, df_2.val_acc, label='validation data')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()
In [33]:
csv_file = '/home/huizhu/huizhu/feynman/addition/logs/logger_2.csv'
plot(csv_file)
In [10]:
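def plot(csv_file):
    # Plot training and validation accuracy per epoch from a training log CSV.
    df = pd.read_csv(csv_file)
    plt.plot(df.epoch, df.acc, label='train')
    plt.plot(df.epoch, df.val_acc, label='valid')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.legend()  # without this call the labels above are never displayed
    plt.show()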
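To compare runs numerically rather than by eye, one possible sketch is shown below. It assumes each log has the epoch, acc, and val_acc columns used in the plots; the helper names `summarize_run` and `compare_runs` are hypothetical.

```python
import pandas as pd


def summarize_run(df):
    """Return the best validation accuracy and the epoch where it occurs."""
    best_row = df.loc[df["val_acc"].idxmax()]
    return {
        "best_epoch": int(best_row["epoch"]),
        "best_val_acc": float(best_row["val_acc"]),
        "train_acc_at_best": float(best_row["acc"]),
    }


def compare_runs(runs):
    """Build one summary row per run; `runs` maps a label to a log DataFrame."""
    rows = {name: summarize_run(df) for name, df in runs.items()}
    return pd.DataFrame.from_dict(rows, orient="index")
```

Passing a dict such as {'original': df_a, 'augmented': df_b}, where each value is a log loaded with pd.read_csv, yields one row per run, which makes it easier to check whether the run trained on more data really ends up with lower validation accuracy.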
In [11]:
x[:, ::-1]