In [1]:
%pylab inline
import research as r
import scipy
import pandas

def fitness_plots(path, treatment):
    # load every fitness.dat from this treatment's replicates
    D = r.load_files(r.find_files(path, treatment+"_.*/fitness.dat"))
    # per-trial fitness trajectories
    figure()
    for i, g in D.groupby('trial'):
        plot(g['update'], g['max_fitness'])
    ylabel('Fitness')
    xlabel('Update')
    # mean fitness across trials, with confidence intervals
    figure()
    r.quick_ciplot('update', 'max_fitness', D)
    ylabel('Fitness')
    xlabel('Update')
    # savefig("centroid.pdf")
    # report the most-fit individual from the final update
    final = D[D['update']==D['update'].max()]
    print "\nDominant fitness:"
    print final.ix[final['max_fitness'].idxmax()]
    return D
002-joint attempts to discover the parameters describing the joint of two known normal distributions, where the genome holds exactly 4 reals (2 per distribution). 003-varjoint does the same, but the genome is allowed to vary in size. In both cases, fitness is again based on the K-S test statistic. The joint distributions themselves were constructed via rejection sampling.
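For reference, here is a minimal sketch of rejection sampling from such a target; the component parameters, the uniform proposal over $[0.0, 30.0)$, and the density bound m are illustrative assumptions, not the actual values used in 002-joint.

In [ ]:
import numpy as np
from scipy import stats

def rejection_sample(target_pdf, n, lo, hi, m):
    # draw n samples from target_pdf on [lo, hi); m must upper-bound
    # target_pdf there, or the accepted sample is biased
    out = []
    while len(out) < n:
        x = np.random.uniform(lo, hi)
        if np.random.uniform(0.0, m) < target_pdf(x):
            out.append(x)
    return np.array(out)

# hypothetical target built from two known normals, equally weighted
p1, p2 = stats.norm(10.0, 1.0), stats.norm(20.0, 2.0)
target_pdf = lambda x: 0.5*p1.pdf(x) + 0.5*p2.pdf(x)
samples = rejection_sample(target_pdf, 1000, 0.0, 30.0, 0.25)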
Results: both work. The dominant individual for 002-joint was discovered at 415u/26g in replicate ta0_30, while for 003-varjoint the dominant was discovered at 473u/33g. In 003-varjoint, the dominant genome held exactly 4 parameters, although all ancestors began with genomes of length 2. This suggests that we'd be able to discover the decomposition of arbitrarily complex joint probability distributions.
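One plausible decoding for 003-varjoint's variable-length genomes reads consecutive reals as $(\mu, \sigma)$ pairs of equally weighted normal components, so a length-4 genome describes two components; this is a hedged guess at the representation, not 003-varjoint's actual implementation.

In [ ]:
import numpy as np

def decode(genome):
    # assumed representation: consecutive (mu, sigma) pairs
    return [(genome[i], genome[i+1]) for i in range(0, len(genome)-1, 2)]

def sample(genome, n):
    # draw n values, choosing an equally weighted component for each
    comps = decode(genome)
    idx = np.random.randint(len(comps), size=n)
    return np.array([np.random.normal(*comps[i]) for i in idx])

samples = sample([10.0, 1.0, 20.0, 2.0], 1000)   # two hypothetical components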
In [4]:
D = fitness_plots('../var/002-joint','ta0')
In [5]:
D = fitness_plots('../var/003-varjoint','ta0')
... where we try to match the parameters of a known normal distribution with a genetic algorithm. Fitness is $1.0/D_{n,n'}$, where $D_{n,n'}$ is the Kolmogorov-Smirnov test statistic. Smaller values of this statistic indicate less distance between the empirical distribution functions (EDFs) of the known normal distribution and of the distribution parameterized by the GA.
For this test, the known distribution has $\mu=10.0$ and $\sigma=1.0$. Each genome holds two evolvable real values, one each for $\mu$ and $\sigma$. During mutation, these values are redrawn uniformly at random from $[0.0, 30.0)$.
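A minimal sketch of this fitness function and mutation scheme; the sample sizes, the per-site mutation rate, and the use of scipy.stats.ks_2samp are assumptions layered on the description above, not the notebook's actual GA code.

In [ ]:
import numpy as np
from scipy import stats

target = np.random.normal(10.0, 1.0, 1000)     # known distribution: mu=10, sigma=1

def fitness(genome, n=1000):
    # genome holds two evolvable reals: mu and sigma
    mu, sigma = genome
    candidate = np.random.normal(mu, sigma, n)
    d = stats.ks_2samp(target, candidate)[0]   # D_{n,n'}: distance between EDFs
    return 1.0 / d                             # fitness = 1.0 / D_{n,n'}

def mutate(genome, rate=0.1):
    # per-site redraw, uniform over [0.0, 30.0), per the description above
    return [np.random.uniform(0.0, 30.0) if np.random.random() < rate else v
            for v in genome]

fitness([9.99492, 0.997972])   # the dominant genome scores a large fitness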
The dominant individual, discovered at 271u/16g in trial ta0_9, has genome $[9.99492, 0.997972]$.
In [2]:
D = fitness_plots('../var/001-normal','ta0')
In [3]:
# for convenience, here is inverted fitness, which is exactly D_{n,n'}:
D['dnn'] = 1.0/D['max_fitness']
figure()
r.quick_ciplot('update','dnn', D)
ylabel("$D_{n,n'}$")
xlabel('Update')