Learning constants in a symbolic regression task (deprecated)

One of the long-standing "skeletons in the closet" of genetic programming techniques is the constant-finding problem. It is widely acknowledged that the ephemeral random constant approach, the de facto standard solution to this problem, is far from satisfactory.

Using dCGP, we are here able to learn constants as well as expressions during evolution, thanks to the hybridization of the evolutionary strategy with a second-order gradient (Newton) step that learns the exact value of the ephemeral constant, thus avoiding having to build such a value by applying kernel functions to random constants.

NOTE: since v1.4, symbolic regression is performed via dedicated classes rather than by manipulating the dcgpy.expression directly.

Let's first import dcgpy and pyaudi and set things up so that our CGP is computed using the type "gdual", thus getting all derivatives for free.


In [1]:
from dcgpy import expression_gdual_vdouble as expression
from dcgpy import kernel_set_gdual_vdouble as kernel_set
from pyaudi import gdual_vdouble as gdual
import pyaudi
from matplotlib import pyplot as plt
import numpy as np
from random import randint
%matplotlib inline
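
Before building any dCGP program, here is a minimal sketch (not part of the original runs) of what a gdual buys us: arithmetic on a gdual of order 2 automatically carries the first and second derivatives with respect to its symbol.


In [ ]:
# Sketch: derivatives come for free when computing with an order-2 gdual.
c = gdual([2.], "c", 2)              # the value 2., tagged with the symbol "c"
f = c*c*c - c                        # f(c) = c^3 - c
print(f.constant_cf)                 # value of f at c=2, i.e. 6
print(f.get_derivative({"dc": 1}))   # df/dc at c=2, i.e. 3*4 - 1 = 11
print(f.get_derivative({"dc": 2}))   # d2f/dc2 at c=2, i.e. 6*2 = 12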

The kernel functions


In [2]:
# note the use of the protected division "pdiv" (not necessary here)
# note the call operator (returns the list of kernels)
kernels = kernel_set(["sum", "mul", "diff","pdiv"])()
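
Each kernel in the returned list is itself callable. As a small check (assuming the dcgpy kernel call operator, which accepts a list of inputs, either numerical or symbolic), a kernel can be evaluated directly:


In [ ]:
# Assumption: dcgpy kernels expose a call operator taking a list of inputs.
# kernels[1] is the "mul" kernel in the order used above.
print(kernels[1](["x", "y"]))   # expected to return the symbolic form "(x*y)"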

The ES-(1+$\lambda$) algorithm


In [41]:
def run_experiment(dCGP, offsprings, max_gen, x, yt, screen_output):
    # The offsprings chromosome, fitness and constant
    chromosome = [1] * offsprings
    fitness = [1] *offsprings
    constant = [1]*offsprings
    # Init the best as the initial random dCGP
    best_chromosome = dCGP.get()
    best_constant = 1.
    fit, _ = err2(dCGP, x, yt, best_constant)
    best_fitness = sum(fit.constant_cf)
    # Main loop over generations
    for g in range(max_gen):
        for i in range(offsprings):
            dCGP.set(best_chromosome)
            dCGP.mutate_active(i+1)
            fit, constant[i] = err2(dCGP, x, yt, best_constant)
            fitness[i] = sum(fit.constant_cf )
            chromosome[i] = dCGP.get()
        for i in range(offsprings):
            if fitness[i] <= best_fitness:
                if (fitness[i] != best_fitness):
                    best_chromosome = chromosome[i]
                    best_fitness = fitness[i]
                    best_constant = constant[i]
                    dCGP.set(best_chromosome)
                    if screen_output:
                        print("New best found: gen: ", g, " value: ", fitness[i],  dCGP.simplify(["x","c"]), "c =", best_constant)

        if best_fitness < 1e-14:
            break
    return g, best_chromosome, best_constant

The test problems

As target functions, we define four different problems of increasing complexity:

P1: $x^5 - \pi x^3 + x$

P2: $x^5 - \pi x^3 + \frac{2\pi}x$

P3: $\frac{e x^5 + x^3}{x + 1}$

P4: $\sin(\pi x) + \frac 1x$

Note how $\pi$ and $e$ are present in the expressions.


In [45]:
# The following functions create the target values for a gridded input x for different test problems
def data_P1(x):
    return x**5 - np.pi*x**3 + x
def data_P2(x):
    return x**5 - np.pi*x**3 + 2*np.pi / x
def data_P3(x):
    return (np.e*x**5 + x**3)/(x + 1)
def data_P4(x):
    return pyaudi.sin(np.pi * x) + 1./x
# (data_P5 is defined for completeness but is not used below)
def data_P5(x):
    return np.e * x**5 - np.pi*x**3 + np.sqrt(2) * x
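
Since matplotlib is imported above but not used elsewhere in this notebook, an optional cell such as the following (a sketch, not part of the original runs) can be used to eyeball the target functions. Note that data_P4 calls pyaudi.sin and thus expects gdual inputs, so P4 is recomputed here directly with numpy.


In [ ]:
# Optional visualization of the targets on a positive grid
# (x > 0 avoids the singularities of P2 and P4 at x = 0).
grid = np.linspace(0.1, 3, 200)
plt.plot(grid, data_P1(grid), label="P1")
plt.plot(grid, data_P2(grid), label="P2")
plt.plot(grid, data_P3(grid), label="P3")
plt.plot(grid, np.sin(np.pi * grid) + 1. / grid, label="P4")
plt.legend()
plt.show()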

The error functions


In [5]:
# This is the quadratic error of a dCGP expression when the constant value is cin. The error is computed
# over the input points xin (a gdual of order 0, as we are not interested in expanding the program w.r.t. the inputs).
# The target values are contained in yt (also of order 0, for the same reason).
def err(dCGP, xin, yt, cin):
    c = gdual([cin], "c", 2)
    y = dCGP([xin,c])[0]
    return (y-yt)**2

# This is the quadratic error of the expression when the constant value is learned using a one-step,
# second-order (Newton) method.
def err2(dCGP, xin, yt,cin):
    c = gdual([cin], "c", 2)
    y = dCGP([xin,c])[0]
    error = err(dCGP,xin,yt,cin)
    dc =  sum(error.get_derivative({"dc":1}))
    dc2 = sum(error.get_derivative({"dc":2}))
    if dc2 != 0:
        learned_constant = c - dc/dc2
        y = dCGP([xin, learned_constant])[0]
    else:
        learned_constant = c
    return (y-yt)**2, learned_constant.constant_cf[0]
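
When the evolved expression is linear in c, its quadratic error is exactly quadratic in c, so the single Newton step c - (dE/dc)/(d2E/dc2) performed in err2 lands on the optimal constant in one shot; in the general case it is just one iteration, warm-started at every evaluation from the current best constant. A minimal standalone sketch (hypothetical model and data, not one of the test problems below):


In [ ]:
# Sketch: fit the constant of the model y = c*x**3 to the target pi*x**3
# with a single Newton step, mirroring what err2 does inside the dCGP evaluation.
xs = gdual(np.linspace(1, 3, 10))            # order-0 gdual: no expansion w.r.t. x
target = np.pi * xs**3
c0 = gdual([1.], "c", 2)                     # expand to order 2 w.r.t. the constant
error = (c0 * xs**3 - target) ** 2
dc  = sum(error.get_derivative({"dc": 1}))   # dE/dc, summed over the grid points
dc2 = sum(error.get_derivative({"dc": 2}))   # d2E/dc2, summed over the grid points
print(1. - dc / dc2)                         # Newton update from c0 = 1: recovers pi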

Problem P1: $x^5 - \pi x^3 + x$


In [17]:
x = np.linspace(1,3,10)
x = gdual(x)
yt = data_P1(x)

In [18]:
# We run nexp experiments and accumulate statistics for the ERT
nexp = 100
offsprings = 4
max_gen=1000
res = []
kernels = kernel_set(["sum", "mul", "diff","pdiv"])() 
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = randint(0,1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings,max_gen,x,yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen-1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)


restart: 	 gen: 	 expression:
5 		 163 	 ['((((x*x)-c)/(x/((x*x)*(x*x))))+x)']  a.k.a  [-c*x**3 + x**5 + x] c =  3.141592653589793
8 		 452 	 ['(x+((((((x/c)*(c*x))-(c+c))*((x/c)*(c*x)))-((x/c)*(c*x)))*x))']  a.k.a  [-2*c*x**3 + x**5 - x**3 + x] c =  1.0707963267948972
12 		 309 	 ['(((x*x)*(((c-(x*x))*(x*x))/(((c-(x*x))*(x*x))*x)))-(((c-(x*x))*(x*x))*x))']  a.k.a  [-c*x**3 + x**5 + x] c =  3.141592653589793
15 		 764 	 ['(((x/c)*c)+((x*(x*x))*((x*x)-c)))']  a.k.a  [-c*x**3 + x**5 + x] c =  3.1415926535897927
18 		 193 	 ['(((x*(x*x))*((x*x)+c))-(c-(c+x)))']  a.k.a  [c*x**3 + x**5 + x] c =  -3.1415926535897927
36 		 334 	 ['(x+((x*x)*(((((x*x)-c)+x)*x)-(x*x))))']  a.k.a  [-c*x**3 + x**5 + x] c =  3.1415926535897927
54 		 826 	 ['((((x*x)*x)*((c+(x*x))+c))+x)']  a.k.a  [2*c*x**3 + x**5 + x] c =  -1.5707963267948968
57 		 154 	 ['(((x+((x*x)+c))-((x*x)+c))+(((x*x)+c)*((x*x)*x)))']  a.k.a  [c*x**3 + x**5 + x] c =  -3.1415926535897927
58 		 312 	 ['((((x*x)+c)*(x*(x*x)))-((x*(x*x))+((x*(x*x))-x)))']  a.k.a  [c*x**3 + x**5 - 2*x**3 + x] c =  -1.1415926535897931
66 		 378 	 ['(((((x*x)-c)*(x*x))*x)+x)']  a.k.a  [-c*x**3 + x**5 + x] c =  3.1415926535897927
72 		 68 	 ['(((c/x)/((c/x)/x))-((((x*c)*x)*x)-((((x*c)*x)*x)/((c/x)/x))))']  a.k.a  [-c*x**3 + x**5 + x] c =  3.141592653589794
77 		 67 	 ['(x+((x*x)*(x*((x*x)+c))))']  a.k.a  [c*x**3 + x**5 + x] c =  -3.1415926535897927
81 		 891 	 ['(((x*(x/(x/x)))+(x*((x*(x/(x/x)))*((x*(x/(x/x)))+(c+(x/x))))))-((x*(x/(x/x)))-x))']  a.k.a  [c*x**3 + x**5 + x**3 + x] c =  -4.141592653589793
83 		 815 	 ['(((x*x)*(((x*x)-c)*x))+((x*c)/c))']  a.k.a  [-c*x**3 + x**5 + x] c =  3.1415926535897927
97 		 210 	 ['((((x*x)/c)+((x*x)*(x*x)))*(x+(c/x)))']  a.k.a  [c*x**3 + x**5 + x + x**3/c] c =  -2.782160919529689

In [8]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)


ERT Expected run time = avg. number of dCGP evaluations needed:  19902.4444444
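
In other words, the expected run time is estimated here as $ERT \approx \lambda \cdot \frac{\sum_i g_i}{n_{succ}}$, where $g_i$ is the number of generations consumed by the $i$-th restart (successful or not), $\lambda$ is the number of offspring evaluated per generation and $n_{succ}$ is the number of restarts that reached an error below $10^{-14}$ before hitting max_gen.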

Problem P2: $x^5 - \pi x^3 + \frac{2\pi}{x}$


In [9]:
x = np.linspace(0.1,5,10) # we include points close to zero here to favour learning of 1/x
x = gdual(x)
yt = data_P2(x)

In [10]:
# We run nexp experiments and accumulate statistics for the ERT
nexp = 100
offsprings = 4
max_gen=5000
res = []
kernels = kernel_set(["sum", "mul", "diff","pdiv"])() 
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = randint(0,1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings,max_gen,x,yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen-1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)


restart: 	 gen: 	 expression:
2 		 4663 	 ['(((x-((((x*x)-c)*x)/(x*x)))+((((x*x)-c)*x)*(x*x)))+(x-((((x*x)-c)*x)/(x*x))))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.1415926535897913
6 		 1520 	 ['((x*((x*x)*((x*x)+c)))+((((x*x)-c)-((x*x)+c))/x))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897927
15 		 2945 	 ['(((x*((c*x)/c))*((x*(x*((c*x)/c)))+(c*x)))-((c+c)*(c/(c*x))))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897927
18 		 1937 	 ['((((x*x)*x)*(c+(x*x)))-(((x/(x*x))*c)+((x/(x*x))*c)))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.141592653589793
20 		 3489 	 ['(x*((c/(x*x))-(((x*x)*(c-(x*x)))-(c/(x*x)))))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.1415926535897905
21 		 3427 	 ['((((c+(x*x))*((x*x)*x))+x)-(((c+(x*x))*(x/(x*x)))-(x-((c+(x*x))*(x/(x*x))))))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897944
36 		 1999 	 ['((((x*x)*(x*c))*((x*x)/c))-(((x*x)*(x*c))-((c+c)/x)))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.141592653589792
44 		 3911 	 ['((((c+(x*x))*(x*x))-((c/(x*x))+(c/(x*x))))*x)']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.141592653589789
52 		 4993 	 ['((((x-x)/(c+c))-((x*(x*x))*(c-(x*x))))+((c+c)/x))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.141592653589792
56 		 3723 	 ['(((x-((c+(x*x))/x))+(x-((c+(x*x))/x)))+((c+(x*x))*(x*(x*x))))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897913
68 		 3551 	 ['(((((x*x)-c)-(c+(x*x)))/x)+((c+(x*x))*((x*x)*x)))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897936
69 		 1716 	 ['(((((x*x)-c)*(x*x))*x)+((c+c)/x))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.141592653589794
75 		 315 	 ['(((((x*x)*((x*x)+c))/(c/x))*((c/x)*x))-((c/x)+(c/x)))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897922
76 		 3441 	 ['((((x*x)*((x*x)+c))*x)-((c/x)+(c/x)))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.141592653589797
79 		 1366 	 ['(((c/x)+(c/x))+((((x*x)-c)*x)/((c/x)/((x*x)*(c/x)))))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.1415926535897927

In [11]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)


ERT Expected run time = avg. number of dCGP evaluations needed:  124776.266667

Problem P3: $\frac{e x^5 + x^3}{x + 1}$


In [12]:
x = np.linspace(-0.9,1,10)
x = gdual(x)
yt = data_P3(x)

In [13]:
# We run nexp experiments and accumulate statistics for the ERT
nexp = 100
offsprings = 4
max_gen=5000
res = []
kernels = kernel_set(["sum", "mul", "diff","pdiv"])() 
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = randint(0,1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings,max_gen,x,yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen-1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)


restart: 	 gen: 	 expression:
1 		 2610 	 ['((((x*x)-((x*x)*(c*(x*x))))*(x*x))/(x+(x*x)))']  a.k.a  [-c*x**6/(x**2 + x) + x**4/(x**2 + x)] c =  -2.7182818284590438
5 		 3618 	 ['(((x*(x*(c*x)))*(x/((c*x)+(x*(c*x)))))*(((x-c)/(x-c))+(x*(c*x))))']  a.k.a  [c**2*x**6/(c*x**2 + c*x) + c*x**4/(c*x**2 + c*x)] c =  2.7182818284590478
10 		 3956 	 ['(((x*x)*((x*x)+(((x*x)*c)*(x*x))))/((x*x)+x))']  a.k.a  [c*x**6/(x**2 + x) + x**4/(x**2 + x)] c =  2.7182818284590455
14 		 4942 	 ['((((((c*x)*x)/c)*(((c/c)+((c*x)*x))*((c*x)*x)))*(c/c))/((c*x)+((c*x)*x)))']  a.k.a  [c**2*x**6/(c*x**2 + c*x) + c*x**4/(c*x**2 + c*x)] c =  2.718281828459043
27 		 4107 	 ['((((c*(x*x))*(x*x))+(x*x))*(((c*(x*x))*(x*x))/((x*(c*(x*x)))+((x*(c*(x*x)))*x))))']  a.k.a  [c**2*x**8/(c*x**4 + c*x**3) + c*x**6/(c*x**4 + c*x**3)] c =  2.7182818284590287
77 		 1921 	 ['((x*((x*x)+(x*(((x*x)*c)*x))))/(((x*((x*x)+(x*(((x*x)*c)*x))))/(x*((x*x)+(x*(((x*x)*c)*x)))))+x))']  a.k.a  [c*x**5/(x + 1) + x**3/(x + 1)] c =  2.7182818284590455
78 		 3011 	 ['(((x-((x*(c*x))*x))/((x*(c*x))+((x*(c*x))/x)))*((x*(c*x))*x))']  a.k.a  [-c**2*x**6/(c*x**2 + c*x) + c*x**4/(c*x**2 + c*x)] c =  -2.7182818284590495

In [14]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)


ERT Expected run time = avg. number of dCGP evaluations needed:  279469.714286

Problem P4: $\sin(\pi x) + \frac 1x$


In [42]:
x = np.linspace(-1,1,10)
x = gdual(x)
yt = data_P4(x)

In [43]:
# We run nexp experiments and accumulate statistics for the ERT
nexp = 100
offsprings = 4
max_gen=5000
res = []
kernels = kernel_set(["sum", "mul", "diff","pdiv","sin"])() 
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = randint(0,1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings,max_gen,x,yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen-1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)


restart: 	 gen: 	 expression:
5 		 2914 	 ['(sin((x/c))+(((x-(x/c))/((x*(x-(x/c)))/c))/(x/(x/c))))']  a.k.a  [sin(x/c) + 1/x] c =  0.3183098859546274
12 		 808 	 ['((c/((x*c)+((x*c)-(x*c))))+sin((x+((x*c)+((x*c)-(x*c))))))']  a.k.a  [sin(c*x + x) + 1/x] c =  2.1415926536812546
17 		 1434 	 ['(sin(((x*c)+x))+(c/(x*c)))']  a.k.a  [sin(c*x + x) + 1/x] c =  2.1415926536986984
28 		 1079 	 ['((c/(x*c))+sin((x+((x*c)+(x*c)))))']  a.k.a  [sin(2*c*x + x) + 1/x] c =  1.0707963267948972
29 		 2113 	 ['(sin((((x/c)/c)+(x+x)))+((x/x)/x))']  a.k.a  [sin(2*x + x/c**2) + 1/x] c =  -0.935932260872573
35 		 3739 	 ['(((c/c)/x)+sin(((c-(c-x))+((x*(c+(c/c)))+x))))']  a.k.a  [sin(c*x + 3*x) + 1/x] c =  0.14159265368124904
36 		 127 	 ['((c/(x*c))-(sin(((x*c)-c))*((c/(x*c))/(c/(x*c)))))']  a.k.a  [-sin(c*x - c) + 1/x] c =  3.1415926536189205
39 		 805 	 ['(sin(((x*c)+(x*c)))+(c/(x*c)))']  a.k.a  [sin(2*c*x) + 1/x] c =  1.5707963267952043
46 		 4412 	 ['(sin(((x/c)+x))+((sin(((x/c)+x))+x)/((sin(((x/c)+x))+x)*x)))']  a.k.a  [sin(x + x/c) + 1/x] c =  0.4669422068457747
53 		 2197 	 ['((((c/x)*(c/x))*(((c/(c/x))/c)/c))+sin((((c/(c/x))/c)/c)))']  a.k.a  [sin(x/c**2) + 1/x] c =  0.5641895835477555
54 		 4082 	 ['((((c+(sin((x*c))*sin((x*c))))/(sin((x*c))*sin((x*c))))-c)/((((c+(sin((x*c))*sin((x*c))))/(sin((x*c))*sin((x*c))))-c)/((c/(x*c))+sin((x*c)))))']  a.k.a  [sin(c*x) + 1/x] c =  3.1415926535902257
59 		 601 	 ['(sin((x*c))+(c/(x*c)))']  a.k.a  [sin(c*x) + 1/x] c =  3.1415926537675927
64 		 4031 	 ['(sin((x+(x*c)))+(c/(x*c)))']  a.k.a  [sin(c*x + x) + 1/x] c =  2.1415926536986984
66 		 1069 	 ['((x/(x*x))+sin(((x+x)+(x/sin(c)))))']  a.k.a  [sin(2*x + x/sin(c)) + 1/x] c =  2.0741512360814633
73 		 978 	 ['((c/(x*c))+sin((x*c)))']  a.k.a  [sin(c*x) + 1/x] c =  3.141592653745218
74 		 2798 	 ['(sin((c*x))+(((c*x)/(c*x))/x))']  a.k.a  [sin(c*x) + 1/x] c =  3.141592653590368
81 		 2771 	 ['(sin((x+(x*c)))+(c/(x*c)))']  a.k.a  [sin(c*x + x) + 1/x] c =  2.1415926536986984
83 		 2865 	 ['(sin(((x-(x*c))-(x*c)))+(c/(x*c)))']  a.k.a  [-sin(2*c*x - x) + 1/x] c =  -1.0707963269360328
90 		 3688 	 ['(sin(((c*(x*c))-x))+(c/(x*c)))']  a.k.a  [sin(c**2*x - x) + 1/x] c =  2.035090330583488

In [44]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)


ERT Expected run time = avg. number of dCGP evaluations needed:  94195.7894737

In [ ]: