线性分段平滑

构造分段函数

$\begin{equation} f(x)= \begin{cases} x &\mbox{x<50,}\\ 2x-50 &\mbox{others;} \end{cases} \end{equation}$

构造测试数据集

测试数据集加高斯均匀扰动--rd.uniform(-2,2)*rd.gauss(0,4)

构造评价函数

  1. interval:对两段测试函数的mse直接求和;
  2. continuity1:在interval基础上增加间断点连续性判断指标:$e^{\Delta Y_s(50)-\alpha}-1$,其中$e^{\Delta Y_s(50)}$ 代表拟合分段函数在50处的左右间断点差的绝对值,$\alpha $默认代表$Y$值域范围的$1\%$大小;
  3. continuity2:在interval基础上增加间断点一阶导出评价指标:$e^{\frac{\Delta \sigma-\beta}{10e}}-1$,其中$\Delta \sigma$代表左右间断点处的左右导数的$arctan$差得绝对值,$\beta$如未特殊说明全局默认为10度($\frac {\pi}{18}$);
  4. continuity3:综合$2$与$3$,构造评价函数。

实验


In [1]:
import random as rd
import cma
import math
from multiprocessing import Pool
%pylab
from __future__ import division
plt.rc('figure', figsize=(16, 9))


Using matplotlib backend: MacOSX
Populating the interactive namespace from numpy and matplotlib

In [2]:
#构造一维数据集
def linefunc(x):
	if x<50:
		return np.poly1d([1,0])(x)+rd.uniform(-2,2)*rd.gauss(0,4)
	return np.poly1d([2,-40])(x)+rd.uniform(-2,2)*rd.gauss(0,4)
E=math.exp
DIM=4
PI=math.pi
X=range(100)
Y=map(linefunc,X)
RANGE=max(Y)-min(Y)
ALPHA=0.01
BETA=PI/18.0

In [3]:
def realLineFunc(param):
	def f(x):
		if x<50:
			return np.poly1d([param[0],param[1]])(x)
		return np.poly1d([param[2],param[3]])(x)
	return f

def fittingFunc(param):
	return np.poly1d(param)

def mse(x,y,param):
	f1=fittingFunc(param[:2])
	f2=fittingFunc(param[2:4])
	s=0
	for i in range(50):
		s+=(y[i]-f1(x[i]))**2
	for i in range(50,100):
		s+=(y[i]-f2(x[i]))**2
	return math.sqrt(s/100.0)
#间断点
def evalfunc1(param):
	f1=fittingFunc(param[:2])
	f2=fittingFunc(param[2:4])
	s=mse(X,Y,param)/RANGE
	a=abs(f1(50)-f2(50))/RANGE
	return s+E(a-ALPHA)-1
#间断点一阶导数
def evalfunc2(param):
	s=mse(X,Y,param)/RANGE
	b=abs(math.atan(param[0])-math.atan(param[2]))
	return s+(E((b-BETA)/(10*E(1)))-1)
#考虑以上两种
def evalfunc3(param):
	f1=fittingFunc(param[:2])
	f2=fittingFunc(param[2:4])
	s=mse(X,Y,param)/RANGE
	a=abs(f1(50)-f2(50))/RANGE
	b=abs(math.atan(param[0])-math.atan(param[2]))
	return s+(E((b-BETA)/(10*math.e))-1)+E(a-ALPHA)-1
#不考虑其他因素
def evalfunc(param):
	s=mse(X,Y,param)
	return s/RANGE

def cmaUser(func):
    pool=Pool()
    es = cma.CMAEvolutionStrategy(DIM * [1], 0.3,{'popsize':15})
    while not es.stop() :
        solutions = es.ask()
        es.tell(solutions,pool.map(func,solutions))
    print 'eval value:%s'%es.result()[1]
    return es.result()[0]
def draw(title):
    global X
    res=cmaUser(evalfunc)
    res1=cmaUser(evalfunc1)
    res2=cmaUser(evalfunc2)
    res3=cmaUser(evalfunc3)
    plt.figure(1) 
    plt.plot(X,Y,'b.',alpha=0.6,label="measure point")
    X1=X+[49.99]
    X1.sort()
    plt.plot(X1,map(realLineFunc(res),X1),c="red",lw=2,ls="-",alpha=0.7,label="interval")
    plt.plot(X1,map(realLineFunc(res1),X1),c="blue",lw=2,ls="-",alpha=0.7,label="continuity1")
    plt.plot(X1,map(realLineFunc(res2),X1),c="green",lw=2,ls="-",alpha=0.7,label="continuity2")
    plt.plot(X1,map(realLineFunc(res3),X1),c="black",lw=2,ls="--",alpha=0.7,label="continuity3")
    plt.legend(loc='best')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.title(title)
    plt.show()
    plt.figure(2)
    plt.plot(X,Y,'b.',alpha=0.6,label="measure point")
    plt.plot(X1,map(realLineFunc([1,0,2,-40]),X1),c="cyan",lw=2,ls="-",alpha=0.7,label="real function")
    plt.plot(X1,map(realLineFunc(res3),X1),c="black",lw=2,ls="--",alpha=0.7,label="continuity3")

In [ ]:
draw('fitting of interval line')
4种策略下最优评价函数值
interval continuity1 continuity2 continuity3
0.0309909258183 0.0237448247489 0.0339220595649 0.0256780571605

最优拟合与真实曲线对比图

二次曲线分段线性平滑

构造分段函数

构造函数$f(x)=100-x^2$,在定义域$x\in [-7,7]$上均匀划分7段进行线性拟合。

构造测试数据集


In [3]:
#构造数据集
E=math.exp
DIM=14
PI=math.pi
def linefunc2(x):
    return 100-x**2++rd.uniform(-3,3)*rd.gauss(0,4)
X=np.linspace(-7,7,301)
Y=map(linefunc2,X)
RANGE=max(Y)-min(Y)
ALPHA=0.01
BETA=PI/4.0
scatter(X,Y)


Out[3]:
<matplotlib.collections.PathCollection at 0x1139ef210>

实验

划分七段用CMA线性拟合14个参数


In [9]:
def realLineFunc(param):
    def f(x):
        if x<-5:
            return np.poly1d([param[0],param[1]])(x)
        elif x<-3:
            return np.poly1d([param[2],param[3]])(x)
        elif x<-1:
            return np.poly1d([param[4],param[5]])(x)
        elif x<1:
            return np.poly1d([param[6],param[7]])(x)
        elif x<3:
            return np.poly1d([param[8],param[9]])(x)
        elif x<5:
            return np.poly1d([param[10],param[11]])(x)
        else:
            return np.poly1d([param[12],param[13]])(x)
    return f
def mse(x,y,param):
    s=0
    for j in range(0,14,2):
        for i in range(int(j/2)*10,int(j/2+1)*10,1):
            s+=(y[i]-fittingFunc(param[j:j+2])(x[i]))**2
    return math.sqrt(s/200.0) 
 #间断点
def evalfunc1(param):
    s=mse(X,Y,param)/RANGE
    a=0
    for j in range(2,14,2):
        a+=E(abs(fittingFunc(param[j-2:j])(X[(j/2)*10])-fittingFunc(param[j:j+2])(X[(j/2)*10]))/RANGE-ALPHA)-1
    return s+a
#间断点一阶导数
def evalfunc2(param):
    s=mse(X,Y,param)/RANGE
    b=0
    for j in range(2,14,2):
        b+=(E((abs(math.atan(param[j-2])-math.atan(param[j]))-BETA))-1)/(100*math.e)
    return s+b
#考虑两种
def evalfunc3(param):
    s=mse(X,Y,param)/RANGE
    a=0
    for j in range(2,14,2):
        a+=E(abs(fittingFunc(param[j-2:j])(X[(j/2)*10])-fittingFunc(param[j:j+2])(X[(j/2)*10]))/RANGE-ALPHA)-1
    b=0
    for j in range(2,14,2):
        b+=(E((abs(math.atan(param[j-2])-math.atan(param[j]))-BETA)/(100*math.e))-1)
    return s+a+b
#不考虑其他因素
def evalfunc(param):
    s=mse(X,Y,param)/RANGE
    return s

def draw(title):
    global X
    res=cmaUser(evalfunc)
    res1=cmaUser(evalfunc1)
    res2=cmaUser(evalfunc2)
    res3=cmaUser(evalfunc3)

    plt.figure(1) 
    plt.plot(X,Y,'b.',alpha=0.6,label="measure point")
    X1=np.insert(X,0,[-5.01,-3.01,-1.01,0.99,2.99,4.99])
    X1.sort()
    plt.plot(X1,map(realLineFunc(res),X1),c="red",lw=2,ls="-",alpha=0.7,label="interval")
    plt.plot(X1,map(realLineFunc(res1),X1),c="blue",lw=2,ls="-",alpha=0.7,label="continuity1")
    plt.plot(X1,map(realLineFunc(res2),X1),c="green",lw=2,ls="-",alpha=0.7,label="continuity2")
    plt.plot(X1,map(realLineFunc(res3),X1),c="black",lw=2,ls="--",alpha=0.7,label="continuity3")
    plt.legend(loc='best')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.title(title)
    plt.show()
    plt.figure(2)
    plt.plot(X,Y,'b.',alpha=0.6,label="measure point")
    plt.plot(X1,map(lambda x:100-x**2,X1),c="cyan",lw=2,ls="-",alpha=0.7,label="real function")
    plt.plot(X1,map(realLineFunc(res3),X1),c="black",lw=2,ls="--",alpha=0.7,label="continuity3")
    plt.legend(loc='best')
    plt.title("Compare best fitting curve with real curve ")
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

In [ ]:
draw('fitting of interval  quadratic curve with line')
4种策略下最优评价函数值
interval continuity1 continuity2 continuity3
0.0171697128166 -0.0416912639382 0.0146455354793 -0.0480936701006

设置BETA=PI/4.0(45度)

二次曲线分段二次平滑

利用二次曲线$y=ax^2+bx+c$进行分段拟合


In [9]:
def realLineFunc(param):
    def f(x):
        if x<-5:
            return np.poly1d([param[0],param[1],param[2]])(x)
        elif x<-3:
            return np.poly1d([param[3],param[4],param[5]])(x)
        elif x<-1:
            return np.poly1d([param[6],param[7],param[8]])(x)
        elif x<1:
            return np.poly1d([param[9],param[10],param[11]])(x)
        elif x<3:
            return np.poly1d([param[12],param[13],param[14]])(x)
        elif x<5:
            return np.poly1d([param[15],param[16],param[17]])(x)
        else:
            return np.poly1d([param[18],param[19],param[20]])(x)
    return f
def mse(x,y,param):
    s=0
    for j in range(0,21,3):
        for i in range(int(j/3)*10,int(j/3+1)*10,1):
            s+=(y[i]-fittingFunc(param[j:j+3])(x[i]))**2
    return math.sqrt(s/200.0) 
 #间断点
def evalfunc1(param):
    s=mse(X,Y,param)/RANGE
    a=0
    for j in range(3,21,3):
        a+=E(abs(fittingFunc(param[j-3:j])(X[(j/3)*10])-fittingFunc(param[j:j+3])(X[(j/3)*10]))/RANGE-ALPHA)-1
    return s+a
#间断点一阶导数
def evalfunc2(param):
    s=mse(X,Y,param)/RANGE
    b=0
    for j in range(3,21,3):
        b+=(E((abs(math.atan(2*param[j-3]+param[j-2])-math.atan(2*param[j])+param(j+1))-BETA))-1)/(10*math.e)
    return s+b
#考虑两种
def evalfunc3(param):
    s=mse(X,Y,param)/RANGE
    a=0
    for j in range(3,21,3):
        a+=E(abs(fittingFunc(param[j-3:j])(X[(j/3)*10])-fittingFunc(param[j:j+3])(X[(j/3)*10]))/RANGE-ALPHA)-1
    b=0
    for j in range(3,21,3):
        b+=(E((abs(math.atan(2*param[j-3]+param[j-2])-math.atan(2*param[j])+param(j+1))-BETA))-1)/(10*math.e)
    return s+a+b
#不考虑其他因素
def evalfunc(param):
    s=mse(X,Y,param)/RANGE
    return s

def draw(title):
    global X

    res=cmaUser(evalfunc)
    res1=cmaUser(evalfunc1)
    res2=cmaUser(evalfunc2)
    res3=cmaUser(evalfunc3)

    plt.figure(1) 
    plt.plot(X,Y,'b.',alpha=0.6,label="measure point")
    X1=np.insert(X,0,[-5.01,-3.01,-1.01,0.99,2.99,4.99])
    X1.sort()

    plt.plot(X1,map(realLineFunc(res),X1),c="red",lw=2,ls="-",alpha=0.7,label="interval")
    plt.plot(X1,map(realLineFunc(res1),X1),c="blue",lw=2,ls="-",alpha=0.7,label="continuity1")
    plt.plot(X1,map(realLineFunc(res2),X1),c="green",lw=2,ls="-",alpha=0.7,label="continuity2")
    plt.plot(X1,map(realLineFunc(res3),X1),c="black",lw=2,ls="--",alpha=0.7,label="continuity3")
    plt.legend(loc='best')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.title(title)
    plt.show()
    plt.figure(2)
    plt.plot(X,Y,'b.',alpha=0.6,label="measure point")
    plt.plot(X1,map(realLineFunc([1,0,2,-40]),X1),c="cyan",lw=2,ls="-",alpha=0.7,label="real function")
    plt.plot(X1,map(realLineFunc(res3),X1),c="black",lw=2,ls="--",alpha=0.7,label="continuity3")
    plt.legend(loc='best')
    plt.title("Compare best fitting curve with real curve ")
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

In [ ]:
draw('fitting of interval  quadratic curve with curve')
4种策略下最优评价函数值
interval continuity1 continuity2 continuity3
0.0166098265438 -0.0418342446941 -0.0174967646088 -0.0757380809252

正弦函数分段二次平滑

构造函数

构造正弦函数$y=10sin0.6x , x\in [-10,10]$,分七段用二次曲线进行拟合

构造数据集


In [9]:
#构造数据集
def linefunc3(x):
    return 10*math.sin(0.7*x)+rd.uniform(-1.5,1.5)*rd.gauss(0,5)
X=np.linspace(-7,7,301)
Y=map(linefunc3,X)
DIM=14
RANGE=max(Y)-min(Y)
ALPHA=0.01*RANGE
BETA=PI/18.0
scatter(X,Y)


Out[9]:
<matplotlib.collections.PathCollection at 0x117734090>


In [ ]:
draw('fitting of interval  sin curve with curve')
4种策略下最优评价函数值
interval continuity1 continuity2 continuity3
0.0340212286044 -0.0243540062207 0.00398263132657 -0.0545520520545


In [5]:
draw('fitting of interval  quadratic curve with line')
4种策略下最优评价函数值
interval continuity1 continuity2 continuity3
0.0345438972758 -0.0215188700765 0.0403446057001 -0.0237311926117

设置BETA=PI/4.0(45度)