J Labs

Best Fit

(1 of 12) Best Fit to a Monomial

This lab is based on an article [1] in the Computer Corner section of The College Mathematics Journal.

The article deals with two methods of fitting a curve of the form y = f(x) = ax^k to a set of data points using least squares.

The original article has numerical examples done with the Mathematica computer package.

In this lab we show how to do the computations using the J programming language.

[1] Helen Skala, "Will the Real Best Fit Curve Please Stand Up?", The College Mathematics Journal, Vol. 27, May 1996.

(2 of 12) Two Methods

Given a set of data points S = {(x_i,y_i) | i=1,...,n}, we wish to fit a curve y = f(x) = ax^k. What are the best values of a and k? We will select a and k to satisfy a least squares condition.

Method 1: linear least squares on the logs of the data

Because log y = log a + k log x, we can try to fit the logs of the data to a straight line log y = b + m log x, where b = log a and m = k. The least squares problem is:

```minimize sum_{i=1 to n}(b + k log x_i - log y_i)^2```

Method 2: nonlinear least squares on the original data

```minimize sum_{i=1 to n}(f(x_i) - y_i)^2```

where f(x) = ax^k. Note that the two criteria are different, so in general the two methods give different values of a and k.

(3 of 12) Method 1:

Let the given data be

```vx = 1, 2, 3, 4, 5, 10, 20```
```vy = 6.1, 9.2, 14.1, 21.2, 30.3, 105.1, 405.2```

Compute the logs. We will use the same notation as the article and store the logs in variables xlg and ylg. (Prefixing an assignment with [ below displays the assigned value, which J would otherwise leave silent.)


In [ ]:
vx =: 1 2 3 4 5 10 20

In [ ]:
vy =: 6.1 9.2 14.1 21.2 30.3 105.1 405.2

In [ ]:
log =: ^.   NB. Give the natural log function the name log

In [ ]:
[xlg =: log vx

In [ ]:
[ylg =: log vy

(4 of 12) Method 1: (ctd)

The formulas for linear least squares are well known. They are quoted in the article and then programmed in Mathematica.

Here is the J version: [see below]

Thus the best monomial fit is y = 3.74x^1.44 [see below]


In [ ]:
e =: ^ 1   NB. e = exp(1) is Euler's number

In [ ]:
n =: #vx    NB. n is the number of data points

In [ ]:
num =: (+/xlg*ylg)-(+/xlg)*(+/ylg)%n  NB. +/ is summation

In [ ]:
den =: (+/xlg*xlg)-((+/xlg)^2)%n      NB. % is division
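
As a small aside (not part of the article's code), here is what these two primitives do on their own:

```
   +/ 1 2 3 4    NB. inserts + between the items: 1+2+3+4 = 10
   3 % 4         NB. division: 0.75
```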

In [ ]:
[k =: num%den

In [ ]:
b =: ((+/ylg)-k*(+/xlg))%n

In [ ]:
[a =: e^b

(5 of 12) Method 1: (ctd)

To check the quality of the fit, we can calculate the points predicted by the curve

```yp = ax^k```

and then the sum of squares of the errors

```yp-vy```

The value, over 15,000, is quite large. [see below]

At any time, you can interact with the lab. For example, to see the predicted values just type

```yp```

To see a list of the errors, type

```yp-vy```


In [ ]:
yp =: a*vx^k

In [ ]:
+/(yp-vy)^2

(6 of 12) Shortcut for Method 1:

J has a built-in linear least squares facility which we can use as an alternative to the formulas given above.

Consider the matrix equation AX = B where A and B are given matrices and X is an unknown matrix. If A is a square nonsingular matrix then X = A^(-1)B. In J, the above computation of X from A and B would be written as

```X =: B %. A```

and the operation "%." is called "matrix divide". More generally, if A has more rows than columns then B %. A is the least squares best approximation to a solution X of AX = B.
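
For example, on a small square system (values chosen only for illustration):

```
   A2 =: 2 2 $ 3 1 1 2   NB. the matrix with rows 3 1 and 1 2
   B2 =: 9 8             NB. right-hand side
   B2 %. A2              NB. solves A2 X = B2, giving 2 3
```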

With the use of matrix divide we can get a useful shortcut in method 1.

(7 of 12) Shortcut for Method 1: (ctd)

In our case, we have n equations of the form

```(1)b + (log x_i)k = log y_i```

in the two unknowns b and k. Thus to apply matrix divide:

```A = [1, log x_i], X = [b, k], and B = [log y_i]```

[The J code is shown below. Note that we get the same values for a and k that we had previously obtained.]


In [ ]:
A =: 1,.xlg

In [ ]:
'b k' =: ylg %. A

In [ ]:
[a =: e^b

In [ ]:
k

(8 of 12) Method 2:

For the nonlinear least squares version, we cannot rely on standard formulas; but we can minimize using calculus.

We regard the sum of squares as a function of two variables, a and k. To minimize this function, we set the partial derivatives with respect to a and k equal to zero. This gives two equations for the two unknowns a and k.

In the article, the first equation, i.e. the one obtained from the partial derivative with respect to a, is solved algebraically for a in terms of k. The resulting expression is substituted into the second equation to give a single equation for k. This equation is then solved numerically using the Mathematica program FindRoot. Below is the J version of this approach.
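
Before turning to the J code, here is a sketch of that algebra. Writing the sum of squares as S(a,k) = sum_{i=1 to n}(a x_i^k - y_i)^2, setting the partial derivatives to zero gives:

```dS/da = 2 sum_i (a x_i^k - y_i) x_i^k = 0   =>   a = (sum_i y_i x_i^k) / (sum_i x_i^(2k))```

```dS/dk = 2 sum_i (a x_i^k - y_i) a x_i^k log x_i = 0```

Substituting the expression for a from the first equation into the second gives the single equation f(k) = 0 that is programmed below.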

(9 of 12) Method 2: (ctd)

First we create a program FindRoot, an implementation of Newton's Method in J. Newton's Method applied to f involves iterating the function

```N(x) = x - f(x)/f'(x)```

starting with an initial approximation x0 to the root. Thus for some power p,

```N^p(x0)```

is an acceptable approximation to the root.

In J, a function f is called a "verb". We define an "adverb" "Nwtn" that modifies f so that "f Nwtn" is the function N.

We then iterate an infinite number of times. J will automatically stop the iteration when successive approximations agree to machine tolerance.


In [ ]:
Nwtn =: adverb def 'y - (u y)%(u D.1 y)'

f Nwtn x does one iteration of Newton's Method.
y represents the right argument x.
u represents the left argument f.
D.1 is the first derivative.


In [ ]:
FindRoot =: adverb def 'u Nwtn ^: _'

^: is the power of a function (iteration).
_ is the 'number' infinity.
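
As a quick check of the machinery (a side example, not from the article), we can compute the square root of 2 as a root of g(x) = x^2 - 2:

```
   g =: verb def '_2 + *: y'   NB. g(x) = x^2 - 2
   g FindRoot 1                NB. Newton iteration from 1; converges to 1.41421...
```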

(10 of 12) Method 2: (ctd)

Next we define the function f according to the formula developed in the article. We want to solve f(k) = 0.

We run the FindRoot procedure starting at 2, just as was done with Mathematica in the article. The best fit curve is y = 1.36x^1.90 and the error criterion is 66.5, which is much better than the result of method 1. [See below]


In [ ]:
f =: verb define
k =. y  NB. y is the argument of f, renamed (local) to k
c =. ((+/vy*vx^k)%(+/vx^(2*k)))*vx^k
+/(c-vy)*c*xlg
)

In [ ]:
[k =: f FindRoot 2

In [ ]:
[a =: (+/vy*vx^k)%(+/vx^2*k)

In [ ]:
yp =: a*vx^k

In [ ]:
+/(yp-vy)^2

(11 of 12) Shortcut for Method 2:

The article solved the nonlinear least squares problem by a mixture of algebraic work and numerical computer code. The purpose of the algebra was to reduce two equations in two unknowns to one equation in one unknown.

It is possible to solve the problem directly on the computer by applying the Newton-Raphson method to the pair of equations. The only change necessary in the FindRoot program is to replace % (divide) by %. (matrix divide) in the Nwtn adverb. The details are shown below.
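
In vector form, one step of the Newton-Raphson method is

```N(x) = x - (DF(x))^(-1) F(x)```

where F is the pair of equations (here the gradient of the sum of squares) and DF is its matrix of partial derivatives. Matrix divide computes this step, just as % computed f(x)/f'(x) in one dimension.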


In [ ]:
Nwtn =: adverb def 'y - (u y)%.(u D.1 y)'

f D.1 is the vector of partial derivatives (the gradient).
f D.1 D.1 is the Hessian matrix of second partial derivatives.


In [ ]:
f =: verb define
'a k' =. y
yp =. a*vx^k
+/(yp-vy)^2
)

In [ ]:
['a k' =: f D.1 FindRoot 1 2

(12 of 12) Some Limitations:

We have presented FindRoot as an implementation of Newton's Method. Unfortunately, Newton's Method (especially the higher dimensional version) is very sensitive to the initial approximation. For example, the statement

```['a k' =: f D.1 FindRoot 3 2```

will not converge to finite values.

We will not go into the general topic of algorithms for solving systems of nonlinear equations, but there is a simple modification of Newton's Method that helps us: we can reduce the step length. This makes the iteration slower but more stable.

Let alpha be a parameter between zero and one. For alpha=1 we have the normal step length; for alpha=0.5 the step length is halved.
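
The damped iteration is then

```N(x) = x - alpha * f(x)/f'(x)```

with %. taking the place of the division in the two-variable case, exactly as in the Nwtn adverb below.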


In [ ]:
alpha =: 0.2

In [ ]:
Nwtn =: adverb def 'y - alpha*(u y)%.(u D.1 y)'

Search for the nonlinear solution starting at the 'linear' solution found in Method 1.


In [ ]:
['a k' =: f D.1 FindRoot 3.7 1.4

End of Lab