10.2.7 Least-squares polynomial approximation
The fitpoly command is used
for replacing tabular data or a function by a polynomial.
Smoothing data with polynomial regression.
fitpoly can be used for finding a polynomial that fits
a given tabular data.
-
fitpoly takes one mandatory argument and two optional arguments:
-
data, a two-column matrix or a sequence of two lists,
representing the x- and y-values of data points
(x0,y0),(x1,y1),…,(xm,ym).
- Optionally, x[=a..b] or a..b,
which is a variable x and optionally its domain [a,b] or a segment [a,b]
(by default, unset).
- Optionally, a sequence of options, each of which may be one of:
-
limit=N, a nonnegative integer,
which is the maximal degree for the resulting polynomial (by default, N=15).
- degree=d, where d is a nonnegative integer,
which is the desired degree of the resulting polynomial (by default, unset).
- threshold=tol, a positive real number,
which is used for finding an appropriate fitting degree,
as described below
(by default, threshold=0.01, ignored when d is set).
- length=l, a positive integer,
which is used for finding an appropriate fitting degree,
as described below (by default, length=5, ignored when d is set).
- If d is set, then fitpoly returns the
polynomial of degree min{d,N} which best fits the data
in the least-squares sense.
- If d is not set,
then a polynomial pn of modest degree n≤ N but a good error suppression
(thus representing the trend of the data)
is chosen using tol and l
such that raising the degree does not make a significant improvement to
the approximation. Precisely, n is the first nonnegative integer such that
stddev[σn2,σn+12,…,σn+l−12] |
|
mean[σ02,σ12,…,σn+l−12] |
| ≤tol, |
where σk2 is the sum of squared residuals ∑i=0m(yi−pk(xi))2
and pk is the best-fitting polynomial of degree k. Ideally, n is the
smallest degree for which σn2≈σn+12≈σn+22≈⋯, meaning that
higher-order polynomials would overfit on the noise.
- Lowering the parameter length, as well as increasing
threshold, would make the algorithm more greedy,
effectively lowering the degree of the output polynomial. Note that when
degree is set, these two parameters are ignored.
- If no variable is specified, then fitpoly returns the list of
polynomial coefficients.
- If a segment [a,b] is given, then only the data points (xk,yk) for which
a≤ xk≤ b are considered.
Example
We find a polynomial which best approximates the data infected by noise.
The data is produced by evaluating the polynomial f(x)=1/100x4−1/5x3+6x
at 100 points between x=1 and x=20.8. The noise is generated by a random,
normally-distributed variable.
f(x):=x^4/100-x^3/5+6x:;
x:=[(1+0.2*k)$(k=0..99)]:;
noise:=randvector(100,randvar(normal,mean=0,stddev=10)):;
y:=apply(f,x)+noise:;
fitpoly(x,y) |
|
[0.011665,−0.266955,0.844543,2.44967,3.94846]
| | | | | | | | | | |
|
We obtain the list of polynomial coefficients, starting with the leading coefficient.
Note that the polynomial of degree 4 is returned (it has five coefficients),
which is, in this case, optimal for data smoothing.
When approximating only the data in segment [1,10], we obtain a polynomial of
9th degree (the data curvature is now much smaller and the noise is more prominent).
degree(fitpoly(x,y,1..10)) |
To make the approximating polynomial less sensitive to noise, we increase the threshold
value tol.
fitpoly(x,y,t=1..10,threshold=0.05) |
|
−1.30984955412 t2+9.3259443857 t−2.6482979067
| | | | | | | | | | |
|
Alternatively, we could set the parameter length to a smaller value, e.g. 3.
Approximating functions by polynomials.
fitpoly can be used for approximating a continuous function
f:[a,b]→ℝ by a polynomial pn of certain degree n which
minimizes the error in the sense of L2-norm.
-
fitpoly takes two mandatory arguments and one optional argument:
-
f(x), which is an univariate expression representing the function f.
- An interval a..b, which is the domain [a,b] of f.
- Optionally, a sequence of options, each of which may be one of:
-
limit=N, as before.
- degree=d, as before.
- ε or threshold=ε,
which is a positive real number such that for the resulting polynomial pn
the following holds:
||f−pn||2= | ∫ | | (f(x)−pn(x))2dx≤ε |
(by default, ε=10−8). This parameter is ignored when d is set.
- fitpoly finds pn with smallest n≤ N such that
||f−pn||2≤ε. If such n does not exist, then pN is returned.
- If the parameter d is set, then pd which minimizes ||f−pd|| among all
polynomials of degree d is returned.
- Polynomial approximation is fast and robust even for large d and N resp. for
small ε.
Examples
fitpoly(cos(x),0..pi/2,1e-2) |
|
−0.244320054366 x3−0.336364730985 x2+1.13004492821 x−0.0107731059645
| | | | | | | | | | |
|
f:=exp(-7x)*5x:;
g:=fitpoly(f,0..1,degree=8) |
| | −21.717636069 x8+107.930784832 x7−232.831655404 x6 | | | | | | | | | |
| +286.778708741 x5−222.236631985 x4+111.004732684 x3 | | | | | | | | | |
| −33.8769709361 x2+4.95239715728 x+0.000500886757642
| | | | | | | | | |
|
The mean absolute error of the above approximation can be estimated as follows:
sqrt(romberg((f-g)^2,x=0..1)) |