Noisy regression

Real-world data is noisy. This data set was measured during an experiment.

If we are expecting the result to be a parabola of the form \(y = a\cdot x^2 + b\cdot x + c\), find the \(a\), \(b\), and \(c\) that minimize the mean-squared error.

Enter the answer as \(a + b + c\).


Note: There are tools, such as Origin, that do this job automatically, but we are here to learn, and it is a good exercise to implement this at least once to understand how those tools work.


The answer is 14.021535716869955.


2 solutions

Mark Hennings
Nov 23, 2018

If we want to minimize
\[ F(a,b,c) \; = \; \sum_{j=1}^n \big(y_j - ax_j^2 - bx_j - c\big)^2 \]
then we require
\[ \frac{\partial F}{\partial a} \; = \; \frac{\partial F}{\partial b} \; = \; \frac{\partial F}{\partial c} \; = \; 0 \]
and hence
\[ \left(\begin{array}{ccc} E[x^4] & E[x^3] & E[x^2] \\ E[x^3] & E[x^2] & E[x] \\ E[x^2] & E[x] & 1 \end{array}\right)\left(\begin{array}{c} a \\ b \\ c \end{array}\right) \; = \; \left(\begin{array}{c} E[x^2y] \\ E[xy] \\ E[y] \end{array}\right) \]
where
\[ E[x^m] \; = \; \frac{1}{n}\sum_{j=1}^n x_j^m \hspace{2cm} 1 \le m \le 4 \]
is the average value of \(x^m\) for the data, and
\[ E[x^m y] \; = \; \frac{1}{n}\sum_{j=1}^n x_j^m y_j \hspace{2cm} 0 \le m \le 2 \]
is the average value of \(x^m y\) for the data (note that \(n\) is the number of data points; in this case, \(n = 10000\)).

Calculating these expected values can be done easily in Excel, and simple Gaussian elimination solves the equation, giving \(a = 1.003089322\), \(b = 3.132961855\), \(c = 9.88548454\), and hence \(a+b+c = \boxed{14.02153572}\).
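The same normal-equations calculation can be sketched in Python. Since the problem's data file is not reproduced here, the snippet below generates hypothetical noisy data around assumed coefficients close to the solution's fitted values, builds the moment matrix of averages \(E[x^m]\), and solves the linear system:

```python
import numpy as np

# Hypothetical noisy data (stand-in for the problem's data set),
# generated around assumed true coefficients a=1.0, b=3.1, c=9.9
rng = np.random.default_rng(0)
x = rng.uniform(-5, 5, 10000)
y = 1.0 * x**2 + 3.1 * x + 9.9 + rng.normal(0, 1, x.size)

E = lambda v: v.mean()  # empirical expectation over the data

# Moment matrix and right-hand side from the normal equations
M = np.array([[E(x**4), E(x**3), E(x**2)],
              [E(x**3), E(x**2), E(x)],
              [E(x**2), E(x),    1.0]])
rhs = np.array([E(x**2 * y), E(x * y), E(y)])

a, b, c = np.linalg.solve(M, rhs)
print(a, b, c)  # should land close to the generating coefficients
```

With \(n = 10000\) points, the recovered coefficients sit very close to the generating ones, mirroring how the solution's Excel computation recovers the problem's parameters.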

Hello sir, very nice solution, though in the last equation, shouldn't the letter above the sigma be \(n\) instead of \(m\)?

Mehdi K. - 2 years, 1 month ago
João Areias
Nov 22, 2018

It is well known that, given 3 points, we can define a parabola with the following matrix equation:

\[ \left[\begin{matrix} y_0 \\ y_1 \\ y_2 \end{matrix}\right] = \left[\begin{matrix} x_0^2 & x_0 & 1 \\ x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \end{matrix}\right] \cdot \left[\begin{matrix} a \\ b \\ c \end{matrix}\right] \]

or, more commonly, written as \(A\vec x = \vec b\) where

\[ A = \left[\begin{matrix} x_0^2 & x_0 & 1 \\ x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \end{matrix}\right] \text{, } \quad \vec x = \left[\begin{matrix} a \\ b \\ c \end{matrix}\right] \text{, } \quad \vec b = \left[\begin{matrix} y_0 \\ y_1 \\ y_2 \end{matrix}\right] \]

and we can calculate \(\vec x\) by computing \(\vec x = A^{-1} \vec b\).
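As a minimal sketch of this exact three-point case (the points here are hypothetical, chosen to lie on an assumed parabola \(y = 2x^2 - 3x + 1\)):

```python
import numpy as np

# Three hypothetical points on the assumed parabola y = 2x^2 - 3x + 1
xs = np.array([0.0, 1.0, 2.0])
ys = 2 * xs**2 - 3 * xs + 1

# Matrix A with columns x^2, x, 1
A = np.vstack([xs**2, xs, np.ones_like(xs)]).T

# A is square and invertible for distinct x, so A x = b has an exact solution
coeffs = np.linalg.solve(A, ys)
print(coeffs)  # → [ 2. -3.  1.]
```

Solving the square system recovers \(a\), \(b\), and \(c\) exactly, which is precisely what breaks down once there are more than three noisy points.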

Well, we have far more than 3 points here and, because the data is noisy, it is unlikely that any two triplets will fit the same parabola. If we include all the points, our matrix \(A\) is no longer square, so we can't invert it. It turns out that there is a notion of a pseudo-inverse that can be applied to an \(m \times n\) matrix; defined by \((A^T A)^{-1} A^T\), it has the nice property of minimizing the squared error, so we can compute \(\vec x = (A^T A)^{-1} A^T \vec b\). Here is my Python implementation, with a plot of the data and the regression curve:

from numpy.linalg import inv
import matplotlib.pyplot as plt
import numpy as np

# Load the two-column data file: x in the first column, y in the second
x, y = np.loadtxt("data.txt").T

# Design matrix with columns x^2, x, 1
A = np.array([x**2, x, x**0]).T

# Pseudo-inverse (A^T A)^{-1} A^T, applied to y to get [a, b, c]
pseudo_inv = np.matmul(inv(np.matmul(A.T, A)), A.T)
regres = np.matmul(pseudo_inv, y)

f = np.poly1d(regres)
_y = f(x)

print(regres)
print(sum(regres))

plt.plot(x, y, 'ro', markersize=1)
plt.plot(x, _y, 'b-', linewidth=1)
plt.show()

It is worth noting that, with your matrices
\[ A \; = \; \left(\begin{array}{ccc} x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \\ \cdots & \cdots & \cdots \\ x_n^2 & x_n & 1 \end{array}\right) \hspace{2cm} b \; = \; \left(\begin{array}{c} y_1 \\ y_2 \\ \cdots \\ y_n \end{array}\right) \]
then
\[ \tfrac{1}{n}A^TA \; = \; \left(\begin{array}{ccc} E[x^4] & E[x^3] & E[x^2] \\ E[x^3] & E[x^2] & E[x] \\ E[x^2] & E[x] & 1 \end{array}\right) \hspace{2cm} \tfrac{1}{n}A^Tb \; = \; \left(\begin{array}{c} E[x^2y] \\ E[xy] \\ E[y] \end{array}\right) \]
and so your method is in fact the same as mine. This explains why your method works.
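This equivalence is easy to verify numerically: dividing \(A^TA\) and \(A^Tb\) by \(n\) reproduces the moment matrix and right-hand side of the normal equations exactly (hypothetical data below, as a stand-in for the real set):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-5, 5, 1000)   # hypothetical data points
y = x**2 + 3 * x + 10 + rng.normal(0, 1, x.size)
n = x.size

A = np.vstack([x**2, x, np.ones_like(x)]).T

E = lambda v: v.mean()  # empirical average over the data
moments = np.array([[E(x**4), E(x**3), E(x**2)],
                    [E(x**3), E(x**2), E(x)],
                    [E(x**2), E(x),    1.0]])
rhs = np.array([E(x**2 * y), E(x * y), E(y)])

# (1/n) A^T A is the moment matrix; (1/n) A^T b is the right-hand side
print(np.allclose(A.T @ A / n, moments))  # → True
print(np.allclose(A.T @ y / n, rhs))      # → True
```

The factor of \(n\) cancels on both sides of the normal equations, which is why both solutions produce identical coefficients.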

Mark Hennings - 2 years, 6 months ago

Log in to reply

Pretty much. I loved your approach to it, but yeah, they are in fact equivalent.

João Areias - 2 years, 6 months ago
