If I want to solve y = Xb, where y is a vector of observations and X is a matrix of independent variables, I could use ordinary least squares to estimate the vector b, giving:
b = (X^TX)^-1 X^T y
But I came across some MATLAB code which claims to solve by OLS but actually just solves it as a system of linear equations:
b = X^-1 y
Are these really the same thing? Doesn't OLS assume that there is some variance in y while the other just solves the system in a deterministic manner?
I've read on Wikipedia that the matrix (X^TX)^-1 X^T is also referred to as the Moore-Penrose pseudo-inverse, so are the two comparable depending on the rank of X?
P.S. Is anyone else having trouble with LaTeX at the moment?
The second is where you have the same number of equations as unknowns - solving linear equations. The first is where you have more equations than unknowns - linear regression. So, not really the same thing (except perhaps as a limiting case).
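To make the "limiting case" concrete, here is a minimal sketch in plain Python (not the MATLAB code from the question). The helper names `solve` and `least_squares` and the toy data are made up for illustration, and X is assumed to have full column rank. With a square, invertible X the OLS formula (X^TX)^-1 X^T y returns the same b as X^-1 y. With more rows than columns there is generally no exact solution, and OLS returns the b with the smallest squared residual.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y; assumes full column rank."""
    m, p = len(X), len(X[0])
    XtX = [[sum(X[k][i] * X[k][j] for k in range(m)) for j in range(p)]
           for i in range(p)]
    Xty = [sum(X[k][i] * y[k] for k in range(m)) for i in range(p)]
    return solve(XtX, Xty)


# Square, invertible X: both routes give the same b.
X_sq = [[1.0, 1.0], [1.0, 2.0]]
y_sq = [3.0, 5.0]
b_exact = solve(X_sq, y_sq)          # b = X^-1 y
b_ols = least_squares(X_sq, y_sq)    # b = (X'X)^-1 X'y

# Tall X (3 equations, 2 unknowns): no exact solution, OLS still returns a b.
X_tall = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y_tall = [3.0, 5.0, 6.0]
b_tall = least_squares(X_tall, y_tall)
residual = [sum(x * b for x, b in zip(row, b_tall)) - yi
            for row, yi in zip(X_tall, y_tall)]
```

In the tall case the residual is not zero, which is the difference between "solving" and "fitting".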
Yes, that's what I thought. But what is baffling me is that the system must be over-determined, so I find it extremely unlikely that the latter should have any solutions, yet the code does return an answer.
You are solving a different set of equations: the ones you get after you minimise the squared error. You find the sum of the squares of the errors by adding all the (y-a-bx-cz ...)^2 type terms. Then, after differentiating with respect to a, b, c, ... and setting each derivative to zero, you get a new system of equations of the right rank. This is all done by that complicated matrix expression.
That's what I would do.
But the code I have doesn't do that:
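For the simplest case, y = a + bx, the differentiate-and-set-to-zero step can be written out by hand. The sketch below is plain Python with a hypothetical helper `fit_line` and made-up data. It builds the two equations that come from setting the derivatives with respect to a and b to zero, then solves that 2x2 system directly.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x via the 2x2 normal equations."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Setting d/da and d/db of sum((y - a - b*x)^2) to zero gives:
    #   n*a  + sx*b  = sy
    #   sx*a + sxx*b = sxy
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; the slope is not identifiable")
    a = (sxx * sy - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b


# Four observations, two unknowns: over-determined, but the normal
# equations are a square 2x2 system with a unique solution.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
a, b = fit_line(xs, ys)
residuals = [y - a - b * x for x, y in zip(xs, ys)]
```

At the solution the residuals sum to zero and are orthogonal to x, which is just the two normal equations restated.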
They're solving Y = MU
Code:
U = pinv(M)*Y;
res = norm(M*X-Y,'fro');
Yet the author calls it OLS, and a solution is returned. That's what has me confused. Does MATLAB automatically perform OLS on an over-determined system perhaps?
Does pinv stand for pseudoinverse?
I don't have emotions and sometimes that makes me very sad.
Yes. Apparently MATLAB has two pseudo-inverses; this one is the Moore-Penrose inverse.
So it's doing what one would expect.
Also, about the LaTeX issue... yeah, the board has experienced some degradation in terms of features lately. It used to be that using [math] [/math] instead of [latex] [/latex] would allow you to do most LaTeX, but that isn't doing so well anymore. The search functionality is also broken. quark has said that at some point in the future he is going to upgrade the forum software, so these features should be fixed when that happens, but unfortunately I haven't seen quark around for some time now.
What would you expect?
Is there any merit to the author calling it OLS?
If the system is over-determined, how is MATLAB returning a solution? The matrix M has dimension 891x13.
Does MATLAB automatically perform OLS on an over-determined system?
https://en.wikipedia.org/wiki/Moore–..._pseudoinverse
Look at the applications section
Last edited by Dason; 09-20-2017 at 07:04 AM.
Prometheus (09-20-2017)
That's what I needed, thanks. Also learnt that the Euclidean norm isn't the same as the Frobenius norm.
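On the norm point: in MATLAB, norm(A) on a matrix is the 2-norm (largest singular value), while norm(A,'fro') is the square root of the sum of squared entries. For a single column vector the two coincide, so the distinction only matters when the residual is a matrix. A small plain-Python sketch, with hypothetical helper names and a closed form that only covers the 2x2 case:

```python
import math


def frobenius_norm(A):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in A for v in row))


def spectral_norm_2x2(A):
    """Matrix 2-norm of a 2x2 matrix: sqrt of the largest eigenvalue of A'A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A'A = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    lam_max = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(lam_max)


A = [[3.0, 0.0], [0.0, 4.0]]
fro = frobenius_norm(A)      # all entries contribute
two = spectral_norm_2x2(A)   # only the largest singular value counts

# For a column vector the Frobenius norm is the usual Euclidean length.
v = [[3.0], [4.0]]
vec_fro = frobenius_norm(v)
```

For this diagonal A the Frobenius norm is 5 and the 2-norm is 4, while the vector (3, 4) has length 5 under either definition.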
Quick follow up - would it be difficult to constrain the solutions to sum to one?
Last edited by Prometheus; 09-20-2017 at 07:58 AM.
I always wonder what the point of imposing such constraints is. But yes you can do that in software easily enough.
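One way to impose a sum-to-one constraint, sketched here in plain Python rather than MATLAB, is to add a Lagrange multiplier and solve the resulting linear (KKT) system. The helper names `solve` and `constrained_ls` and the toy data are made up for illustration, and X is assumed to have full column rank. In MATLAB itself, `lsqlin` from the Optimization Toolbox accepts equality constraints and would be one option.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def constrained_ls(X, y):
    """Minimise ||X b - y||^2 subject to sum(b) == 1 via the KKT system."""
    m, p = len(X), len(X[0])
    XtX = [[sum(X[k][i] * X[k][j] for k in range(m)) for j in range(p)]
           for i in range(p)]
    Xty = [sum(X[k][i] * y[k] for k in range(m)) for i in range(p)]
    # Stationarity of ||Xb - y||^2 + lam * (sum(b) - 1) plus the constraint:
    #   [2 X'X  1] [b  ]   [2 X'y]
    #   [1'     0] [lam] = [1    ]
    K = [[2 * XtX[i][j] for j in range(p)] + [1.0] for i in range(p)]
    K.append([1.0] * p + [0.0])
    rhs = [2 * v for v in Xty] + [1.0]
    return solve(K, rhs)[:p]  # drop the multiplier, keep b


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [0.2, 0.9, 1.0]
b = constrained_ls(X, y)
```

The unconstrained fit to the same data sums to slightly more than one, so the constraint does change the answer here. Non-negativity, which often accompanies a sum-to-one constraint, is not handled by this sketch.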