# Thread: Proof of the day

1. ## Proof of the day

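In this thread we post one (or more, if you can't wait) proof a day. I'll start by proving that

$$\operatorname{Var}(\hat\beta \mid X) = \sigma^2 (X'X)^{-1}$$

in the linear regression model $y = X\beta + \varepsilon$. We can write the beta estimator as

$$\hat\beta = (X'X)^{-1}X'y = \beta + (X'X)^{-1}X'\varepsilon.$$

Then we have that

$$\operatorname{Var}(\hat\beta \mid X) = E\left[(\hat\beta - \beta)(\hat\beta - \beta)' \mid X\right] = (X'X)^{-1}X'\,E[\varepsilon\varepsilon' \mid X]\,X(X'X)^{-1} = \sigma^2 (X'X)^{-1},$$

where

$$E[\varepsilon\varepsilon' \mid X] = \sigma^2 I$$

follows from one of the assumptions of the classical linear model: the spherical disturbances assumption.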
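As a rough numerical check, assuming the result in question is the usual $\operatorname{Var}(\hat\beta \mid X) = \sigma^2 (X'X)^{-1}$, the sketch below simulates a simple regression with fixed regressors and compares the empirical variance of the OLS slope with its theoretical value $\sigma^2 / \sum_i (x_i - \bar{x})^2$, which is the slope entry of $\sigma^2 (X'X)^{-1}$ when the model has an intercept. The function name, design points and parameter values are made up for illustration.

```python
import random


def simulate_slope_variance(x, beta0, beta1, sigma, reps, seed=0):
    """Return (empirical, theoretical) variance of the OLS slope for fixed x."""
    rng = random.Random(seed)
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slopes = []
    for _ in range(reps):
        # Spherical disturbances: independent errors with common variance sigma^2.
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        y_bar = sum(y) / n
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        slopes.append(sxy / sxx)
    mean_slope = sum(slopes) / reps
    empirical = sum((b - mean_slope) ** 2 for b in slopes) / (reps - 1)
    theoretical = sigma ** 2 / sxx
    return empirical, theoretical


x = [float(i) for i in range(1, 11)]
empirical, theoretical = simulate_slope_variance(x, 1.0, 2.0, 1.5, reps=20000)
print(empirical, theoretical)
```

With enough replications the two numbers should agree to within simulation noise. This is only a sanity check, not a substitute for the proof.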

2. ## The Following 4 Users Say Thank You to Englund For This Useful Post:

anders.bjorn (11-10-2016), bryangoodrich (01-18-2014), M!ss Moon (02-21-2014), spunky (01-18-2014)

3. ## Re: Proof of the day

me likes this. most of my proofs would come from the field of psychometrics or quantitative psychology though (mostly factor analysis and structural equation modelling).

here i'm doing the (rather simple) proof of how the linear factor analysis model can be parameterised as a covariance structure model. it's relevant because as a linear factor model it is unsolvable, but as a covariance structure model it is possible to obtain parameter estimates.

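let the observed score be defined by the linear factor model $x = \Lambda f + \varepsilon$ (with $x$ centred at its mean). since it is known that (in the case of multivariate normality) $\Sigma = E(xx')$, it trivially follows that:

$$xx' = (\Lambda f + \varepsilon)(\Lambda f + \varepsilon)' = \Lambda f f' \Lambda' + \Lambda f \varepsilon' + \varepsilon f' \Lambda' + \varepsilon\varepsilon'$$

so taking the expectation of both sides:

$$E(xx') = E(\Lambda f f' \Lambda') + E(\varepsilon\varepsilon')$$

which happens because the errors are random and assumed uncorrelated with the Factors and estimated loadings, so the two cross-product terms have expectation zero. Now by linearity of expectation and substituting the covariance matrix of the Factors, $\Phi = E(ff')$, and of the errors, $\Psi = E(\varepsilon\varepsilon')$, we can see that:

$$\Sigma = \Lambda \Phi \Lambda' + \Psi$$

which is known as the fundamental equation of Factor Analysis.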
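A small simulation can make the covariance structure concrete. The sketch below assumes the fundamental equation has its usual form $\Sigma = \Lambda \Phi \Lambda' + \Psi$ and uses a one-factor model with $\Phi = 1$. The loadings, unique variances and sample size are made-up illustration values, and the helper names are hypothetical.

```python
import random


def simulate_one_factor(loadings, uniquenesses, n, seed=0):
    """Draw n observations from x = lambda * f + e with Var(f) = 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)  # common factor
        rows.append([l * f + rng.gauss(0.0, u ** 0.5)  # unique error, variance u
                     for l, u in zip(loadings, uniquenesses)])
    return rows


def sample_covariance(rows):
    """Unbiased sample covariance matrix of a list of observation rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def implied_covariance(loadings, uniquenesses):
    """Model-implied covariance: lambda_i * lambda_j, plus psi_i on the diagonal."""
    p = len(loadings)
    return [[loadings[i] * loadings[j] + (uniquenesses[i] if i == j else 0.0)
             for j in range(p)] for i in range(p)]


loadings = [0.8, 0.6, 0.5]
uniquenesses = [0.36, 0.64, 0.75]
sample = sample_covariance(simulate_one_factor(loadings, uniquenesses, 20000))
implied = implied_covariance(loadings, uniquenesses)
max_gap = max(abs(sample[i][j] - implied[i][j])
              for i in range(3) for j in range(3))
print(max_gap)
```

The largest gap between the sample and implied covariance matrices should shrink as the sample size grows.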

4. ## The Following 2 Users Say Thank You to spunky For This Useful Post:

bryangoodrich (01-18-2014), Englund (01-18-2014)

5. ## Re: Proof of the day

Okay, since this day is soon over (at least according to Swedish time) and no one has posted a proof yet today, I'll post another proof. I'll give a very simple, and possibly boring, proof this time. I'll prove that $a = \bar{x}$ is the value that minimizes the sum

$$\sum_{i=1}^{n} (x_i - a)^2. \qquad (1)$$

By taking the first derivative with respect to $a$ and setting it equal to zero, we get $-2\sum_{i=1}^{n}(x_i - a) = 0$, that is, $a = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}$.

By checking the second order condition we see that the second derivative is equal to $2n$, which is always positive, so now we know that $\bar{x}$ is at least a local minimum. Since (1) is a quadratic in $a$ with a positive leading coefficient, it is easily seen that it is also a global minimum.
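A quick numerical check of the claim that the sample mean minimizes the sum of squared deviations: the sketch below evaluates the sum and its first derivative at the mean and at a few other candidate values. The data and the candidate grid are arbitrary illustration values.

```python
def sum_of_squares(xs, a):
    """Sum of squared deviations of xs from the candidate value a."""
    return sum((x - a) ** 2 for x in xs)


def first_derivative(xs, a):
    """Derivative of sum_of_squares with respect to a."""
    return -2.0 * sum(x - a for x in xs)


xs = [2.0, 3.0, 5.0, 7.0, 13.0]
x_bar = sum(xs) / len(xs)  # 6.0 for this data

at_mean = sum_of_squares(xs, x_bar)
candidates = [x_bar + step / 10.0 for step in range(-30, 31)]
print(first_derivative(xs, x_bar), at_mean)
print(all(sum_of_squares(xs, a) >= at_mean for a in candidates))
```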

6. ## Re: Proof of the day

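I prefer the version that doesn't require the use of calculus.

$$\sum_{i=1}^{n} (x_i - a)^2 = \sum_{i=1}^{n} (x_i - \bar{x} + \bar{x} - a)^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 + \sum_{i=1}^{n} (\bar{x} - a)^2 + 2\sum_{i=1}^{n} (x_i - \bar{x})(\bar{x} - a)$$

Now consider the last summation. Note that in the sum both $\bar{x}$ and $a$ are constant so we can pull them out

$$2\sum_{i=1}^{n} (x_i - \bar{x})(\bar{x} - a) = 2(\bar{x} - a)\sum_{i=1}^{n} (x_i - \bar{x})$$

We know that that sum is equal to 0 so this shows the third summation disappears.

We are left with

$$\sum_{i=1}^{n} (x_i - a)^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 + \sum_{i=1}^{n} (\bar{x} - a)^2$$

The first summation we can't control and the second sum is always non-negative so the minimum would occur if we can make it equal to 0 - which happens when $a = \bar{x}$.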

Now clearly I need a few more details to make it more rigorous but I like that version a little bit more because it also gives hints at what we do in ANOVA when decomposing the sums of squares.
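The decomposition behind this argument, $\sum (x_i - a)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - a)^2$ with a vanishing cross term, can be checked numerically. The sketch below is only an illustration on arbitrary data; the function name is hypothetical.

```python
def decompose(xs, a):
    """Return (total, within, shift, cross) for the sum of squares about a."""
    n = len(xs)
    x_bar = sum(xs) / n
    total = sum((x - a) ** 2 for x in xs)       # sum (x_i - a)^2
    within = sum((x - x_bar) ** 2 for x in xs)  # sum (x_i - x_bar)^2
    shift = n * (x_bar - a) ** 2                # sum (x_bar - a)^2
    cross = 2.0 * (x_bar - a) * sum(x - x_bar for x in xs)  # cross term
    return total, within, shift, cross


total, within, shift, cross = decompose([2.0, 3.0, 5.0, 7.0, 13.0], 4.0)
print(total, within, shift, cross)
```

For this data the total is 96, which splits into 76 (spread about the mean) plus 20 (the penalty for choosing 4 rather than the mean 6), the same kind of split used when decomposing sums of squares in ANOVA.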

7. ## The Following 3 Users Say Thank You to Dason For This Useful Post:

bryangoodrich (01-20-2014), Englund (01-20-2014), Jake (01-19-2014)

8. ## Re: Proof of the day

a while ago (before Englund became an MVC) I posted a proof about another result in factor analysis. I thought it would be nice to resurrect it (briefly) and add it here to our small (but growing) compendium of proofs. the original thread is here

and the proof goes like this:

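Let $S$ be a covariance matrix with eigenvalue-eigenvector pairs ($\lambda_1, e_1$), ($\lambda_2, e_2$), ..., ($\lambda_p, e_p$), where $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p$. Let $m < p$ and define:

$$L = \left[\sqrt{\lambda_1}\,e_1 \mid \sqrt{\lambda_2}\,e_2 \mid \dots \mid \sqrt{\lambda_m}\,e_m\right]$$

and:

$$\Psi = \operatorname{diag}(S - LL')$$

Then, PROVE:

$$\text{sum of squared entries of } (S - LL' - \Psi) \le \lambda_{m+1}^2 + \dots + \lambda_p^2$$

Spunky's attempt of a proof:

By definition of $\Psi$, we know that the diagonal of $S - LL' - \Psi$ is all zeroes. Since $S - LL' - \Psi$ and $S - LL'$ have the same elements except on the diagonal, we know that

$$\text{sum of squared entries of } (S - LL' - \Psi) \le \text{sum of squared entries of } (S - LL')$$

Since $S = \lambda_1 e_1 e_1' + \dots + \lambda_p e_p e_p'$ and $LL' = \lambda_1 e_1 e_1' + \dots + \lambda_m e_m e_m'$, then it follows that

$$S - LL' = \lambda_{m+1} e_{m+1} e_{m+1}' + \dots + \lambda_p e_p e_p'$$

Writing it in matrix form, this is saying $S - LL' = P_{(2)} \Lambda_{(2)} P_{(2)}'$ where $P_{(2)} = [e_{m+1} \mid \dots \mid e_p]$ and $\Lambda_{(2)} = \operatorname{diag}(\lambda_{m+1}, \dots, \lambda_p)$.

Then, the following is true:

$$\text{sum of squared entries of } (S - LL') = \operatorname{tr}\left[(P_{(2)} \Lambda_{(2)} P_{(2)}')(P_{(2)} \Lambda_{(2)} P_{(2)}')'\right] = \operatorname{tr}\left[P_{(2)} \Lambda_{(2)} \Lambda_{(2)} P_{(2)}'\right] = \operatorname{tr}\left[\Lambda_{(2)} \Lambda_{(2)} P_{(2)}' P_{(2)}\right] = \operatorname{tr}\left[\Lambda_{(2)}^2\right] = \lambda_{m+1}^2 + \dots + \lambda_p^2$$

All the $P_{(2)}$ disappear because by the definition of $P_{(2)}$ we know that $P_{(2)}' P_{(2)} = I$.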
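To see the result on a concrete case, the sketch below assumes it is the usual principal-component bound: the sum of squared entries of the residual matrix (covariance matrix minus retained-component loadings product minus the diagonal of unique variances) is at most the sum of the squared discarded eigenvalues. It uses a made-up 2x2 covariance matrix with one retained component, so the eigenpairs have a closed form. The function names are hypothetical.

```python
import math


def eigen_sym_2x2(a, b, d):
    """Eigenpairs of [[a, b], [b, d]] (requires b != 0), largest eigenvalue first."""
    mid = (a + d) / 2.0
    rad = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    pairs = []
    for lam in (mid + rad, mid - rad):
        v0, v1 = b, lam - a  # solves the first row of (S - lam * I) v = 0
        norm = math.hypot(v0, v1)
        pairs.append((lam, (v0 / norm, v1 / norm)))
    return pairs


def pc_residual(a, b, d):
    """Residual S - LL' - Psi for m = 1, its sum of squares, and the bound."""
    (lam1, e1), (lam2, _) = eigen_sym_2x2(a, b, d)
    s = [[a, b], [b, d]]
    ll = [[lam1 * e1[i] * e1[j] for j in range(2)] for i in range(2)]
    psi = [s[i][i] - ll[i][i] for i in range(2)]  # diagonal of S - LL'
    resid = [[s[i][j] - ll[i][j] - (psi[i] if i == j else 0.0)
              for j in range(2)] for i in range(2)]
    resid_ss = sum(resid[i][j] ** 2 for i in range(2) for j in range(2))
    return resid, resid_ss, lam2 ** 2


resid, resid_ss, bound = pc_residual(2.0, 0.8, 1.0)
print(resid_ss, bound, resid_ss <= bound)
```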

9. ## The Following User Says Thank You to spunky For This Useful Post:

Englund (01-20-2014)

10. ## Re: Proof of the day

> Originally Posted by spunky
> a while ago (before Englund became an MVC)

Time wasn't even defined before I became MVC, so that's by definition impossible.

> Originally Posted by spunky
> I posted a proof about another result in factor analysis. I thought it would be nice to resurrect it (briefly) and add it here to our small (but growing) compendium of proofs.
>
> and the proof goes like this:

Very nice. If you keep posting stuff on FA I'll be forced to get more familiar with it, which is good.

11. ## Re: Proof of the day

> Originally Posted by Englund
> If you keep posting stuff on FA I'll be forced to get more familiar with it, which is good

i don't quite understand why but pretty much NO ONE in the Statistics world even touches on Factor Analysis. when it comes to dimension reduction techniques almost all of the undergrad stats textbooks i've seen that deal with intro to multivariate analysis stop at principal components. there may be like some small subsection in some nameless appendix that says something about Factor Analysis... but that's it!

WHY!??!

12. ## Re: Proof of the day

Here's a link to a geometrically based proof I posted a few months ago in another thread. It is about constraints among sets of correlation coefficients.

In the thread I just call this an "argument" but if Dason's thing counts as a proof then I think mine does too

13. ## Re: Proof of the day

Nice proof! Simple and fun! =)

14. ## Re: Proof of the day

Here, it is under MATH 31, so-called statistics. I am still having problems solving in this field.

15. ## Re: Proof of the day

We shouldn't let this thread get buried. I'm gonna sticky it.

16. ## Re: Proof of the day

Somebody post a proof. Go!

17. ## Re: Proof of the day

> Originally Posted by Dason
> Somebody post a proof. Go!

*YOU* should do one!

18. ## Re: Proof of the day

Nice thread, so I make my debut here: the derivation of the ridge estimator in the linear regression model.

Consider the linear model $y = X\beta + \varepsilon$ with strong correlation patterns among the vectors within the data matrix $X$. The problem with multicollinearity is that single components within the vector of parameters $\beta$ can take absurdly large values. So the general idea is to restrict the length of said vector to a prespecified positive real number. Let this restriction be denoted by $\|\beta\|^2 = c$, where $\|\cdot\|$ is just the Euclidean norm on $\mathbb{R}^p$.

Eventually one faces the restricted least squares problem

$$\min_{\beta \in B} \; \|y - X\beta\|^2 + \lambda\left(\|\beta\|^2 - c\right),$$

where the Lagrange parameter $\lambda$ is assumed to be positive and $B$ is the associated parameter space. The optimization problem is equivalent to

$$\min_{\beta \in B} \; (y - X\beta)'(y - X\beta) + \lambda(\beta'\beta - c).$$

Taking the derivative with respect to $\beta$ yields

$$-2X'y + 2X'X\beta + 2\lambda\beta.$$

This leads to the first order condition (note you can set the hats already due to the fact that the potential minimizers of the problem above are already given as an implicit function)

$$-2X'y + 2X'X\hat\beta + 2\lambda\hat\beta = 0.$$

Arranging terms leads to the modified normal equations

$$(X'X + \lambda I)\hat\beta = X'y.$$

Since $X'X$ is at least positive semidefinite and $\lambda I$ is positive definite, one obtains that*

$$|X'X + \lambda I| \ge |\lambda I| = \lambda^p > 0,$$

so that $X'X + \lambda I$ is an invertible matrix even if the data matrix $X$ is of less than full column rank. This finally yields the ridge estimator in its known form

$$\hat\beta = (X'X + \lambda I)^{-1}X'y.$$

Also, this is the unique global minimizer of the objective above due to the fact that the problem under consideration is just a sum of convex functions and $\hat\beta$ is the only local minimizer, so one doesn't need to check the second order condition and the associated Hessians.

*One can find a good proof for that inequality in Magnus, J.R. & Neudecker, H. (1999). Matrix Differential Calculus. Wiley and Sons on page 227 theorem 28.
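The invertibility point can be illustrated with a tiny example. The sketch below assumes the ridge estimator has its usual form, the solution of $(X'X + \lambda I)\hat\beta = X'y$, and uses a made-up data matrix whose second column is exactly twice the first, so that $X'X$ is singular while $X'X + \lambda I$ is not. The function names and numbers are hypothetical illustration values.

```python
def ridge_system(x_rows, y, lam):
    """Return the entries of X'X + lam*I and X'y for a two-column X."""
    a = sum(r[0] * r[0] for r in x_rows) + lam
    b = sum(r[0] * r[1] for r in x_rows)
    d = sum(r[1] * r[1] for r in x_rows) + lam
    g0 = sum(r[0] * yi for r, yi in zip(x_rows, y))
    g1 = sum(r[1] * yi for r, yi in zip(x_rows, y))
    return a, b, d, g0, g1


def ridge_det(x_rows, y, lam):
    """Determinant of X'X + lam*I."""
    a, b, d, _, _ = ridge_system(x_rows, y, lam)
    return a * d - b * b


def ridge_2d(x_rows, y, lam):
    """Solve the modified normal equations with the explicit 2x2 inverse."""
    a, b, d, g0, g1 = ridge_system(x_rows, y, lam)
    det = a * d - b * b
    return [(d * g0 - b * g1) / det, (a * g1 - b * g0) / det]


# Second column is exactly 2 * first column: perfect collinearity.
x_rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
y = [1.0, 2.0, 3.5]

print(ridge_det(x_rows, y, 0.0))  # X'X alone is singular
print(ridge_det(x_rows, y, 1.0))  # adding lam*I makes it invertible
beta_hat = ridge_2d(x_rows, y, 1.0)
print(beta_hat)
```

With $\lambda = 0$ the determinant is zero and ordinary least squares has no unique solution; any positive $\lambda$ gives a unique $\hat\beta$.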

19. ## The Following 2 Users Say Thank You to René For This Useful Post:

Englund (12-21-2014), spunky (09-17-2014)

20. ## Proof of the day

The Pearson product-moment coefficient of correlation can be interpreted as the cosine of the angle between variable vectors in dimensional space. Here, I will show the relationship between the Pearson and Spearman (rank-based) correlation coefficients for the bivariate normal distribution through the following series:

.

If we let , then

where it follows for ,

, so that

.

This series is uniformly convergent for all values of and for . Hence, integrating with respect to , where gives

.

Suppose that is neither zero nor a multiple of .

Then the series is convergent, and, for , , is positive, monotonic, decreasing and bounded. As such the series:

is therefore uniformly convergent on the interval .

Subsequently letting , then it follows that if is neither nor a multiple of we have

.

Setting and exponentiating gives the relationship (for large sample sizes) between the Pearson and Spearman correlation coefficients as:

for the bivariate normal distribution.
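The final formula did not survive in this post. The standard large-sample relationship for the bivariate normal distribution is $\rho_s = \frac{6}{\pi}\arcsin\left(\frac{\rho}{2}\right)$, equivalently $\rho = 2\sin\left(\frac{\pi \rho_s}{6}\right)$, and the sketch below assumes that this is the relationship meant. It simulates bivariate normal data and compares the sample Spearman correlation (Pearson correlation of the ranks) with the value implied by the formula. The correlation, sample size and function names are made-up illustration choices.

```python
import math
import random


def ranks(values):
    """Ranks 1..n of the values (ties are not expected for continuous data)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def pearson(u, v):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def simulate_bivariate_normal(rho, n, seed=0):
    """Draw n standard bivariate normal pairs with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return xs, ys


def spearman_from_pearson(rho):
    return 6.0 / math.pi * math.asin(rho / 2.0)


def pearson_from_spearman(rho_s):
    return 2.0 * math.sin(math.pi * rho_s / 6.0)


xs, ys = simulate_bivariate_normal(0.6, 20000)
sample_pearson = pearson(xs, ys)
sample_spearman = pearson(ranks(xs), ranks(ys))
print(sample_pearson, sample_spearman, spearman_from_pearson(0.6))
```

The sample Spearman coefficient should land close to the value implied by the formula and slightly below the Pearson coefficient.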

21. ## The Following 2 Users Say Thank You to Dragan For This Useful Post:

Englund (10-04-2015), spunky (02-22-2016)
