Bounds for R2 in OLS regression?


    From the big dataset I'm working with, my boss is somewhat curious (read: skeptical) as to why fitting a multiple linear regression with 189 predictors gives a meagre R2 of about 0.25. After exploring the variables a little, I can see that the dependent variable is bounded between 1 and 10 and that a huge majority of the data (say around 80%) is concentrated at the upper end: most people mark answers 8, 9 or 10 (mostly 9 or 10, actually), so the variable is severely skewed. This is a generic rating scale on the well-being of children, so it's not unusual to see that the vast majority of children are doing well or extremely well. The variance, as you can expect, is pretty low (about 1.5 for the dependent variable).

    I also notice that almost all of the many, many predictors are severely skewed because they ask questions about depression, anxiety, child poverty, etc., so you expect most children to score either very low or very high on those measures.

    My proposed solution:

    I remember learning about the Frechet bounds when I was doing my MA and how the skewness/kurtosis of the data places bounds on the attainable range of the Pearson correlation, so that non-normally distributed variables may not reach a correlation beyond a certain bound. Since the multiple correlation coefficient R is some version of cor(y,y-hat), would it make sense for me to use these bounds to demonstrate that R2 will never be able to get beyond a certain upper bound, given the extreme skewness of the data? Would this logic be sound?
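
    The bound idea above can be sketched numerically. This is only an illustration, not a worked answer: it uses the standard fact that, for fixed marginals, the Pearson correlation is largest when both variables are sorted the same way (comonotonic pairing) and smallest when one is reversed. The function name frechet_corr_bounds and the simulated data (a rating with about 80% of its mass on 8-10 and an exponential predictor) are made up for the example. Note that it bounds the correlation between y and one variable with a given marginal; y-hat is a linear combination of all predictors, so its marginal is not that of any single predictor.

```python
import numpy as np


def frechet_corr_bounds(x, y):
    """Empirical Frechet-Hoeffding bounds on Pearson's r.

    Keeps the two marginal samples fixed and only changes how they are
    paired. x and y must have the same length.
    """
    xs = np.sort(np.asarray(x, dtype=float))
    ys = np.sort(np.asarray(y, dtype=float))
    upper = np.corrcoef(xs, ys)[0, 1]        # comonotonic pairing: maximum r
    lower = np.corrcoef(xs, ys[::-1])[0, 1]  # countermonotonic pairing: minimum r
    return lower, upper


# Hypothetical data, NOT the poster's dataset.
rng = np.random.default_rng(0)
n = 5000

# Rating on 1..10 with roughly 80% of responses at 8, 9 or 10 (assumed shape).
levels = np.arange(1, 11)
probs = [0.01, 0.01, 0.01, 0.02, 0.03, 0.05, 0.07, 0.15, 0.30, 0.35]
y = rng.choice(levels, size=n, p=probs)

# A right-skewed predictor (assumed shape).
x = rng.exponential(scale=1.0, size=n)

lower, upper = frechet_corr_bounds(x, y)
print(f"attainable r for these marginals: [{lower:.3f}, {upper:.3f}]")
print(f"implied ceiling on r^2 with this single variable: {max(lower**2, upper**2):.3f}")
```

    Running the same function on y and the actual fitted values would show how much room the marginals of y and y-hat leave, but a ceiling below 1 there is a property of that particular y-hat, not of the regression in general.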
    Re: Bounds for R2 in OLS regression?

    Is there any other issue since you are using ordinal data?
