What test if no linearity

Jdijkers

New Member
Dear forum members,

I am currently struggling with my research about board members and their influence on strategy. From other articles I concluded that the best way to test this was a hierarchical regression (2 blocks: one with the control variables, the other with the independent variables).

However, from my scatter graphs I do not find any linearity between my variables, e.g. between the percentage of females on the board and strategic change, or between the size of the board and strategic change. A linear relationship could still turn out to exist, but for now: is there any other test I could run, or anything else I can do to continue with this analysis?

(Scatter graphs show me mostly flat lines, so no linearity).

Best Jdijkers

Dason

Can you talk about your design more? How many variables do you have? Are they continuous or categorical?

Jdijkers

New Member
Sure!

I originally had 5 independent variables: %_Caucasian, %_AfrAm, %_Asian, %_Hispanic and %_Female. The idea behind the research (and the literature) is that with higher diversity and/or more females on the board we could see more strategic change (which is a percentage change from last year). Because the ethnic variables are closely tied together (if %_Caucasian is higher, the others are more likely to drop), I combined them into a new variable based on the Gibbs-Martin Index. This variable (GM) is a continuous variable that has a value of 0 if only one ethnic group is represented, and a higher value the more ethnicities are present on the board.

Furthermore I use total board size (how many people are on the board) and age (average age of the board members) as control variables because they are likely to interfere with strategic change.

So basically my model is: Strategic change = GM + %_Female + control variables

%_Female varies from 0 to 1 (continuous) as does the GM variable.
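
For readers unfamiliar with the Gibbs-Martin index mentioned above, here is a minimal sketch, assuming the usual definition of one minus the sum of squared group shares. The function name `gibbs_martin` and the example boards are hypothetical, not taken from the actual dataset.

```python
def gibbs_martin(shares):
    """Gibbs-Martin diversity index: 1 - sum of squared group proportions."""
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    # Normalise so the shares behave as proportions that sum to 1.
    proportions = [s / total for s in shares]
    return 1.0 - sum(p * p for p in proportions)

# Hypothetical boards: shares of (Caucasian, AfrAm, Asian, Hispanic).
homogeneous = gibbs_martin([1.0, 0.0, 0.0, 0.0])    # one group only
mixed = gibbs_martin([0.25, 0.25, 0.25, 0.25])      # four equal groups
print(homogeneous, mixed)  # prints 0.0 0.75
```

Under this definition, with four groups the index cannot exceed 0.75, so it would not quite reach 1 in practice.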

noetsi

Fortran must die
Box-Tidwell is a test of linearity in logistic regression - I assume it works for linear regression as well. For each continuous variable you think is non-linear, add to the model an interaction term: the variable times its own natural log. If that interaction term is significant, it suggests non-linearity. Note you should divide the nominal alpha by the number of terms in the model (including the intercept) when you do this (family-wise error becomes an issue). So if you are testing 5 IVs and alpha was .05, you would have 5 predictors, 5 interaction terms and the intercept, and you would divide .05/11 to get the level necessary to show significance.

There is a wide range of transformations used to convert non-linear data to linear. Unfortunately, the exact transformation to use is not always obvious. I think, although I am not sure, that Tukey's ladder of powers will address linearity (that is, suggest how to transform the data).
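
The Box-Tidwell idea above could be sketched roughly as follows. This only builds the extra x * ln(x) term and the adjusted alpha; it does not fit the model. The helper names and the example values are hypothetical.

```python
import math

def box_tidwell_term(x):
    """Return x * ln(x), the extra term added to the model for predictor x."""
    if x <= 0:
        # ln(x) is undefined at 0, so zero-valued predictors need special care.
        raise ValueError("Box-Tidwell needs strictly positive values")
    return x * math.log(x)

def adjusted_alpha(alpha, n_predictors):
    """Divide alpha by: original predictors + their log terms + the intercept."""
    n_terms = 2 * n_predictors + 1
    return alpha / n_terms

pct_female = [0.10, 0.25, 0.50]            # hypothetical, strictly positive values
terms = [box_tidwell_term(x) for x in pct_female]
threshold = adjusted_alpha(0.05, 5)        # 0.05 / 11
print(terms, threshold)
```

One caveat: predictors with many exact zeros (such as boards with no female members) cannot be logged directly, so the term would need some adjustment before this check applies.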

Jdijkers

New Member
Unfortunately this goes somewhat beyond my statistics ability. But I will look into your comments and look up how to do this. Thank you.

noetsi

Fortran must die
I modified my comments in one regard. The test I offered is geared to logistic regression - I assume it works for linear regression as well. If you are interested you should look up Box-Tidwell.

Dason

I think noetsi's advice is a little premature. When building a model with multiple predictors just looking at the bivariate relationship between your predictors and the response isn't always useful.

Did you try fitting the full model and looking at the residual by fitted plot?
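
A minimal sketch of what fitting the full model and getting residuals and fitted values might look like, using only the standard library. The data are made up for illustration, and the variable names (`rows`, `ols`, `solve`) are hypothetical; a statistics package would normally do this step.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares fit of y on all predictors at once (intercept added)."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return beta, fitted, residuals

random.seed(0)
# Made-up boards: (GM, pct_female, board_size, avg_age). Not real data.
rows = [(random.uniform(0, 0.75), random.uniform(0, 0.5),
         random.randint(5, 15), random.uniform(45, 65)) for _ in range(60)]
y = [0.2 + 0.1 * gm + 0.05 * fem + random.gauss(0, 0.1)
     for gm, fem, size, age in rows]

beta, fitted, residuals = ols(rows, y)
# Plot residuals (vertical axis) against fitted (horizontal axis) and look
# for curvature or a funnel shape rather than a patternless band.
```

The point of using all predictors simultaneously is that the residual-by-fitted plot reflects the model as a whole, which pairwise scatter graphs cannot show.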

noetsi

Fortran must die
I don't understand why you would not want to know whether a specific variable is non-linear as a predictor. Are you saying that you should first look for non-linearity in the model as a whole?

Jdijkers

New Member
I am very thankful for your fast responses; unfortunately, I am not that educated in terms of statistics/regressions and only know some basics. I made the residual/fitted plots with two variables (%_Female and GM ethnic). I suppose the pattern has something to do with the fact that my independent variables have many 0 values (a lot of boards are not diversified and/or only have male members).

Dason

I was asking for the full model where you have all of the predictors in the model simultaneously. But these are illuminating as well - why exactly do you think you don't have linearity? Things look fine to me.

You might not have a strong effect (we can't tell that from just the residual plots) but I don't see a problem with the assumption of linearity.

Jdijkers

New Member
As far as I knew (from sources on the internet), linearity is first checked by creating a scatter graph of the independent and dependent variable, as shown below:

As I understand it, you can't always judge this visually. That's why I ran some regressions with my variables. All my variables (no matter what I try) give p-values well above 0.500. That's why I was doubting the linearity. From this I could then only conclude that the independent variables have little to no relation to the dependent variable.

Edit:

This would be the full model (two independent variables; %_female and GM ethnic) on the y-variable: Strategic change (entropy):

Dason

So it sounds like it's a terminology issue more than anything. You don't have an issue with linearity. That would be if the actual relationship between the predictor and the response looks more like a quadratic curve than a straight line - your issue is that you don't have statistical evidence that the predictor is useful in predicting the response.
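
To make the distinction concrete, here is a rough illustration with a single predictor x and response y (the symbols are placeholders, not the thread's actual variables):

```latex
% Linear, but with a weak or absent effect: the straight-line form is fine,
% there is just little evidence that the slope differs from zero.
y = \beta_0 + \beta_1 x + \varepsilon, \qquad \beta_1 \approx 0

% A linearity problem: the true relationship has curvature that a straight
% line cannot capture, for example a quadratic term.
y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon, \qquad \beta_2 \neq 0
```

A flat scatter with large p-values points to the first case, not the second.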

Jdijkers

New Member
Just to clarify: in the first graph in my last post, do you 'see' linearity? As in, would you assume linearity from a graph like that? It's definitely not a quadratic curve, but calling it 'straight' would be far-fetched, right?

GretaGarbo

Human
Jdijkers said: "I originally had 5 independent variables: %_Caucasian, %_AfrAm, %_Asian, %_Hispanic and %_Female."
In what country are you doing this investigation?

Dason

Jdijkers said: "Just to clarify: The First graph in my last post, do you 'see' linearity. As in; would you assume linearity from a graph like that. Because it's definitely not a quadratic curve. But 'straight' would be far-fetched to say right?"
I honestly do not see a problem with linearity. Why do you think 'straight' would be far-fetched? There's certainly variation (and possibly an outlier that might be interesting to examine) but I think it would be fine to assume linearity here.

Dason

I have no idea what you're talking about. It says 'do not' and has always said 'do not'. That edit I made was ... to clean up formatting. Yeah that's it.

noetsi

Fortran must die
Apparently I was in error. I read it as saying 'do' several times. Sorry about my mistake.