# Likert scale analysis - ordinal or interval??

#### indie92

##### New Member
Hi, I would appreciate some help so much as I feel I am going round in circles with every journal article I read!

So I am currently trying to analyse the survey responses for my dissertation, which are based on a 5-point Likert scale.
I gave surveys to two groups, a control group and an intervention group. The general aim is to see whether the intervention group has a more positive attitude. The survey had 6 questions: 4 positively worded and 2 negatively worded.

I have coded my data, reverse-coded the negative items, and entered it into SPSS.

Now this is my confusion. Based on what I have read, I am well aware of the controversy surrounding ordinal vs interval for Likert data. But I feel that my data may be best treated as interval, as I am trying to determine a general opinion or satisfaction from a series of questions, rather than one concrete thing.
So I have calculated a total score for each participant, i.e. 30 = very positive attitude, 6 = very negative attitude.
My plan now was to run some descriptive statistics comparing frequency and means. Then move on to some parametric tests, though I am unsure yet which is best to use.

Am I on the right lines with this method??
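As a rough illustration of the scoring step described in this post, here is a minimal Python sketch. The responses are made up, and the positions of the two negatively worded items (`NEGATIVE_ITEMS`) are an assumption for the example, not something stated in the thread.

```python
from statistics import mean, stdev

N_POINTS = 5             # 5-point Likert scale, coded 1..5
NEGATIVE_ITEMS = {4, 5}  # hypothetical: zero-based positions of the 2 negative items


def reverse_code(score, n_points=N_POINTS):
    """Map 1..n_points onto n_points..1 so that a high score always means 'positive'."""
    return n_points + 1 - score


def total_score(responses):
    """Sum the six item responses after reverse-coding the negative items."""
    return sum(
        reverse_code(r) if i in NEGATIVE_ITEMS else r
        for i, r in enumerate(responses)
    )


# Made-up raw responses: one list of six answers per participant.
control = [[3, 2, 3, 3, 3, 4], [2, 3, 2, 3, 4, 4], [4, 3, 3, 4, 2, 3]]
intervention = [[5, 4, 4, 5, 1, 2], [4, 4, 5, 4, 2, 1], [5, 5, 4, 4, 1, 1]]

control_totals = [total_score(p) for p in control]
intervention_totals = [total_score(p) for p in intervention]

for name, totals in [("control", control_totals), ("intervention", intervention_totals)]:
    print(name, totals, "mean =", round(mean(totals), 2), "sd =", round(stdev(totals), 2))
```

With this coding the total can range from 6 (most negative) to 30 (most positive), matching the description above.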

#### noetsi

##### Fortran must die

This issue comes up over and over again. If you have 12 plus distinct levels (which I think you do) and you can reasonably assume that the difference between all levels is the same, then you can consider your data interval [of course there is no agreement on that point; that is what I have decided after reading the literature]. If you can't do that, then you can't calculate a mean for descriptives or run parametric statistics like regression.

#### GretaGarbo

##### Human
> My plan now was to run some descriptive statistics comparing frequency and means. Then move on to some parametric tests,

I would do that too.

I like that you plan to do histograms. (You can also do boxplots and QQ-plots, if you know what those are.)

I would compute means and do parametric tests on a single Likert item.

(As said, this is a controversial issue and, frankly, I would ignore the statements in post 2.)

#### indie92

##### New Member
Thanks for your reply. I am slightly confused by the 12 distinct levels, what exactly is this?

#### indie92

##### New Member
Thanks, so when I run parametric tests do you think I should do them on each individual question or on the overall score I generated for each participant?

#### noetsi

##### Fortran must die
> Thanks for your reply. I am slightly confused by the 12 distinct levels, what exactly is this?

Say you measure something from 1 to 100 in a variable. However many times these values are replicated, there are still only 100 distinct levels. That is what I was talking about. For a dummy variable there are two distinct levels (usually coded 1 and 0). How many such different levels there are is central to what method you can use in practice.
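To make "distinct levels" concrete, here is a small Python sketch with made-up data. It counts the different values a variable takes, however often each is repeated, and also counts how many values a sum of six 5-point items could take.

```python
def distinct_levels(values):
    """Number of different values observed, ignoring how often each occurs."""
    return len(set(values))


dummy = [0, 1, 1, 0, 1, 0, 0, 1]                # made-up dummy variable
likert_item = [1, 2, 5, 3, 3, 4, 2, 5, 1, 4]    # made-up single 5-point item

# A sum of six items, each coded 1..5, can take any whole number from 6 to 30.
possible_totals = range(6 * 1, 6 * 5 + 1)

print(distinct_levels(dummy))        # 2
print(distinct_levels(likert_item))  # 5
print(len(possible_totals))          # 25
```

So a single item has at most 5 levels, while the summed score described in the first post has up to 25.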

#### GretaGarbo

##### Human
> Thanks, so when I run parametric tests do you think I should do them on each individual question or on the overall score I generated for each participant?

I suggest that you do tests on what you are interested in!

Since you seem to be interested in the summed scale, I would do statistical tests on that. But if you are interested in the individual items, then I suggest testing those too. I can't see anything statistically wrong with testing the individual items.
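As a hedged sketch of what a parametric comparison of the two groups could look like, the Python below computes Welch's unequal-variance t statistic and its approximate degrees of freedom. The totals are made up, and Welch's test is just one reasonable choice, not something prescribed in this thread. The p-value would then come from a t table or from software such as SPSS.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up summed scores (possible range 6..30) for each participant.
intervention = [24, 26, 22, 27, 25, 23]
control = [18, 20, 17, 21, 19, 22]

t, df = welch_t(intervention, control)
print(f"t = {t:.2f}, df = {df:.1f}")  # t = 4.63, df = 10.0
```

The same function could be applied to the responses on a single item instead of the summed score, if the individual items are what you are interested in.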

As I am just trying to guide indie92, I suggest ignoring the statement below.

> For a dummy variable there are two distinct levels (usually coded 1 and 0). How many such different levels there are is central to what method you can use in practice.

For me it is perfectly acceptable to take the mean or the sum of noetsi's and my weight, although that is just two "distinct values". It is more a matter of what kind of scale it is.

#### noetsi

##### Fortran must die
Actually GretaGarbo I was commenting on what type of scale in the context of linear regression (which requires interval data for the dependent variable). Comments on what it means to be interval in the context of linear regression focus on how many distinct levels the variable takes on. A rule of thumb is at 12 or more distinct levels you can treat ordinal data as "interval like" and thus run linear regression on it. Obviously this is not the formal definition of interval, it is one based on experience and (I assume) simulations.

A second requirement is that the distances between the levels have to be the same.

#### GretaGarbo

##### Human
> Comments on what it means to be interval in the context of linear regression focus on how many distinct levels the variable takes on. A rule of thumb is at 12 or more distinct levels you can treat ordinal data as "interval like"

(Consider, for example, an ordinal variable on a continuous scale between 0 and 100, like a VAS (Visual Analogue Scale). The continuity means that the scale has "many" levels, but that does not make it an interval scale. Another example is the economists' utility variable, which is continuous and ordinal. (The first derivative exists but not the second.) The continuity does not make it an interval scale or a ratio scale. It is still ordinal, and economists emphasize that.)

#### noetsi

##### Fortran must die
It's of course not mine. It is what I have seen in journals. This is one area where I have not seen any author disagree, and (this came up at my master's defense) my committee agreed with this assessment, as best I can remember (or did not disagree). Three stats professors can't be wrong

It is not a question of whether it is interval in a theoretical sense. It's a question of whether it violates the assumptions of linear regression, so that you have to use an approach like logistic regression instead of linear regression. That is what we are talking about, not the theory of scales. When you have enough levels the method works, when you don't it does not.

That is why they call it "interval like."

#### GretaGarbo

##### Human
> Three stats professors can't be wrong

But your memory can be wrong.

> That is what we are talking about .....

That is what you are talking about, not me.

I am just trying to guide one of our readers.
To take a 5-point Likert scale and compute a mean of that is very natural for most people. And then to compare it with the mean from another group and do a parametric test on that is also very natural to most people, including me.

> When you have enough levels the method works, when you don't it does not.

I just gave you an example of two weights that, in my view, are OK to sum.

#### noetsi

##### Fortran must die
It is what the thread is on, as I understand it. They want to know if they can do parametric tests with Likert data [go back to the first post of the thread]. The answer to that, using regression as an example, is that it depends on the number of distinct levels, not on whether the data is interval or not. If you have two levels of the DV you cannot use linear regression. If you have 12 or so you can.

They were not asking if the data was interval or not. They were asking if they could use parametric tests. Are you saying that you can run a parametric test which has only two possible answers say 1 and 0?

#### CB

##### Super Moderator
> Actually GretaGarbo I was commenting on what type of scale in the context of linear regression (which requires interval data for the dependent variable). Comments on what it means to be interval in the context of linear regression focus on how many distinct levels the variable takes on.

OLS regression does not directly assume that the dependent variable is interval. The assumptions for OLS regression are (refer to our article):

1. Error terms have mean zero (for any combination of predictor values)
2. Error terms are independent
3. Error terms have identical variance (for any combination of values of the predictors)
4. Error terms have a normal distribution

Nothing in there about the measurement level of the data. If we were to be subtle about it, you could say that non-quantitative variables are not going to have relationships that take the form of smooth functions, and consequently assumption 1 will be breached, but that's maybe a longer discussion for another day. The objection to using ordinal data in parametric analyses isn't about assumption breaches, it's about the meaningfulness of the results (see S. S. Stevens, 1946).
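The assumptions above are stated in terms of the error terms, so a natural check is to look at the residuals. The Python sketch below uses made-up integer-valued data and a hand-written simple regression. It only examines assumptions 1 and 3 (residual mean and variance at each predictor value), and says nothing about independence or normality.

```python
from statistics import mean, pvariance


def fit_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    mx, my = mean(x), mean(y)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - slope * mx, slope


def residuals_by_x(x, y):
    """Map each predictor value to (mean, variance) of the residuals observed there."""
    intercept, slope = fit_ols(x, y)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi - (intercept + slope * xi))
    return {xi: (mean(r), pvariance(r)) for xi, r in sorted(groups.items())}


# Made-up data: the DV takes only whole-number values from 1 to 5.
x = [1, 1, 1, 2, 2, 2, 3, 3, 3]
y = [1, 2, 3, 2, 3, 4, 3, 4, 5]

summary = residuals_by_x(x, y)
for xi, (m, v) in summary.items():
    print(f"x = {xi}: residual mean = {m:.3f}, residual variance = {v:.3f}")
```

In this contrived data set the residual mean is zero and the residual variance is the same at every predictor value, even though the DV is discrete.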

> If you have 12 plus distinct levels (which I think you do) and you can reasonably assume that the difference between all levels is the same then you can consider your data interval

As Greta alludes to, you are confusing the issue of whether the data is continuous with whether it is interval. These are two completely different concepts. A variable can be continuous but not interval and vice versa. An "interval" scale is a measurement concept, not a distributional one. Note that OLS regression assumes normal errors, so implicitly assumes a continuous DV. The errors may still be reasonably approximated by a normal distribution if the DV is actually discrete and yet has a large number of possible values. But this does not mean it magically shifts from ordinal to interval. (I realise that there are published claims floating around contradicting this, but they are flat out categorically wrong.)

If it'd be useful for you or anyone else for me to write an explanation of what "interval" means, or why measurement theorists sometimes object to parametric tests with ordinal data, feel free to ask.

#### noetsi

##### Fortran must die
I actually was not commenting on if it's continuous or interval. I was commenting on if you could use it in a specific test. Which is all I really know about [if that]. There are a fair number of articles that discuss how many unique levels you need to run linear regression. This specifically came up at my master's defense.... They discuss this not in the context of the theory of numbers, but practical errors.

Obviously a fair number of people in journals consider interval data to be a requirement for linear regression, thus the term "interval like". It's entirely possible (I have often found this to be the case here) that their usage is wrong. I simply repeat what I read. I am sure that this is tied to the assumption of normality (and possibly equal error variance).

In any case, some parametric tests will not, according to the journals I have read, work with a limited number of unique values. Linear regression is certainly one, which is why logistic regression is used. I think that is what the original poster is actually interested in, not whether their data is formally interval or continuous. It's whether they can use a specific test.

I also note in passing that I had always considered tests such as logistic regression to be non-parametric. But there clearly are disagreements about that, based on my review today.

#### CB

##### Super Moderator
> I actually was not commenting on if it's continuous or interval. I was commenting on if you could use it in a specific test. Which is all I really know about [if that]

Yeah, sure, but to know if you can use it with a specific test you need to understand what the assumptions of that test are and aren't, and why people might object to using a particular kind of data with the test.

#### noetsi

##### Fortran must die
Well the argument made by those who stress specific levels is that in practice, if not theory, it is the number of levels that determines if you violate the assumptions not if the data is truly continuous or not. Or rather you don't violate the assumption if your data is not continuous, but has enough distinct levels.

It is well known that having two levels of a DV violates normality and [commonly] equal error variance. That is why you use logistic rather than linear regression for such data. But at a certain point even data that is not continuous will be close enough to it to not violate the assumptions enough to matter. Or so many say, and I spent a lot of time looking for articles on this since I work with Likert data a lot. It is common to call this type of data "interval like".

I am not wise enough to know if that is true, but I have seen it in enough journals to accept it. If I ever have time to simulate enough data maybe I can test it. But of course these articles are based on simulations anyhow.
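A simulation of the kind mentioned here can be sketched in a few lines of Python. Everything in it is an assumption made for illustration: both groups are drawn from the same latent normal distribution, the latent values are cut into `k` ordered levels at equally spaced cut points, each group has 50 participants, and 1.984 is used as an approximate two-sided 5% critical value for a t distribution with about 98 degrees of freedom. It only estimates the false-positive rate of a Welch t-test under these conditions.

```python
import random
from bisect import bisect
from math import sqrt


def welch_t(a, b):
    """Welch's t statistic; returns 0.0 if both samples have zero variance."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((v - ma) ** 2 for v in a) / (na - 1)
    vb = sum((v - mb) ** 2 for v in b) / (nb - 1)
    se = sqrt(va / na + vb / nb)
    return (ma - mb) / se if se > 0 else 0.0


def false_positive_rate(k, n=50, sims=1000, crit=1.984, seed=1):
    """Share of simulated null comparisons with |t| > crit for a k-level response."""
    rng = random.Random(seed)
    # Assumed cut points: equally spaced between -2 and 2 on the latent scale.
    cuts = [-2 + 4 * j / k for j in range(1, k)]
    rejections = 0
    for _ in range(sims):
        a = [bisect(cuts, rng.gauss(0, 1)) + 1 for _ in range(n)]
        b = [bisect(cuts, rng.gauss(0, 1)) + 1 for _ in range(n)]
        if abs(welch_t(a, b)) > crit:
            rejections += 1
    return rejections / sims


rates = {k: false_positive_rate(k) for k in (2, 5, 12)}
for k, rate in rates.items():
    print(f"{k:>2} levels: estimated false-positive rate = {rate:.3f}")
```

If the rates stay near the nominal 0.05 for every `k`, the number of levels is not what drives false positives in this particular setup. Other sample sizes, skewed response distributions, or a regression with continuous predictors could behave differently and would need their own simulations.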

#### GretaGarbo

##### Human
> Are you saying that you can run a parametric test which has only two possible answers say 1 and 0?

Yes! One such model is logistic regression. If p is the population probability of "1", then the parameters a and b are estimated in the model:

log(p/(1-p)) = a + b*x

Isn't it very clear that there are two unknown parameters (a and b), unknown constants that are to be estimated? And that tests of the parameters are parametric tests. (Usually the model is estimated with ML, maximum likelihood, and then tested with a likelihood ratio test.)
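To illustrate the model above, here is a minimal Python sketch that estimates a and b by maximum likelihood using Newton-Raphson iterations and then forms a likelihood ratio test of b = 0. The data are made up, the fixed iteration count is a simplification, and the p-value relies on the usual large-sample chi-square approximation with 1 degree of freedom.

```python
from math import erfc, exp, log, sqrt


def predict(a, b, xi):
    """Probability of a '1' under log(p/(1-p)) = a + b*x."""
    return 1.0 / (1.0 + exp(-(a + b * xi)))


def log_likelihood(a, b, x, y):
    """Bernoulli log-likelihood of the 0/1 outcomes y given predictor x."""
    return sum(
        yi * log(predict(a, b, xi)) + (1 - yi) * log(1.0 - predict(a, b, xi))
        for xi, yi in zip(x, y)
    )


def fit_logistic(x, y, iterations=25):
    """Maximum likelihood estimates of a and b via Newton-Raphson."""
    a = b = 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, y):
            p = predict(a, b, xi)
            w = p * (1.0 - p)
            g_a += yi - p            # score for a
            g_b += (yi - p) * xi     # score for b
            h_aa += w                # information matrix entries
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


# Made-up data: a 0/1 outcome observed at ten predictor values.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a_hat, b_hat = fit_logistic(x, y)

# Null model with b = 0: the ML estimate of a is the log-odds of the overall proportion.
p_bar = sum(y) / len(y)
a_null = log(p_bar / (1.0 - p_bar))

lr_stat = 2.0 * (log_likelihood(a_hat, b_hat, x, y) - log_likelihood(a_null, 0.0, x, y))
p_value = erfc(sqrt(lr_stat / 2.0))  # upper tail of chi-square with 1 df

print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, LR = {lr_stat:.3f}, p = {p_value:.4f}")
```

The outcome takes only the values 0 and 1, yet the test concerns the parameters a and b of an explicit probability model.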

I have already given the example of the weight of two persons. I would say that it makes sense to compute the mean or the sum of two persons' weights when they are to enter an old-fashioned elevator where there is a maximum weight. But maybe noetsi thinks of such a measure as totally meaningless, since it is based on just two distinct levels?

Yes, the original poster asks if it is OK to use parametric tests. I have said yes to that question.

#### CB

##### Super Moderator
> Well the argument made by those who stress specific levels is that in practice, if not theory, it is the number of levels that determines if you violate the assumptions not if the data is truly continuous or not.

Ok, this is reasonable, though I'd change that to "if you violate the assumptions to a degree sufficient to cause substantial problems".

> But at a certain point even data that is not continuous will be close enough to it to not violate the assumptions enough to matter. Or so many say, and I spent a lot of time looking for articles on this since I work with Likert data a lot. It is common to call this type of data "interval like"

Again, it might be common for people to call this data "interval like", but this is simply wrong. This isn't really a matter of legitimate debate - it's an incorrect statement based on people not understanding what "interval" means. Whether a variable is continuous (or has enough levels to reasonably be treated as continuous) and whether or not it is interval are two separate issues.

(I agree overall that using parametric tests with Likert data is probably ok, but misusing the term "interval" here just adds to the gigantic confusion about this issue, so please just be careful how you use the word).

#### GretaGarbo

##### Human
> > Well the argument made by those who stress specific levels is that in practice, if not theory, it is the number of levels that determines if you violate the assumptions not if the data is truly continuous or not.
>
> Ok, this is reasonable, though I'd change that to "if you violate the assumptions to a degree sufficient to cause substantial problems".

But I don't think that it is reasonable.
(And I don't think that C-Bear really thinks that it is reasonable. I think he just let it pass in the flow of discussion.)

Consider this example: Suppose I construct an interview question with 12 levels (distinct points), where the first 10 are like: totally extremely dis-satisfied, very extremely dis-satisfied, very much dis-satisfied.....
then on the 11th and 12th I switch to: very much satisfied, extremely satisfied.

Now I put numbers on this: 1, 2, 3, .... 10, 11, 12.
The first 10 numbers would be about the dis-satisfied and 11 and 12 about the satisfied.

That will have 12 distinct levels. So according to Noetsi and "a fair number of articles" that will be OK to use regression on.

I would say that it is not the number of levels that matters. It is whether the numbers correspond to the verbal phrasing.

But on a Likert item I would find it OK to compute parameters like the mean and standard deviation and to do parametric tests.

#### CB

##### Super Moderator
> noetsi said:
> > Well the argument made by those who stress specific levels is that in practice, if not theory, it is the number of levels that determines if you violate the assumptions not if the data is truly continuous or not.
>
> But I don't think that it is reasonable.
> (And I don't think that C-Bear really thinks that it is reasonable. I think he just let it pass in the flow of discussion.)

Ha. Ok ok. More honestly, I would say that:
1. The assumption of continuous normally distributed errors is obviously violated with a Likert DV regardless of the number of levels
2. But it is reasonable to suggest that the number of levels - along with the sample size - is an important determinant of whether the sampling distribution of the coefficients takes something close to the assumed normal shape
3. Yet, as usual, there are more important things to worry about than the normal-errors assumption
4. The case that Greta mentions is one of them: An odd choice of response-option labelling could lead to an especially marked case of non-linear relationship between the (hypothesised) underlying latent variable "Satisfaction" and the observed item responses.
5. This might in turn lead to the predictors having a non-linear relationship with the observed DV (even if the true relationship between the latent variables we actually wished to measure was linear). This would likely lead to a breach of the assumption that the error terms have mean zero for any combination of predictor values, and lead to biased estimates.
6. Related question to ponder: Can we actually obviate this problem by using better option wording, or will it always occur with Likert data?