- Thread starter Danica Blanche
- Tags: alpha, scale development

What is an acceptable alpha for a 3-item scale meant only for descriptive purposes? Would, say, .45 suffice?

Care to expand? You're certainly much more knowledgeable about psychometrics than I am.

I was wondering whether an alpha level normally deemed unacceptable (<.5) could be acceptable in this case, and, if so, how to determine an acceptable alpha level taking the number of items into account.
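For concreteness, alpha can be computed straight from the raw item responses with the standard formula, (k/(k−1)) × (1 − Σ item variances / variance of the sum score). A quick Python sketch with made-up toy data (not anyone's real scale):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# toy data: 6 respondents answering a hypothetical 3-item scale
scores = np.array([
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
    [5, 4, 5],
    [2, 2, 3],
])
print(round(cronbach_alpha(scores), 2))  # -> 0.96 for this toy data
```

With only 3 items, the result is driven entirely by the three inter-item correlations, so it's worth looking at those directly too.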

Alpha is related to the number of items, in that as you increase the number of items, alpha increases.
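You can see this directly from the standardized-alpha (Spearman-Brown) formula: holding the average inter-item correlation fixed, alpha climbs purely as a function of the number of items k. A small sketch (the r = .2 value is just an illustrative assumption):

```python
# Standardized alpha for k parallel items with a fixed inter-item
# correlation r: alpha = k*r / (1 + (k - 1)*r)  (Spearman-Brown form).
def standardized_alpha(k, r):
    return k * r / (1 + (k - 1) * r)

for k in (3, 6, 10, 20):
    print(k, round(standardized_alpha(k, 0.2), 2))
# alpha is about .43, .60, .71, .83 for k = 3, 6, 10, 20:
# the same item quality yields a much higher alpha on a longer test.
```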

An alpha of .45 means that only 45% of the variance in scores is explained by variation in the variable you are measuring (i.e., true score variance); the other 55% is error. The problem with low reliability is that it damages the validity of the measurement and attenuates the relationships you are studying (makes them closer to 0).
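The attenuation effect has a simple closed form: the observed correlation is the true correlation scaled by the square root of the product of the two measures' reliabilities. A sketch using the .45 figure from the question (the true correlation of .50 is just an assumed example):

```python
import math

# Classical attenuation formula: observed r = true r * sqrt(rel_x * rel_y)
def attenuated(true_r, rel_x, rel_y):
    return true_r * math.sqrt(rel_x * rel_y)

# a true correlation of .50, measured with two alpha = .45 scales,
# shrinks to an observed correlation of about .23
print(attenuated(0.50, 0.45, 0.45))
```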

The rule of thumb is that you want an alpha of AT LEAST .70. Ideally you want something around .90.

> Alpha is related to the number of items, in that as you increase the number of items, alpha increases.

so... here's the problem. although i think you raise a valid point in inquiring about the reliability of small scales, i don't think it's very meaningful to talk about alpha in cases like this. 3 items are just barely enough to tap into your construct of interest (from a Structural Equation Modelling point of view), so i'm siding with trinker on this one: you should be cautious.

HOWEVER... now you've left me wondering. factor loadings of 0.99 with a decent sample size of 100 and the most alpha gets to is 0.5?? maybe you are right. maybe studying further the reliability of small scales is worthwhile.

i'll add it on my to-do list, heh.

1) Do we really care about "true scores"? Despite the name, a person's true score is just the expected value of their observed scores over hypothetical repeated administrations of the same test; it is not their "real" standing on the construct, and it carries along any systematic bias in the measure.

2) Even if we did care about true scores and true score variance, alpha is not a very good estimate of the proportion of true score variance anyway, since it assumes (usually unrealistically) that the measure is essentially tau-equivalent, with no correlated measurement error across items.
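To illustrate the tau-equivalence point with a population-level sketch (the loadings below are hypothetical, not from any real scale): when loadings are unequal, alpha sits below the sum score's actual reliability (coefficient omega), computed here analytically from the implied covariance matrix.

```python
import numpy as np

# Congeneric one-factor model: unequal loadings, unit-variance items.
loadings = np.array([0.9, 0.5, 0.3])  # hypothetical, unequal loadings
errors = 1 - loadings**2              # error variances so each item has variance 1
sigma = np.outer(loadings, loadings) + np.diag(errors)  # population covariance matrix

k = len(loadings)
# Alpha from the population covariances (no sampling noise involved).
alpha = k / (k - 1) * (1 - np.trace(sigma) / sigma.sum())
# Omega: true-score variance of the sum, over total variance of the sum.
omega = loadings.sum() ** 2 / sigma.sum()
print(round(alpha, 3), round(omega, 3))  # alpha ~ .551 understates omega ~ .610
```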

Useful article: On the Use, the Misuse, and the Very Limited Usefulness of Cronbach’s Alpha

People tend to always report and worry about alpha because (a) it doesn't require you to obtain any information beyond responses to the test itself, and (b) it's available in SPSS. But there are plenty of other ways to assess the reliability and validity of a test. So maybe the solution for the OP is to consider the psychometric quality of the test from other perspectives. E.g., can you show evidence for content validity? Convergent and discriminant validity? What have other studies found with this test? Etc.

> The problem with low reliability is that it damages the validity of the measurement and attenuates relationship you are studying (makes them closer to 0).

> But there are plenty of other ways to assess the reliability and validity of a test.

i do get why from a theoretical standpoint one would not talk much about reliability when there's such a limited number of items on a scale or subscale. reliability indices are, after all, trying to get to this idea of consistency over repeated measurements and if you only have 3 measurement instances then how consistent could your answers be? nevertheless, there are times where all you have are... well... 3 items and that's it, lol.

this seems like a common-enough problem that someone would've done some research on it but i can't seem to find anything

The opposite can occur if correlated measurement error across variables is present. There is a simple simulation showing that in this article by TalkStatters.
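A population-level sketch of that direction of bias (hypothetical numbers, not the simulation from the article): adding an error covariance between two items pushes alpha above the sum score's true reliability.

```python
import numpy as np

# One-factor model with equal loadings, plus a correlated error term.
loadings = np.full(3, 0.5)
sigma = np.outer(loadings, loadings) + np.diag(1 - loadings**2)
sigma[0, 1] += 0.3  # correlated measurement error between items 1 and 2
sigma[1, 0] += 0.3

k = 3
alpha = k / (k - 1) * (1 - np.trace(sigma) / sigma.sum())
true_rel = loadings.sum() ** 2 / sigma.sum()  # true-score share of sum variance
print(round(alpha, 3), round(true_rel, 3))
# alpha ~ .618 overstates the true reliability of ~ .441, because the
# error covariance is counted as if it were true-score covariance.
```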

ALTHOUGH i do give you brownie points for your shameless self-promotion. i like that

> i do get why from a theoretical standpoint one would not talk much about reliability when there's such a limited number of items on a scale or subscale. reliability indices are, after all, trying to get to this idea of consistency over repeated measurements and if you only have 3 measurement instances then how consistent could your answers be?

> nevertheless, there are times where all you have are... well... 3 items and that's it, lol.


I think that's exactly it, really. Low reliability estimates for short tests probably aren't so much a reflection of a problem with the reliability estimate we use as something that's pretty much built into how we define reliability.

Personally I'd kinda edge toward saying that validity is what really matters, so what is the evidence for validity like?

and i think now we're, once again, at the point where there are as many definitions of validity as authors out there, and they do not necessarily encompass each other. how are we going to "edge" towards validity if we don't even know what it is? :/

> how are we going to "edge" towards validity if we don't even know what it is? :/

(But to be fair even if we magically all agree on what validity is, how do we know whether we have it grasped in our sticky little fingers?)