Significant p-value, but 95% CI includes 0

grey

New Member
#1
Hello everyone,

I have the following problem:
The result of a Pearson correlation (sample size = 177) was significant, r = .127, p = .048 (one-tailed).
However, the 95% confidence interval includes zero, 95% CI [-.02, .27].

How is it possible to have a significant p, when the CI includes zero?

I'll be grateful for any help.
Thanks for your consideration.
 

maartenbuis

TS Contributor
#2
That is because a confidence interval is typically two-tailed, and your test is one-tailed. What you have found is that a one-tailed test has more power than a two-tailed test. However, it is now up to you to substantively justify why a one-tailed test is appropriate for your problem.
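To see it in numbers - a quick sketch in R, using the r = .127 and N = 177 from your post (the exact p will differ slightly because r is rounded):

Code:
# t statistic for testing H0: rho = 0, with df = N - 2
r <- 0.127; N <- 177; df <- N - 2
t_obs <- r * sqrt(df) / sqrt(1 - r^2)  # roughly 1.69

qt(0.95, df)        # one-tailed 5% critical value (~1.65): t_obs exceeds it
qt(0.975, df)       # two-tailed 5% critical value (~1.97): t_obs does not
1 - pt(t_obs, df)   # one-tailed p, close to the reported .048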
 

TheEcologist

Global Moderator
#3
Hello everyone,

I have the following problem:
The result of a Pearson correlation (sample size = 177) was significant, r = .127, p = .048 (one-tailed).
However, the 95% confidence interval includes zero, 95% CI [-.02, .27].

How is it possible to have a significant p, when the CI includes zero?

I'll be grateful for any help.
Thanks for your consideration.
The reason for this is the "one-tailed" comment behind your significance value.
With a one-tailed test the p-value is compared against a critical value in only the one direction of interest, and that critical value is a bit closer to 0 than the two-sided one, so significance is easier to reach. Your CI, in contrast, is calculated with regard to both directions, so the interval is wider and can still include zero.

Look at this page for details:
http://en.wikipedia.org/wiki/One-_and_two-tailed_tests
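If you want to reproduce the situation yourself, here is a small R sketch with simulated data forced to have exactly r = .127 (via MASS::mvrnorm with empirical = TRUE); cor.test then shows the one-tailed test and the two-sided interval side by side:

Code:
library(MASS)  # for mvrnorm

set.seed(1)
Sigma <- matrix(c(1, 0.127, 0.127, 1), nrow = 2)
xy <- mvrnorm(n = 177, mu = c(0, 0), Sigma = Sigma, empirical = TRUE)

cor.test(xy[, 1], xy[, 2], alternative = "greater")    # one-tailed: p just below .05
cor.test(xy[, 1], xy[, 2], alternative = "two.sided")  # 95% CI includes 0, p about twice as large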
 

grey

New Member
#4
Many thanks to both of you for your prompt replies! I see the point about one- vs. two-tailed testing now :yup:

This raises a follow-up question: When I report 95% CIs (normally for effect size measures such as r) and the p-value is reported one-tailed (due to a directional hypothesis), is it legitimate to report the confidence interval in that manner (as I did in the example above) at all? Or is it nonsense because it mixes up one- and two-tailed tests?

Many thanks in advance for your help.
 

Karabiner

TS Contributor
#5
Simply don't perform one-sided tests. That your working hypothesis is directional doesn't justify willingly blinding yourself to possible results in the direction opposite to the one you assumed.

With kind regards

K.
 

TheEcologist

Global Moderator
#6
Simply don't perform one-sided tests. That your working hypothesis is directional doesn't justify willingly blinding yourself to possible results in the direction opposite to the one you assumed.
There are many, many instances where the other direction is of no interest whatsoever. Researchers should decide for themselves, based on logic and knowledge of the system, whether a one-sided hypothesis is valid. Saying that one should never conduct a one-sided test is just another way of not thinking about your problem, albeit a more conservative one.
 

grey

New Member
#7
Thank you both for your input. Karabiner, I have just started thinking about this question myself - whether or not two-tailed tests should be used as the standard - and I will surely continue to do so.

For the work at hand, however, I don't want to change the procedure. I comment on the effect by emphasizing its effect size, which is too small to be of relevance (especially in the context at hand) anyway.

But the question remains, at least for me: Is it justified to report the normal 95% CIs when testing one-tailed, or is it just nonsense? (And if so, can you think of an alternative approach, apart from testing two-tailed? ;) )
 
#8
Is it justified to report the normal 95% CIs when testing one-tailed
No, it is not justified to use the usual 95% CI, which gives both an upper and a lower confidence limit.

But of course you can have a 95% confidence level on a one-sided interval.

For the work at hand, however, I don't want to change the procedure. I comment on the effect by emphasizing its effect size, which is too small to be of relevance (especially in the context at hand) anyway.
If you only use a one-sided test (a one-sided confidence interval), the interval would go from (something like) +0.02 up to 1.0, since high values are never rejected. Does it look reasonable to present it like that?
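For concreteness, a minimal sketch of such a one-sided 95% interval in R, computed from r and N via the usual Fisher z transformation (an assumption on my part - the SPSS syntax or ESCI may do it differently):

Code:
# one-sided 95% lower confidence limit for r via the Fisher z transformation
r <- 0.127; N <- 177
z  <- atanh(r)                        # Fisher z of r
se <- 1 / sqrt(N - 3)                 # standard error of z
lower <- tanh(z - qnorm(0.95) * se)   # just above 0, consistent with the one-tailed p = .048
c(lower, 1)                           # one-sided 95% CI: (lower, 1]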
 

grey

New Member
#9
If you only use a one-sided test (a one-sided confidence interval), the interval would go from (something like) +0.02 up to 1.0, since high values are never rejected. Does it look reasonable to present it like that?
I'm afraid I don't quite get that point yet. What you're saying is that the confidence interval should be "open" to one side, right? But what I intend to achieve when calculating confidence intervals (in most cases for effect sizes) is to get a better idea of the range the effect size probably lies in (e.g. are only medium effects covered, or does it range from small effects up to large ones?). I feel this is achieved better with both an upper and a lower boundary. Apart from this, the "problem" remains that the CI still includes zero. I first thought the confidence level should be increased, e.g. to 97.5% or so, but then I became aware that this would broaden the range even further... But I'm very interested in better understanding the point you make; perhaps you could elaborate on it? I'd be very grateful :)
 

spunky

Doesn't actually exist
#10
But what I intend to achieve when calculating confidence intervals (in most cases for effect sizes) is to get a better idea of the range the effect size probably lies in (e.g. are only medium effects covered, or does it range from small effects up to large ones?).
if this is your immediate goal then you're not really doing it right.

the regular, easy-cheezy good ol' confidence intervals we all know and love come from a central distribution under the null hypothesis. effect sizes do not follow a central distribution, they follow a non-central distribution. for example, Cohen's d follows a non-central t-distribution and, therefore, building a confidence interval around Cohen's d requires that distribution.

it's usually just a matter of converting the effect size statistic into a non-centrality parameter and finding a confidence interval on that non-centrality parameter... so not too horrible for simple effect sizes (but a lot more cumbersome for complicated stuff).
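just to make the recipe concrete, here's a rough sketch for Cohen's d using the MBESS package (the numbers are made up for illustration, and i'm writing the argument/output names from memory, so check them against the MBESS docs):

Code:
library(MBESS)

# made-up example: d = 0.5 observed in two groups of n1 = n2 = 50
d <- 0.5; n1 <- 50; n2 <- 50

# step 1: convert the effect size into the observed non-centrality parameter (the t statistic)
ncp <- d / sqrt(1/n1 + 1/n2)

# step 2: confidence limits on the non-centrality parameter of the non-central t
lims <- conf.limits.nct(ncp = ncp, df = n1 + n2 - 2, conf.level = 0.95)

# step 3: convert those limits back to the d scale
c(lims$Lower.Limit, lims$Upper.Limit) * sqrt(1/n1 + 1/n2)

# ci.smd wraps these steps in one call
ci.smd(smd = d, n.1 = n1, n.2 = n2, conf.level = 0.95)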
 
#11
Thanks a lot for this info, spunky! I found an article on the topic, which helped me to (somewhat) better understand the point you made (for others who may be interested too: http://www.uvm.edu/~dhowell/methods7/Supplements/Confidence Intervals on Effect Size.pdf).

So, for CIs for means, central distributions are used. Cohen's d, as it is an effect size, should follow a non-central distribution. But why is that the case? Are there any rules for when something follows a central or a non-central distribution? (Of course, I'm especially interested in Pearson's r as an effect size measure ;) )


P.S. I've just checked the results for the 95% CIs and got the same results as with the syntax I'd used before - it seems as if it took the non-central distribution into account.
 

spunky

Doesn't actually exist
#12
So, for CIs for means, central distributions are used. Cohen's d, as it is an effect size, should follow a non-central distribution. But why is that the case? Are there any rules for when something follows a central or a non-central distribution? (Of course, I'm especially interested in Pearson's r as an effect size measure ;) )
ALL confidence intervals that you know and use in regular hypothesis testing rely on central distributions. in other words, they assume that the effect size = 0 in the population. usually, we don't use effect sizes because we think they're zero in the population, so you always use the non-central distribution for confidence intervals for effect sizes (unless you assume them to be 0).


P.S. I've just checked the results for the 95% CIs and got the same results as with the syntax I'd used before - it seems as if it took the non-central distribution into account.
uhm... weird. how are you obtaining the non-centrality parameter?
 
#13
ALL confidence intervals that you know and use in regular hypothesis testing rely on central distributions. in other words, they assume that the effect size = 0 in the population. usually, we don't use effect sizes because we think they're zero in the population, so you always use the non-central distribution for confidence intervals for effect sizes (unless you assume them to be 0).
Okay, so: When I use Pearson's r as a regular correlation coefficient to test a hypothesis, I use the central distribution (which is what SPSS normally does). If, however, the coefficient was significant and I want to say something about the CI of r as an effect size, I use the non-central distribution. Is that right?


uhm... weird. how are you obtaining the non-centrality parameter?
For the first calculations I used an SPSS syntax by Andy Field (if you want, I can post it). For checking the results, I used ESCI (Exploratory Software for Confidence Intervals), which can be downloaded from this website: http://www.latrobe.edu.au/psy/research/cognitive-and-developmental-psychology/esci
I assumed that this sheet works with the non-central distribution, as the whole thing is about that - as far as I understood it...
I've ordered the book from the library and hope it will help me better understand what I am doing, or what I am supposed to do ;)
 

spunky

Doesn't actually exist
#14
Okay, so: When I use Pearson's r as a regular correlation coefficient to test a hypothesis, I use the central distribution (which is what SPSS normally does). If, however, the coefficient was significant and I want to say something about the CI of r as an effect size, I use the non-central distribution. Is that right?
that is correct



I assumed that this sheet works with the non-central distribution, as the whole thing is about that - as far as I understood it...
this may (or may not) be true... and i'm inclined to believe it's not true, for two reasons. first, to the best of my understanding excel does not support non-central versions of the usual distributions, only the central ones (and it seems like that thing is running some form of excel underneath). second, non-central distributions are generally not symmetric. if you're obtaining a symmetric confidence interval, whether you're using a central or a non-central distribution (i'm assuming you're using the non-central t as reference for the pearson correlation coefficient), then something's not right.

ditch excel (or that ESCI thingy) and use R. there are packages that will spit out those confidence intervals with one line of code
 
#15
this may (or may not) be true... and i'm inclined to believe it's not true, for two reasons. first, to the best of my understanding excel does not support non-central versions of the usual distributions, only the central ones (and it seems like that thing is running some form of excel underneath). second, non-central distributions are generally not symmetric. if you're obtaining a symmetric confidence interval, whether you're using a central or a non-central distribution (i'm assuming you're using the non-central t as reference for the pearson correlation coefficient), then something's not right.
I was surprised, too, about the rather symmetric figures I got (after I became aware of the role of non-central distributions :) ). But I thought the large n (177) might have something to do with that.

ditch excel (or that ESCI thingy) and use R. there are packages that will spit out those confidence intervals with one line of code
I looked for R packages but have not found one (probably because I cannot always make sense of the descriptions). But I was looking for a package in which I would only have to supply a few summary statistics (such as r and n); maybe it isn't that simple. Do you happen to know a package that I could use to get the correct CIs?
 

spunky

Doesn't actually exist
#16
there isn't "one way" to do this, actually. i know of at least three methods, depending on which approach you prefer.

because i'm biased towards things i like, i'm gonna show you the method i prefer (and point you towards the R packages that use the other two approaches).

the method i like is described here.

you can copy-paste (and declare) the R functions from

here

as a quick example of the function, i'm pretending i have some awesome dataset with a correlation of 0.3 and a sample size of 178. i'm using a 95% confidence level and declaring iterations = 1000 just for teh lulz

Code:
> CIR.r(0.3, df=176, conf=0.95, iter=1000)
     2.5%     97.5% 
0.1595146 0.4272240
the df=176 comes from the usual N - 2 degrees of freedom used when testing a correlation coefficient

the other two approaches i know of are in the MBESS package (which uses the non-central distributions we talked about) and the bootES package (this one uses the bootstrap). the first method i showed you relies on fiducial probability distributions (which are a nice segue to Bayesian posterior distributions), so i like those.

keep in mind that if you use the MBESS package you'll need to reframe your problem as a regression problem. which is not too big of a deal, because in the context of y ~ b*x + e, the standardized regression coefficient 'b' is the correlation coefficient when there are no other predictors.

if you read the link i attached alongside the documentation for the MBESS package (which you can find here) you'll see that finding confidence intervals for STANDARDIZED effect sizes (Cohen's d, Pearson's rho, and basically anything that is worth talking about) is *A LOT* more complicated than it seems at first. which is why i doubt the program you showed me earlier is doing things the way it's supposed to do them.
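if you'd rather see the bootstrap route without learning bootES's interface first, here's a rough sketch with the base boot package (my own substitution, not bootES itself), on some made-up data:

Code:
library(boot)

# made-up data with a smallish correlation, just for illustration
set.seed(42)
x <- rnorm(177)
y <- 0.13 * x + rnorm(177)
dat <- data.frame(x, y)

# statistic: the correlation recomputed on each resample
r_stat <- function(d, idx) cor(d$x[idx], d$y[idx])

b <- boot(dat, r_stat, R = 2000)
boot.ci(b, conf = 0.95, type = "bca")  # bias-corrected and accelerated bootstrap CI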

have fun.
 
#17
Spunky, that's just great, thank you so much for all the information!

I followed the first procedure you've proposed:

Code:
> CIR.r(r=.127, df=175, conf=.95, iter=200)
       2.5%      97.5% 
-0.02089792 0.26875629

Thus, the results are the same as with the other two methods (the SPSS syntax and the ESCI sheet). But I'm very happy to know about these alternatives in R, as I'll also have to work with other effect size measures - and now I can be sure the calculations are correct :)
It seems to me that the mismatch between a significant p-value and a CI that includes zero can only be avoided when the statistical test itself is two-tailed. At least I now know why it happens. Thank you all very much for your help; you gave me plenty of food for thought.

Best,
grey