How can I determine if two graphs are statistically significantly different

Mary12

Guest
#1
Hi, I'm working on a medical research project and need some stats help. I have limited statistical knowledge but I understand the basics. I need help with the following...

I have two graphs:

A: y = -0.000003x^3 + 0.000664x^2 - 0.028609x + 0.333098
B: y = -0.000003x^3 + 0.000781x^2 - 0.042135x + 0.627823

How can I determine if these two curves are statistically significantly different from one another?
 

hlsmith

Omega Contributor
#2
Do they cross within your sample space? If so the slopes aren't parallel.


Can you provide some more background information on what you are doing? And does x^3 mean x*x*x multiplied by its coefficient, so that you are working with polynomials?
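
The crossing question can be checked directly from the two equations in post #1. This is a minimal sketch, assuming the printed (rounded) coefficients are exact; whether a crossing falls inside the sample space depends on the range of x in the data, which the thread does not state.

```python
import numpy as np

# Coefficients as printed in post #1, highest power first.
# Assumption: the rounded values shown are treated as exact.
a = np.array([-0.000003, 0.000664, -0.028609, 0.333098])
b = np.array([-0.000003, 0.000781, -0.042135, 0.627823])

# A(x) - B(x); the printed cubic terms cancel, leaving a quadratic.
diff = np.trim_zeros(a - b, "f")

# Real roots of the difference are the x-values where the curves cross.
roots = np.roots(diff)
crossings = np.sort(roots[np.isreal(roots)].real)
print(crossings)
```

Any crossing that lies inside the observed range of x means the curves are not simply shifted copies of each other over that range. It says nothing about statistical significance on its own.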
 

rogojel

TS Contributor
#3
hi,
actually there is no such thing as a statistically significant difference in graphs. Are we talking about different models here?

regards
 

ondansetron

TS Contributor
#4
I think OP may be looking for a difference in functions, similar to the way you can test for differences in survival curves.
 

rogojel

TS Contributor
#6
For a statistical test you would need some random component. This would work if the two equations were regression models of some phenomenon, in which case the question would also make sense if reformulated slightly: is there a significant difference between the two models? IMO this is the case for survival curves as well. When comparing two graphs without this random component, there is no definition of statistical significance that could be applied.
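
If the two cubics were fitted to raw data, one common way to ask "is there a significant difference between the two models?" is an extra-sum-of-squares F-test: fit one cubic to the pooled data, fit separate cubics per group, and compare the residual sums of squares. The sketch below uses simulated stand-in data (not the OP's), and the names `x`, `g`, `y` and `rss` are hypothetical. It also assumes independent errors with constant variance.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated stand-in data (NOT the OP's data): two groups on the same x grid.
x = np.tile(np.linspace(0, 100, 40), 2)
g = np.repeat([0.0, 1.0], 40)  # 0 = group A, 1 = group B
y = (0.3 - 0.03 * x + 0.0007 * x**2 - 3e-6 * x**3
     + 0.3 * g + rng.normal(0, 0.1, x.size))

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Rescale x so the cubic design matrix stays well conditioned.
xs = x / 100.0
cubic = np.column_stack([np.ones_like(xs), xs, xs**2, xs**3])
reduced = cubic                                # H0: one curve for both groups
full = np.hstack([cubic, cubic * g[:, None]])  # Ha: a separate curve per group

rss0, rss1 = rss(reduced, y), rss(full, y)
df_num = full.shape[1] - reduced.shape[1]      # 4 extra coefficients
df_den = y.size - full.shape[1]
F = ((rss0 - rss1) / df_num) / (rss1 / df_den)
p_value = stats.f.sf(F, df_num, df_den)
print(f"F = {F:.2f}, p = {p_value:.4g}")
```

The test needs the observations themselves. The two printed equations alone carry no information about the scatter around each curve.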

regards
 

ondansetron

TS Contributor
#7
Are you familiar with survival analysis techniques where an early, largely descriptive step can involve a test of the hypothesis that the survival functions in each of the k groups are equal (specifically with Kaplan-Meier estimates, if I recall)? It's similar to an ANOVA test of hypothesis (Ho: S(a)=S(b)=...=S(k)), but instead of means, you're dealing with survival functions.

Again, I don't think the OP literally meant comparing two graphs, but rather comparing functions. This is a common question and test of hypothesis covered in many survival analysis texts. I direct you to Hosmer and Lemeshow for a survival text if you're curious for a reference.
 

rogojel

TS Contributor
#8
hi ondansetron,
how can you talk about a null and an alternative hypothesis if you simply check whether f1(x) = f2(x) for x = x1, x2, ... etc.?

regards
 

ondansetron

TS Contributor
#9
Imagine a clinical trial with a treatment group and a control group. After 5 years of follow-up, we end the trial. Without going into the specific details here, assume we calculate the % surviving at time T in each group. Each curve is an estimate of the true survival function in each of the populations represented by our sample groups. The question we can then ask is, "Are these functions different enough at some evidential threshold to suggest that survival in these two populations is described by different functions?" The general idea is to see whether our observations of survival differ markedly from what we would expect if there really were no difference in survival between the two groups. The null is then Ho: S(treatment) = S(control) with Ha: S(treatment) ≠ S(control)... [generalized as: at least 2 true survival functions differ among the k groups]

See section 3.4 and 3.4.1 here if you cannot access the H&L book.
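
A common test of Ho: S(treatment) = S(control) is the log-rank test. Here is a rough hand-rolled sketch for two groups, assuming right-censored data and using simulated survival times (not real trial data). The function name `logrank_test` and all variable names are hypothetical.

```python
import numpy as np
from scipy import stats

def logrank_test(time, event, group):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    time, event, group = map(np.asarray, (time, event, group))
    obs_minus_exp, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):       # distinct event times
        at_risk = time >= t
        n = at_risk.sum()                       # at risk, both groups
        n1 = (at_risk & (group == 1)).sum()     # at risk, group 1
        died = (time == t) & (event == 1)
        d = died.sum()                          # events at t, both groups
        d1 = (died & (group == 1)).sum()        # events at t, group 1
        obs_minus_exp += d1 - d * n1 / n        # observed minus expected under H0
        if n > 1:                               # hypergeometric variance term
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp**2 / var
    return chi2, stats.chi2.sf(chi2, df=1)

# Simulated trial (assumption): exponential survival times in years,
# everyone still alive at 5 years is censored when the trial ends.
rng = np.random.default_rng(1)
n_per_group = 150
group = np.repeat([0, 1], n_per_group)          # 0 = control, 1 = treatment
true_time = np.concatenate([rng.exponential(2.0, n_per_group),
                            rng.exponential(8.0, n_per_group)])
time = np.minimum(true_time, 5.0)
event = (true_time <= 5.0).astype(int)          # 1 = death observed, 0 = censored

chi2, p = logrank_test(time, event, group)
print(f"chi-square = {chi2:.2f}, p = {p:.4g}")
```

The random component here is the set of individual survival times. The test compares the deaths observed in one group at each event time with the number expected if both groups shared a single survival function.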
 

rogojel

TS Contributor
#10
Each curve is an estimate for the true survival function in each of the populations represented by our sample groups.
Right, this is what I mean. In this case you have two models (what you call estimates of the survival function) and a random variation around them. It makes perfect sense to talk about a null and an alternative hypothesis, H0 being that the differences are only due to the said random variation and we have one model describing both data sets.

The general idea is then to see whether our observations of survival differ markedly from what we would expect if there really was no difference in survival between the two groups. The null is then Ho: S(treatment) = S(control) with Ha: S(treatment) ≠ S(control)... [generalized as at least 2 true survival functions differ between the k groups]
You talk about the "true" survival functions, of which the observed values are realizations, compounded by some random error, which is modelled BTW.
This is all perfectly legit.
It also means you cannot apply this reasoning to a simple comparison of two curves without knowledge of the underlying statistical models. You have no concept of the "true" curve and definitely no idea of the random errors, so there is no sensible definition of an H0 and Ha.
 

hlsmith

Omega Contributor
#11
All good points. I think rogojel is being a little bit more meta. Yes, results are dependent on the model, which is wrong but informative. Survival curve analysis just compares two curves, much like comparing slopes in linear regression, which we won't contest. Rogojel is still focusing on the premise behind sampling distributions: the model needs a random variable for "statistical tests" to be used in comparing slopes, which we assume the OP has.
 
#12
I see what you were saying now. We were talking past one another a bit at first!