Determining best analysis method for health knowledge research

DLU

New Member
#1
Hi,

I have data from a dozen countries capturing knowledge on health-related issues. The same survey, translated appropriately, was used in all countries. After the initial survey, training was conducted in each country, teaching people about the issues addressed in the survey.

Now, we would like to do a post-survey to see if there has been an improvement in knowledge and to compare pre/post change in knowledge between countries. Within countries, we'd also like to compare the change in knowledge on some demographic variables like gender and age. For the post-survey, some less relevant questions would be cut from the original survey but the wording of those questions that remain would be the same as in the pre-survey.

The problem is that the time between the training and the post-survey would vary. That is, it has been 2-3 years since the training in some countries, while in others it has only been 6 months. We would like to conduct the post-survey on the same date for all countries.

Given this information, here are some questions:

- What is the best statistical analysis for analyzing the pre/post data? Initial sleuthing leads me to a multi-factor ANOVA as the best choice (allowing us to control for within country factors like gender and age and analyze between countries), but I'd like to hear the opinions of you stats experts.

- Is it possible to simply control for the amount of time between the training and post-survey between countries to account for these time differences statistically? Or is it just not feasible? Clearly, other factors could influence health knowledge beyond the training as the amount of time since the training increases.

- If the training is rendered irrelevant because of the differences in time periods between countries, could pre/post results still be analyzed (there would still be a difference in time period between the pre and post surveys between countries)?

- We would like to rank order the countries with the greatest change in knowledge to the least. Could this be done statistically? If so, what method should be used?

Thanks in advance for your help--
 

CB

Super Moderator
#2
DLU wrote: "What is the best statistical analysis for analyzing the pre/post data? Initial sleuthing leads me to a multi-factor ANOVA as the best choice (allowing us to control for within country factors like gender and age and analyze between countries), but I'd like to hear the opinions of you stats experts."
Deciding on the "best" analysis method is sometimes pretty subjective, but repeated measures ANOVA does sound reasonable. Multilevel modelling also comes to mind, but I don't think this is possible with only two measurement points (others?)

DLU wrote: "Is it possible to simply control for the amount of time between the training and post-survey between countries to account for these time differences statistically? Or is it just not feasible? Clearly, other factors could influence health knowledge beyond the training as the amount of time since the training increases."
I imagine you could introduce a covariate into the ANOVA model (time between training and post-survey). The problem is: does simply controlling for length of time really control for those other factors (i.e. threats to validity) that might produce changes in health knowledge? (my answer would be no, it doesn't - the influence of these factors is not necessarily strongly dependent on length of time between training and post-test).

Overall my main concern with your design is how you intend to separate the effects of training as opposed to practice effects, history effects, maturation effects, etc. Unless you have control groups (preferably randomised), at best you will only be able to say that health knowledge improved (but not that this was necessarily due to the training). This seems a much bigger issue than the choice of data analysis method or the varying duration between training and post-survey.

DLU wrote: "We would like to rank order the countries with the greatest change in knowledge to the least. Could this be done statistically? If so, what method should be used?"
I'm not sure you would need a statistical method to do this. Just take the average change in knowledge for each country, put em in a list, and sort them in rank order (e.g. in Excel).
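The same sort-by-average idea can be sketched in a few lines of Python instead of Excel. The data layout below (one row per respondent with matched pre and post scores) and the country labels are assumptions made for the example; if respondents cannot be matched, the country's mean post score minus its mean pre score would be used instead.

```python
from statistics import mean

# Hypothetical toy data: (country, pre_score, post_score) per respondent.
# Country labels and scores are invented for illustration only.
records = [
    ("A", 50, 65), ("A", 55, 60), ("A", 60, 70),
    ("B", 40, 70), ("B", 45, 65),
    ("C", 70, 72), ("C", 65, 66), ("C", 60, 65),
]

def rank_by_mean_change(rows):
    """Return (country, mean change) pairs, largest change first."""
    changes = {}
    for country, pre, post in rows:
        changes.setdefault(country, []).append(post - pre)
    means = {c: mean(v) for c, v in changes.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_by_mean_change(records)
for position, (country, change) in enumerate(ranking, start=1):
    print(position, country, round(change, 2))
```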
 

spunky

Doesn't actually exist
#3
CB wrote: "Multilevel modelling also comes to mind, but I don't think this is possible with only two measurement points (others?)"
Actually you can do it... and considering the way in which DLU described her/his research design, she/he should use a two-level hierarchical linear model (a level 1 at people and level 2 at country... perhaps even one more level depending on how the data was gathered)... to handle the difference in training times i think you'd need to tweak the intercepts a little bit in a random coefficients regression model
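One way such a two-level model could be written down is sketched here. This is an illustrative simplification, not a specification from the thread: it uses each person's change score as the outcome (which assumes pre and post responses can be matched), and every symbol is an assumption introduced for the example. Here $\Delta_{ij}$ is the post-minus-pre score of person $i$ in country $j$, and $T_j$ is the time between training and post-survey in country $j$, entering as a predictor of the country intercept.

```latex
% Level 1: person i within country j
\[
\Delta_{ij} = \beta_{0j} + \beta_{1}\,\mathrm{age}_{ij} + \beta_{2}\,\mathrm{gender}_{ij} + e_{ij},
\qquad e_{ij} \sim N(0, \sigma^{2})
\]

% Level 2: country j, with time since training shifting the intercept
\[
\beta_{0j} = \gamma_{00} + \gamma_{01}\,T_{j} + u_{0j},
\qquad u_{0j} \sim N(0, \tau^{2})
\]
```

In this form $\gamma_{01}$ captures how the average change varies with elapsed time, and $u_{0j}$ is the country-specific deviation left over after that adjustment.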

with regards to how to handle time between pre- and post- survey, that can be done through latent growth curve modeling, although i am a little bit concerned about the fact that each country is not starting at the same point... might need to do some sort of adjustments here as well, or probably model the time changes in clusters of countries that started at similar times...mmhh... would need to have a look at the data though, you don't wanna fall into an ecological fallacy

i do agree with cowboybear that the ranking might not be the hardest part, but i would need to look at the data to be able to decide that... what if there are, in fact, changes in the mean knowledge of each country but these are not statistically significant? i think DLU would need to look a little closer into this...just because i'm doing research on this specific topic right now (and assuming a lot of things about your survey) i'd do bayesian discriminant analysis to make sure i can maximise the probability of rank-ordering countries
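To make the concern about non-significant differences concrete, here is a minimal sketch that attaches an uncertainty interval to each country's mean change. Note that this is a plain percentile bootstrap, swapped in as a simpler stand-in; it is not the bayesian discriminant analysis mentioned above. The change scores, country labels, and the function name `bootstrap_ci` are all invented for the example.

```python
import random
from statistics import mean

def bootstrap_ci(changes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `changes`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(changes)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        mean(rng.choices(changes, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical per-respondent change scores (post minus pre), invented here.
changes_by_country = {
    "A": [15, 5, 10, -2, 8],
    "B": [30, 20, 25, 28, 22],
    "C": [2, 1, 5, -3, 0],
}

intervals = {c: bootstrap_ci(v) for c, v in changes_by_country.items()}
for country, (lower, upper) in intervals.items():
    print(country, round(lower, 2), round(upper, 2))
```

If two countries' intervals overlap heavily, their relative order in the ranking is not well supported by the data. This sketch also ignores covariates and the differing time since training.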

my two cents for your post is: there's WAAAAAAY too much going on in your data for a simple anova/ancova to handle it all...you have (at least) a 2-level nested variances model, growth-variance effects, unequal conditions from the start... i hope (pray?) that just as cowboybear said, you had some sort of randomising procedure here...cuz if not... oh boy, you're in for a treat on this one!! :D

(ps- what is this for? if it's more for like a school project of sorts you might be able to get away with a much simpler analysis. if this is looking like a dissertation/publishable paper i would strongly advise that you look for the help of an expert in these things...)
 

DLU

New Member
#4
Thank you for your responses. Both have been extremely informative.

On the question of randomization, ideally we would have had a control group. However, it's a workplace training and we needed to train all the employees on these issues. It might be possible to ask people in the post-survey whether they participated in the training or not, as some may not have attended the training, and use those who report not attending the training as a control. Thoughts?

I hear you on the effects; it'll be tough to parse out. If all that can be done is to say that there was a change in knowledge without attributing it to the training, I guess that's the best we can do.

The work isn't for a dissertation, but it is something that we would want to present at a health conference that would hold some weight - not a piece of perfection, but not something that can be completely discredited either.

With respect to the country ranking, if we weren’t concerned about whether the training had an effect and only focused on ranking the change in knowledge from the pre- to post-survey, would discriminant analysis allow us to statistically control for difference in time periods in the countries from pre- to post-survey?

Or would the time difference pre/post survey even be a concern with the country ranking? Could we simply qualify in reporting that the time period from the pre- to post-survey varies by country?
 

spunky

Doesn't actually exist
#5
oh i see... uhm. ok, a few comments here and there.

with regards to your first question well... the short answer is "no". people who voluntarily removed themselves from the training group are very likely to share a few characteristics like, for example, let's say they share the "i-dont-like-training" attitude trait. the whole point of randomising is to make your groups as equal as possible so that you can be certain that training does have an effect on people, regardless of whether they have the "i-dont-like-training" trait or not. using a group like that would bias your results.

now, the effects are certainly tough to parse out, but it's not impossible. it is definitely not the kind of situation where, I quote you: "If all that can be done is to say that there was a change in knowledge without attributing it to the training, I guess that's the best we can do". Saying that things (health knowledge, in this case) change with time is not terribly interesting per se. everything changes with time and you certainly didn't need to gather data to come to that conclusion. perhaps you just need to rephrase your question into something a little bit easier to handle. simpler questions usually require simpler analysis. as questions get more and more interesting, the analysis gets more and more complicated.

i suggested bayesian discriminant analysis based on the assumption that you wanted to factor in training effects in the ranking. if you only wish to do the ranking, i'm totally on board with cowboybear. a mean of the "health-knowledge" score should suffice. if you wanna get fancier, you can do a repeated-measures ANOVA to assert that the post-training ranking is, indeed, different from the pre-training ranking. i mean, you're still not doing things particularly right because you're ignoring the clusters of data represented by each country, but i guess you don't have to go over-the-top and crazy with your analysis if you're not willing to do so... besides, pfff.... it's a conference. people are supposed to have fun at those!!!
 

DLU

New Member
#8
Good point about the bias associated with having people report whether they attended the training in the post-survey...

Many thanks for the recommendations on the analysis and ranking! It's ample food for thought to determine how much effort we want to put into the data vs. what we want to get out of it--