# Relationships in categorical data

#### galegirly

I have 200 health practitioners, each choosing one of six treatments after reading a case study. I then ask them which info from the case study they used to make their decision: info A, B, C, D or E (they can choose more than one). Is there a relationship between choosing any of the treatments, and the info A, B, C, D or E that they prefer to use?

Which statistical test should I use? I'm thinking of a series of Chi-squared tests, one for each of A, B, C, D and E.

As you can see, I'm at an early stage in my journey with statistics!
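A minimal sketch of the "one Chi-squared test per information source" idea, using made-up data. The column names, the random responses and the Bonferroni-style threshold are all assumptions for illustration, not part of the study design.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

rng = np.random.default_rng(42)
n = 200

# Made-up data: one treatment (1-6) per practitioner, plus five yes/no flags
df = pd.DataFrame({"treatment": rng.integers(1, 7, size=n)})
for source in "ABCDE":
    df[source] = rng.integers(0, 2, size=n)  # 1 = this info was used

results = {}
for source in "ABCDE":
    table = pd.crosstab(df["treatment"], df[source])  # 6 x 2 contingency table
    chi2, p, dof, expected = chi2_contingency(table)
    results[source] = {
        "chi2": chi2,
        "p": p,
        "dof": dof,
        "min_expected": expected.min(),  # small expected counts weaken the test
    }

summary = pd.DataFrame(results).T
# Five tests are run, so a Bonferroni-style threshold of 0.05 / 5 is used here
summary["significant"] = summary["p"] < 0.05 / 5
print(summary)
```

Because the responses here are random, no real association should be expected. The `min_expected` column is there because the Chi-squared approximation becomes unreliable when expected cell counts are small (a common rule of thumb is at least 5).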

#### Karabiner

How many of the 32 possible configurations of A B C D E do appear?
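Once data exist, one way to answer this is to count the distinct tick patterns directly. This is only a sketch: the five responses are hypothetical, and each tuple marks which of A to E a participant selected.

```python
from collections import Counter
from itertools import product

sources = "ABCDE"
# Every possible combination of ticking / not ticking A..E: 2**5 = 32 patterns
all_configs = list(product([0, 1], repeat=len(sources)))

# Hypothetical responses, one tuple per participant (1 = source was ticked)
responses = [
    (1, 0, 0, 0, 0),
    (1, 1, 0, 0, 0),
    (1, 1, 0, 0, 0),
    (0, 0, 1, 0, 0),
    (1, 0, 1, 1, 0),
]

counts = Counter(responses)
print(len(all_configs), "possible,", len(counts), "observed")
for config, k in counts.most_common():
    label = "".join(s for s, used in zip(sources, config) if used) or "(none)"
    print(label, k)
```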

#### katxt

> Is there a relationship between choosing any of the treatments, and the info A, B, C, D or E that they prefer to use?

Just trying to get my head around it. Can you make up an example of what a conclusion might look like if all goes well? Cheers.

#### galegirly

Thanks Karabiner and katxt.

> How many of the 32 possible configurations of A B C D E do appear?

Well we are working on the protocol at the moment, so no real data yet, but we do want to have a viable plan for statistical analysis before we start. So, what if we structured it like this?

200 study participants read a case scenario (let's make them hairdressers, it might be less boring for you).
Case scenario: male client with thinning hair. What intervention will you apply?
1. Apply cocoa butter
2. Cut it so that it can be combed over the balding patch
3. Shave it all off
4. Pretend to cut it, but leave it as is
5. Tell him to come back later
6. Refer him for hair transplant in Mexico

Second question to all study participants - what type of information was most influential on your decision? Rank these in order of importance:
A. Research
B. Client's case history
C. Professional intuition
D. Regional Policy
E. Astrological info

I am assuming we will have a dataset as follows:

We could do the same with the second-preference source of information, and so on, or group the 1st and 2nd preferences together, and the 3rd, 4th and 5th together.

Laying it out like this, and thinking about the fact that we will be doing this with 5 separate samples (200 in each) and looking for differences in the preferences of study participants between samples, Chi-squared seems a bit too simplistic?

#### galegirly

> Just trying to get my head around it. Can you make up an example of what a conclusion might look like if all goes well? Cheers.

Thanks katxt. Please see above, where I have turned it into an example. The type of conclusion we would foresee: practitioners who choose the riskiest intervention (intervention 1 or whatever) are significantly more likely to have used professional intuition (Info C, or whatever) to inform their decision.

#### Karabiner

> Second question to all study participants - what type of information was most influential on your decision? Rank these in order of importance:

Ok, that's a useful measure, but I would recommend that nevertheless you think very hard about it.
Is this really what you find out? The ranking does not reveal the absolute importance. Say, one
participant based his decision 95% on research, 2% on case history, while the next bases his decision
33% on research and 32% on case history; the ranking would be the same, but the importance would
be very different.

Apart from that, one could examine each information type and look at which rank was
associated with each decision. Say, "research" had a median rank of 2 in those who chose
option 1, a median rank of 3 in those who chose option 2, and so on. The same for case history
etc.

Or, the other way around, one could focus on decisions and show the median ranks of each information
type, separately for each decision: for "Decision 1" the median ranks were 2 for A, 3 for B, 3 for C. (etc.).
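Both points can be sketched with a few lines of pandas. Everything here is illustrative: the shares for the last three sources in the two-participant example are made up so that each row sums to 100, and the 200 rankings are random.

```python
import numpy as np
import pandas as pd

sources = ["research", "case_history", "intuition", "policy", "astrology"]

# Two hypothetical participants: very different weightings, same ordering
pct = pd.DataFrame(
    [[95, 2, 1.5, 1, 0.5], [33, 32, 20, 10, 5]], columns=sources
)
ranks = pct.rank(axis=1, ascending=False).astype(int)  # 1 = most important
same_ranking = bool((ranks.iloc[0] == ranks.iloc[1]).all())
print(ranks)
print("Identical rankings:", same_ranking)

# Median rank of each information type, separately for each decision
rng = np.random.default_rng(1)
n = 200
rank_data = pd.DataFrame(
    [rng.permutation(5) + 1 for _ in range(n)],  # random ranking 1..5 per person
    columns=sources,
)
rank_data["decision"] = rng.integers(1, 7, size=n)  # chosen option 1..6
median_ranks = rank_data.groupby("decision")[sources].median()
print(median_ranks)
```

The first table shows the information loss in ranking: both participants get ranks 1 to 5 in the same order despite the 95% versus 33% weight on research. The second table has one row per decision and one column per information type; transposing it gives the other view (one row per information type).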

Just my 2pence, for the moment.

#### katxt

> we do want to have a viable plan for statistical analysis before we start.

A very refreshing stance, so often forgotten until the data is in.

#### galegirly

> Ok, that's a useful measure, but I would recommend that nevertheless you think very hard about it.
> Is this really what you find out? The ranking does not reveal the absolute importance. Say, one
> participant based his decision 95% on research, 2% on case history, while the next bases his decision
> 33% on research and 32% on case history; the ranking would be the same, but the importance would
> be very different.

Yes, thank you Karabiner. Ranking in the way I have suggested is crude. We could set it up so that participants can assign an actual % to each info source. Participants won't want to work out percentages, but we could allow them to apportion a pie chart in a survey application. That would make it a more precise measure, wouldn't it? (Albeit a self-report, and I'm not entirely confident practitioners will be honest, even with themselves, about how little current research affects their decision-making.)

Have to go, will return to this tomorrow. Really appreciate the input

#### galegirly

Karabiner, if we force participants to split the importance they attach to each of 5 information sources, your suggestion to report ranks is no longer applicable, is that right? You said "one could focus on decisions and show the median ranks of each information type, separately for each decision: for "Decision 1" the median ranks were 2 for A, 3 for B, 3 for C. (etc.)."

So if study participants are assigning a % to represent the importance of each of info A, B, C, D or E, what statistical test would help find relationships between the intervention they chose and the info A, B, C, D or E they used to make a decision? The data would look like this; I have 2 countries here, but there would be 5 countries (n=200 in each country).

#### Karabiner

The comparison between nations is a new aspect.
What is the focus of your research, and what do the research questions look like?

Regarding ranking versus assigning importance points (or something like that),
it really depends on what exactly you want to find out.

#### galegirly

Yes, sorry to spring that on you, we have researchers from 5 countries involved.

OK, perhaps this will explain further. Data for an individual study participant will look like this:

So, as you can see, they will respond to 4 case studies. Your questions have prompted me to tighten up the analysis framework as follows:

A bit more background
The 6 interventions are social work interventions in child protection work. The six interventions could be described as ordinal: Intervention 1 is “do nothing”. Intervention 6 is a court ordered removal of a child from their home. And 2,3,4 and 5 are mounting levels of family interference between 1 and 6 (in approximate terms).

There are 4 statistical analysis tasks: 3 of these are descriptive tasks, and 1 is explanatory/correlational
• Are there significant differences, between countries, in choice of intervention?
• Are there significant differences, between countries, in the type of information used to inform decisions?
• Is there a typology? Are there identifiable types of social workers who are more likely to choose particular interventions, and/or who draw upon particular information to make their choices?
• Is there a correlation between the type of information social workers use to make decisions and how risk averse/interventionist they are (which intervention they choose)?

Why is this study useful?
Study findings will inform recommendations for social work practice, education, training and supervision.
Potential recommendations relating to 1. the country differences in choice of intervention, such as
• Social workers in Ireland are more risk averse, and are more likely to choose a child-removal intervention in child protection work – (why is this finding of value? Maybe they are too risk averse - (i) leads to a discussion of socio-political context in Ireland, is that the reason (ii) consideration of child maltreatment data – are children safer in Ireland also? These are important discussions which should lead to further research)
Potential recommendations relating to 2. country differences in the type of information social workers are using to inform decisions, such as
• Social workers in the United States are significantly more likely to draw upon current research to inform child protection decisions – (why might this be worth finding out? – is there a link to a greater emphasis on desk-top research skills in US social work training? Are child protection research findings better disseminated in the US? There may be implications for social work training, and research dissemination outside the US).
Potential recommendations relating to 3. the identification of a social worker typology (there are identifiable types of social workers who are more likely to choose particular interventions, and/or who draw upon particular information to make their choices) such as
• If a typology is identifiable - this underlines the importance of social workers making joint decisions, in groups, and with other disciplines involved. If we can disseminate an evidenced-based typology of practitioners to the profession it may help the profession focus on the development of more rounded, more objective? practitioners.
Potential recommendations relating to 4. is there a correlation between the type of information social workers use to make decisions and how risk averse/interventionist they are?
• Social workers relying primarily on their knowledge of legislation are more likely to remove children from their family – implications for staff development/social work training

#### Karabiner

> Are there significant differences, between countries, in the type of information used to inform decisions?

So you will have to find a way to aggregate, for each individual, how information was used across 4 scenarios.
For example, if you asked for a ranking, then "median rank of Information A [B, C, ..] across the 4 tasks" could
be used; or, if you asked for a score or a percentage "mean score (or percentage) across 4 tasks". You could then
compare these aggregated values between nationalities.
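A rough sketch of the aggregate-then-compare idea for the percentage version. The data are simulated, the country labels are placeholders, and the Kruskal-Wallis test is just one assumed choice for the between-country comparison; no particular test is prescribed above.

```python
import numpy as np
import pandas as pd
from scipy.stats import kruskal

rng = np.random.default_rng(7)
countries = ["country_1", "country_2", "country_3", "country_4", "country_5"]
n_per_country, n_scenarios = 200, 4
n = len(countries) * n_per_country

# Simulated long-format data: one row per participant per scenario,
# with importance percentages for A..E that sum to 100 in each row
shares = rng.dirichlet(np.ones(5), size=n * n_scenarios) * 100
long = pd.DataFrame(shares, columns=list("ABCDE")).assign(
    participant=np.repeat(np.arange(n), n_scenarios),
    country=np.repeat(countries, n_per_country * n_scenarios),
    scenario=np.tile(np.arange(1, n_scenarios + 1), n),
)

# Step 1: aggregate within each participant (mean percentage across 4 scenarios)
per_person = (
    long.groupby(["country", "participant"])[list("ABCDE")].mean().reset_index()
)

# Step 2: compare the aggregated values between countries, one source at a time
tests = {}
for source in "ABCDE":
    groups = [g[source].to_numpy() for _, g in per_person.groupby("country")]
    stat, p = kruskal(*groups)
    tests[source] = {"H": stat, "p": p}

print(pd.DataFrame(tests).T)
```

Note that the five percentages are not independent of each other (they sum to 100 for every participant), so the five tests overlap in the information they use.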

There might be an issue with consistency - one participant perhaps uses information always the same way, another one
uses different information in different situations. Alternatively, a repeated measures approach or a multilevel model
could be useful. In a repeated-measures approach you could use "type of information" and "scenario" as repeated-measures
factors, and "nationality" as additional between-subjects factor.

> Is there a correlation between the type of information social workers use to make decisions and how risk averse/interventionist they are (which intervention they choose)?

Degree of risk aversion is an ordinal scaled dependent variable. If you have interval scaled values for information
utilization (such as "for person k, mean importance for information types A to E were 11, 22, 13, 32, 22"), you could
regress median degree of risk aversion on this (ordinal regression). If this means too much aggregation on both
sides of the equation, maybe a multilevel approach or generalized estimating equations (GEE) could be considered.
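To make the ordinal regression idea concrete, here is a hand-rolled proportional-odds (cumulative logit) fit on simulated data. It is a sketch under several assumptions: one ordinal outcome per person with six levels (rather than a median over scenarios), percentages rescaled to units of 10 points, and source E left out because the five percentages always sum to 100. In practice a library routine (for example `OrderedModel` in statsmodels) would be the more usual choice.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(0)
n, n_levels = 200, 6

# Simulated mean importance percentages for A..E (each row sums to 100)
pct = rng.dirichlet(np.ones(5), size=n) * 100
X = pct[:, :4] / 10  # drop E (redundant); units of 10 percentage points

# Simulated outcome: reliance on C (column 2) raises the intervention level
latent = 0.3 * X[:, 2] + rng.logistic(size=n)
cuts = np.quantile(latent, np.arange(1, n_levels) / n_levels)
y = np.digitize(latent, cuts)  # ordinal levels coded 0..5


def neg_log_lik(params, X, y):
    """Negative log-likelihood of a proportional-odds model."""
    k = X.shape[1]
    beta = params[:k]
    # First threshold is free; later ones add positive increments (stay ordered)
    thresholds = np.cumsum(np.concatenate([[params[k]], np.exp(params[k + 1:])]))
    eta = X @ beta
    # P(y <= j) = logistic(threshold_j - eta), padded with 0 and 1 at the ends
    cum = expit(thresholds[None, :] - eta[:, None])
    cum = np.column_stack([np.zeros(len(y)), cum, np.ones(len(y))])
    rows = np.arange(len(y))
    probs = cum[rows, y + 1] - cum[rows, y]
    return -np.sum(np.log(np.clip(probs, 1e-12, None)))


start = np.zeros(X.shape[1] + n_levels - 1)
fit = minimize(neg_log_lik, start, args=(X, y), method="BFGS")
beta_hat = fit.x[: X.shape[1]]
print("Log-odds change per 10 points of A, B, C, D:", np.round(beta_hat, 2))
```

A positive coefficient means that more weight on that source goes with higher (more interventionist) levels, relative to weight on the omitted source E.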

#### galegirly

Thank you Karabiner. That gives me a ballpark to work in. I think I will create some mock data and see where those ideas take me.

> There might be an issue with consistency - one participant perhaps uses information always the same way, another one

I guess we could increase the number of case scenarios from 4 to a number that offers a more reliable insight. There is probably a statistical method for choosing the lowest number of case scenarios needed for a reliable insight into participants' risk aversion? We could do some pilot work on this.