Hoping someone can point me in the right direction with a few questions (newbie at stats, with zero experience analyzing questionnaire data).

Let’s say I have some participants who are administered a questionnaire at Baseline and then at 6, 12, 18 and 24 months (see table below).

The questionnaire has 6 items across 2 scales (3 items per scale): one scale asks about physical functioning, the other about emotional functioning. Each item has 5 response categories (0=never, 1=almost never, 2=sometimes, 3=often, 4=almost always), indicating how much of a problem that item has posed for the participant during the past month.

If my goal is to assess any potential quality of life benefits of standard of care treatment with two different drugs (some participants are on drug A, some on B) by examining change over time in score, what is a good way to do that?

I was thinking of analyzing the scales separately (not sure if I should) and presenting three tables for each scale: one with raw scores (n, mean, median) at baseline, one with score change from baseline to each of the 4 visits, and one with results from some modelling (maybe a linear mixed model?).

My question is: if I want to take this approach, is it ok to:

Sum up the scores for Q1, Q2, Q3 for each participant at baseline and at 6, 12, 18 and 24 months, and then calculate the mean and median of these individual sum scores? For example, for P1 at baseline, we’d have P1 score = (0+0+2) = 2, P2 score = 2, P3 = missing and so on. Then I’d get the mean score as (P1+P2+ … +P4 scores)/(number of participants), get the median score, and this is what I’d present in the raw scores table. Then for the score change from baseline table, I’d subtract the mean (median) at each of the timepoints from the mean (median) at baseline. And finally, for the modeling part, I’d use the mean (median) score as my outcome variable in a linear mixed model.
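To make the arithmetic of this sum-score option concrete, here is a minimal sketch in plain Python. P1 and P2 follow the example above; the values for P4, the assumption that P3 skipped every item, and the helper name `sum_score` are made up for illustration. The mean here is taken over participants with a non-missing score, which is also an assumption.

```python
from statistics import mean, median

# Baseline responses to Q1, Q2, Q3; None marks an unanswered item.
# P1 and P2 match the example in the post; P3 and P4 are hypothetical.
baseline = {
    "P1": [0, 0, 2],
    "P2": [2, 0, None],
    "P3": [None, None, None],
    "P4": [1, 3, 2],
}

def sum_score(items):
    """Sum of the answered items, or None if nothing was answered."""
    answered = [x for x in items if x is not None]
    return sum(answered) if answered else None

# One sum score per participant.
scores = {p: sum_score(items) for p, items in baseline.items()}

# Summary row for the raw-scores table, using non-missing scores only.
observed = [s for s in scores.values() if s is not None]
summary = {"n": len(observed), "mean": mean(observed), "median": median(observed)}

print(scores)
print(summary)
```

The same steps would be repeated for the 6, 12, 18 and 24 month visits to fill in the rest of the table.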

**OR**, would it be better if, instead of summing up the Q1, Q2, Q3 scores for each participant at each visit, I average them by dividing their sum by the number of questions answered? For example, P1 score = (0+0+2)/3, P2 = (2+0)/2 and so on. And then I calculate the mean and median for all participants based on these average scores?

I’ve done some research and am aware that there’s A LOT of fierce debate on whether ordinal data should be treated as interval (which is what I’m doing here) or not, but honestly, the more I get into the details, the more confused I get about what my approach should be.
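For comparison, the averaging option (sum divided by the number of questions answered) can be sketched the same way. As before, P1 and P2 follow the example in the post, while P3, P4 and the helper name `mean_item_score` are hypothetical.

```python
from statistics import mean, median

# Baseline responses to Q1, Q2, Q3; None marks an unanswered item.
# P1 and P2 match the example in the post; P3 and P4 are hypothetical.
baseline = {
    "P1": [0, 0, 2],
    "P2": [2, 0, None],
    "P3": [None, None, None],
    "P4": [1, 3, 2],
}

def mean_item_score(items):
    """Average of the answered items, or None if nothing was answered."""
    answered = [x for x in items if x is not None]
    return sum(answered) / len(answered) if answered else None

# One average score per participant, on the original 0-4 item scale.
avg_scores = {p: mean_item_score(items) for p, items in baseline.items()}

# Summary across participants with a non-missing average score.
observed = [s for s in avg_scores.values() if s is not None]
summary = {"n": len(observed), "mean": mean(observed), "median": median(observed)}

print(avg_scores)
print(summary)
```

One thing the sketch makes visible: for P2, the sum score of 2 effectively counts the skipped item as 0, whereas the average of 1.0 is based only on the two items answered.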

Thank you kindly for any advice/input, it’d be greatly appreciated!
