# comparing frequencies between time frames - best choice of analysis?

#### bridieg

##### New Member
Hello!

I am currently undertaking a research project comparing the rates of violent assault over the same 6-month period (March to August) in 2019 and 2020, treating the latter as a COVID-19 period. The data are as follows:

2019 – 62 assault TBIs
March: 12
April: 10
May: 12
June: 12
July: 5
August: 11

2020 – 34 assault TBIs
March: 9
April: 4
May: 6
June: 4
July: 4
August: 7

I am hoping to test whether the difference in injury frequencies between these time frames is statistically significant, and I understand a chi-square test may be applicable. However, I am finding it difficult to set up the data for this type of analysis in SPSS: the program does not seem to readily accept frequencies entered this way.

Any advice would be greatly appreciated!

#### katxt

##### Active Member
Most calculators start with a 2x2 table or bigger. Yours is a 2x1 using the totals (unless you want to include the months). With 96 total, you would expect 48 in each if there was no difference between the years. Try Excel.
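The arithmetic behind that 2x1 comparison can be sketched in R rather than Excel. This is a minimal sketch, assuming the only inputs are the two yearly totals from the first post; the variable names are illustrative.

```r
# Yearly totals from the original post
obs <- c(`2019` = 62, `2020` = 34)

# Under "no difference between the years", each year gets half of the 96 cases
expected <- rep(sum(obs) / 2, length(obs))   # 48 and 48

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected
stat <- sum((obs - expected)^2 / expected)

# Upper-tail p-value on (number of categories - 1) = 1 degree of freedom
p_value <- pchisq(stat, df = length(obs) - 1, lower.tail = FALSE)

stat      # about 8.1667
p_value   # about 0.004267
```

The same calculation in a spreadsheet is two cells of `(observed - 48)^2 / 48` added together, followed by a chi-square upper-tail lookup with 1 degree of freedom.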

#### obh

##### Active Member
Hi Bridieg,

Do you want to compare the totals, or the distribution across the months?

I'm not sure about SPSS, but you could also use R.
If you want to compare only the totals:

Code:
> obs<-c(62,34)
> prob<-c(0.5,0.5)
> chisq.test(x=obs, y=NULL, correct = FALSE, p=prob)

Chi-squared test for given probabilities

data:  obs
X-squared = 8.1667, df = 1, p-value = 0.004267

#### bridieg

##### New Member
Amazing, thank you!

I would like to compare the distribution between months if possible - again, I'm not entirely sure how to do this...
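One possible reading of "compare the distribution between months" is a test of whether the monthly pattern of assaults differs between the two years, using the full 2x6 table. The following is a sketch under that assumption, not something proposed in the thread. Because some expected counts fall below 5, it uses the Monte Carlo p-value option of `chisq.test` instead of the usual chi-square approximation; the seed and the number of replicates are arbitrary choices.

```r
# Monthly counts from the original post, one row per year
counts <- rbind(
  `2019` = c(12, 10, 12, 12, 5, 11),
  `2020` = c(9, 4, 6, 4, 4, 7)
)
colnames(counts) <- c("March", "April", "May", "June", "July", "August")

# Share of each year's assaults that fell in each month
round(prop.table(counts, margin = 1), 2)

# Chi-square test of homogeneity with a simulated (Monte Carlo) p-value,
# used here because several expected counts are small
set.seed(1)  # arbitrary seed, only for reproducibility
res <- chisq.test(counts, simulate.p.value = TRUE, B = 10000)

res$expected   # expected counts under "same monthly pattern in both years"
res$statistic
res$p.value
```

Note that this asks a different question from the test on the totals: it looks at whether the cases are spread over the months in the same way in both years, not at whether 2020 had fewer cases overall.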

#### Karabiner

##### TS Contributor
So you have paired observations (month-wise) and a sample size of n=6 months.
You can arrange your data in 6 rows and 3 columns (one for the name of the month,
two for the respective assault frequencies), and perform a Wilcoxon signed-rank test,
which is suitable for paired observations in small samples.

With kind regards

Karabiner
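The 6-row, 3-column layout and the paired test described above could look like this in R. It is a minimal sketch: the data frame and column names are illustrative, and `exact = FALSE` is an assumption made here because two months share the same difference (a tie), in which case `wilcox.test` cannot compute an exact p-value and uses a normal approximation anyway.

```r
# One row per month: month name plus the two yearly counts
assaults <- data.frame(
  month = c("March", "April", "May", "June", "July", "August"),
  y2019 = c(12, 10, 12, 12, 5, 11),
  y2020 = c(9, 4, 6, 4, 4, 7)
)

# Month-wise differences (2019 minus 2020)
assaults$diff <- assaults$y2019 - assaults$y2020

# Wilcoxon signed-rank test on the paired monthly counts
res <- wilcox.test(assaults$y2019, assaults$y2020,
                   paired = TRUE, exact = FALSE)
res
```

With only six pairs the test has very little resolution: even when every month moves in the same direction, the smallest exact two-sided p-value it could ever return is 2/64, about 0.031.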

#### hlsmith

##### Less is more. Stay pure. Stay poor.
The counts are low, but I would recommend creating a line plot. Ideally you would want to rule out that incidence was already trending downward before COVID-19. We can see it wasn't, but checking this helps rule out the counterfactual that the counts would have dropped even if COVID-19 had not happened.
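A line plot of this kind can be sketched in base R as follows. The colours, point symbols, and axis labels are arbitrary choices, and the data are the monthly counts from the first post.

```r
months <- c("March", "April", "May", "June", "July", "August")

# One column per year, one row per month
counts <- cbind(
  `2019` = c(12, 10, 12, 12, 5, 11),
  `2020` = c(9, 4, 6, 4, 4, 7)
)

# Draw both years on the same axes, suppressing the default numeric x axis
matplot(seq_along(months), counts,
        type = "b", lty = 1, pch = c(16, 17), col = c("black", "red"),
        xaxt = "n", xlab = "Month", ylab = "Assault TBIs",
        ylim = c(0, max(counts)))

# Label the x axis with month names and add a legend for the two years
axis(1, at = seq_along(months), labels = months)
legend("topright", legend = colnames(counts),
       lty = 1, pch = c(16, 17), col = c("black", "red"))
```

Overlaying the years by calendar month makes any seasonal pattern visible; plotting the twelve points in time order instead would show the trend leading up to 2020.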

Pairing the data implies there could be seasonality in the incidence. Do you believe this?
Also, are the assaults independent? Meaning, it isn't the same abuser multiple times?
Also, looking only at counts implies the denominator is the same across time; otherwise rates should be used.
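The point about denominators can be made concrete with R's `poisson.test`, which compares two counts relative to their exposure. This is a sketch only: it assumes the exposure is simply the 6 months observed in each year. If the catchment population or the total number of head-injury presentations differed between the years, those figures (not given in this thread) would go in `T` instead.

```r
# Yearly totals from the original post
counts <- c(`2019` = 62, `2020` = 34)

# Exposure for each count; here just the number of months observed (assumption)
months_observed <- c(6, 6)

# Exact comparison of the two Poisson rates (events per unit of exposure)
res <- poisson.test(counts, T = months_observed)

res$estimate   # rate ratio, 2019 relative to 2020
res$conf.int   # confidence interval for the rate ratio
res$p.value
```

With equal exposure the rate ratio reduces to the ratio of the raw counts, which is why comparing counts alone is only safe when the denominators really are the same.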

#### noetsi

##### Fortran must die
Interrupted time series is an option. There is a form of regression that will also address this. When I remember I will post it.

#### bridieg

##### New Member
@Karabiner: I don't believe the Wilcoxon signed-rank test is applicable, unfortunately, as the 2019 and 2020 patients are not from the same sample!

#### bridieg

##### New Member
@hlsmith: Yes, I've plotted these frequencies! It is a good graphical representation for sure.

Yes, the assaults are independent, as are the samples: the 2019 and 2020 cases are all different.

When you say rates, do you mean percentages? Or the proportion of assaults within the total number of injuries? (This is from a database of hospital presentations for head injuries, with those caused by assault of primary interest here.)

#### Karabiner

##### TS Contributor
There is a misunderstanding about the Wilcoxon test. The patients are not the sample; the n=6 months are the sample.
The sample elements (the months) were measured twice, once in 2019 and once in 2020. The
dependent measure is the "number of patients" in each month in 2019 or 2020, respectively.

I suggested this because you said "I would like to compare the distribution between months".
But maybe there's also a misunderstanding on my side.

With kind regards

Karabiner