Where to start..?

I am currently carrying out research on A&E attendance for my surgical speciality pre- and post-COVID. I have collected data from around 3000 attendances and have been staring at it for two months now, unsure of how to progress. The aim is to see whether the lockdown had an effect on a) the number of injuries, b) the type of injuries, c) alcohol/drug use contributing to injuries, and d) any other comparisons, e.g. age/gender.

The data compare attendances to A&E on the same dates (between certain months) pre-COVID and during the COVID lockdown. The attendances are separated into groups by age, gender, type of injury, how it was sustained, and whether alcohol/drug use was involved.

My research leads me to believe I should use a Mann-Whitney U test (Wilcoxon rank-sum test), as my data sets are unpaired and do not follow a normal distribution.

I would appreciate any help with how to progress from here in terms of the statistical analysis and finding out whether there is any statistical significance.
I have the data sets to hand in a spreadsheet (it is colour co-ordinated and not tiring to look at). As you can probably tell, I am quite new to this, so any contributions are fully and warmly welcomed.

Many thanks in advance to any contributors!


TS Contributor
Maybe you can find some ideas here.

If I understood your description correctly, you want to compare the number of attendances with injuries in period 1
versus period 2. This could be done with a one-sample Chi² test (null hypothesis: the proportion of cases in period 1 =
the proportion in period 2, i.e. 50%/50%).
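
As a rough sketch of that one-sample test, here it is in plain Python. The counts 1700 and 1300 are made up for illustration (they are not your data), and the function name is my own. With only two categories the test has 1 degree of freedom, so the p-value can be written with the error function. The 50/50 null only makes sense if both periods cover the same number of days, which I assume they do.

```python
import math

def chi2_equal_counts(count_1, count_2):
    """One-sample chi-square test of H0: both periods share the cases 50/50."""
    total = count_1 + count_2
    expected = total / 2  # under H0 each period gets half of all attendances
    stat = ((count_1 - expected) ** 2 + (count_2 - expected) ** 2) / expected
    # Two categories -> 1 degree of freedom; the chi-square upper tail
    # probability for 1 df equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts for period 1 and period 2 (not real data)
stat, p_value = chi2_equal_counts(1700, 1300)
print(f"chi2 = {stat:.2f}, p = {p_value:.3g}")
```

If SciPy is available, `scipy.stats.chisquare([1700, 1300])` performs the same test.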

"Type of injuries" could be interpreted in several ways. Maybe you want to know whether the distribution of injuries
(relative proportions) was the same for both periods. This would be a period * type of injury table with a Chi² test.
If you want to know whether number of cases changed for a specific injury, then you do it just like with the first question.
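
The period * type of injury test could look roughly like the sketch below. The three injury categories and all counts are invented for illustration, and both function names are my own. It computes the Pearson Chi² statistic without a continuity correction; the p-value uses the closed form of the chi-square tail for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    half = x / 2
    if df % 2 == 0:
        # Even df: finite Poisson-type sum
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: erfc term plus a finite sum over half-integer gamma values
    terms = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms

def chi2_independence(table):
    """Pearson chi-square test of independence for a rows x columns table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof, chi2_sf(stat, dof)

# Hypothetical counts: rows = period 1, period 2;
# columns = three made-up injury types
table = [[600, 500, 600],
         [400, 500, 400]]
stat, dof, p_value = chi2_independence(table)
print(f"chi2 = {stat:.2f}, df = {dof}, p = {p_value:.3g}")
```

The same function works for a 2*2 table. With SciPy, `scipy.stats.chi2_contingency(table)` does this too, but note that for 2*2 tables it applies Yates' continuity correction by default, so its result will differ slightly from this sketch.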

If "Alcohol/drugs contributing" is a yes/no question, then again 2 possibilities, either you want to compare the proportions
(period * contribution yes/no), or just the frequencies between the periods (one-sample test). Gender is again a 2*2 table.

For age, you can compare means between periods using, for example, an independent-samples t-test. Normality of the dependent
variable is irrelevant for a t-test; and even the real assumption of the t-test, that the dependent variable is normally
distributed in each of the 2 populations from which the samples were drawn, can safely be ignored here, since the sample size
is n >> 30.
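
A minimal sketch of such a comparison of mean age, using only the standard library. Two things here are my own choices rather than anything fixed above: it uses the Welch variant (no equal-variance assumption), and it takes the p-value from the normal distribution instead of the t distribution, which is only reasonable because the samples are large. The ages are simulated, deliberately skewed, and entirely hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance

def welch_t_large_sample(sample_a, sample_b):
    """Welch t statistic with a normal-approximation two-sided p-value."""
    mean_diff = mean(sample_a) - mean(sample_b)
    # Standard error of the difference, variances estimated separately
    se = math.sqrt(variance(sample_a) / len(sample_a)
                   + variance(sample_b) / len(sample_b))
    t_stat = mean_diff / se
    # Large-sample approximation: treat t_stat as standard normal
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return mean_diff, t_stat, p_value

# Simulated, skewed "ages" purely for illustration (not real data)
rng = random.Random(0)
ages_period_1 = [rng.gammavariate(4, 9) for _ in range(1700)]
ages_period_2 = [rng.gammavariate(4, 10) for _ in range(1300)]

diff, t_stat, p_value = welch_t_large_sample(ages_period_1, ages_period_2)
print(f"mean difference = {diff:.2f} years, t = {t_stat:.2f}, p = {p_value:.3g}")
```

With SciPy, `scipy.stats.ttest_ind(ages_period_1, ages_period_2, equal_var=False)` gives the Welch test with the exact t distribution.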

By the way, keep in mind that "statistically significant" does not mean relevant, large, or important. It just means that one
rejects the null hypothesis "in the population the effect is exactly 0.00000000000000". With such a large sample size as
yours (i.e. a small standard error of the estimate), descriptive statistics are more interesting than tests of significance,
in my humble opinion.
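
To illustrate what I mean by descriptive statistics, one option is to report the difference between two proportions together with a confidence interval rather than only a p-value. The sketch below uses the simple Wald interval; the function name and the alcohol counts are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_difference(events_1, n_1, events_2, n_2, level=0.95):
    """Difference p2 - p1 with a Wald confidence interval."""
    p_1 = events_1 / n_1
    p_2 = events_2 / n_2
    diff = p_2 - p_1
    # Standard error of the difference between two independent proportions
    se = math.sqrt(p_1 * (1 - p_1) / n_1 + p_2 * (1 - p_2) / n_2)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95 %
    return diff, diff - z * se, diff + z * se

# Hypothetical: alcohol involved in 300 of 1700 cases in period 1
# and in 200 of 1300 cases in period 2 (not real data)
diff, lower, upper = proportion_difference(300, 1700, 200, 1300)
print(f"difference = {diff:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval like this shows how large the change plausibly is, which a bare "significant / not significant" verdict does not.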

With kind regards
