# 5-year average data: statistically significant increase?

#### EpiHealthAH

##### New Member
Here is the scenario:
I have 5 year average information for # of reported Campylobacter cases (a bacterial gastrointestinal illness) for a particular region.

Number of cases from April to June, 2014-2019

2014 = 2 cases
2015 = 5 cases
2016 = 0 cases
2017 = 6 cases
2018 = 11 cases

This makes the 5-year average for April-June, 2014-2018, equal to 4.8 cases.

2019 = 10 cases

I want to know whether the 10 cases in 2019 represent a statistically significant increase relative to the data collected over the previous 5 years. I have some biostats classes under my belt, but it has been a while, so I need some guidance on how to get started (I understand p-values, etc.). I have access to Stata if this can easily be calculated with that software.

Thank you!!!


#### Karabiner

##### TS Contributor
Does it make any practical difference whether the result is "statistically significant" or "not statistically significant"?

In addition, regarding the very small number of events, a "non-significant" result could easily be attributed to poor power.
In addition, regarding the very small number of events, which consequences would a "statistically significant" result have?

And if the testing was maybe considered only after the fact (i.e. after some seemingly interesting pattern was noticed in some data), a "statistically significant" result would be very dubious.

Just curious about the context, and why statistical significance testing is invoked here.

With kind regards

Karabiner

#### ondansetron

##### TS Contributor
To best help and provide advice, as @Karabiner has started to do (and question), would you explain what you know about p-values and why that is your proposed solution to answer this question?

#### GretaGarbo

##### Human
Code:
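# An R program: April-June case counts for the five baseline years
da <- read.table(header = TRUE, text = "
year event
2014  2
2015  5
2016  0
2017  6
2018  11 ")

# Plot the counts against year
plot(da$year, da$event)

# Poisson regression of the count on year (log link) to test for a trend
summary( glm(event ~ year, data = da, family = poisson(link = "log") ) )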

There seems to be a statistically significant trend:

Code:
Call:
glm(formula = event ~ year, family = poisson(link = "log"), data = da)

Deviance Residuals:
1         2         3         4         5
0.20914   1.29673  -2.83895  -0.06906   0.49160

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -857.6195   325.3241  -2.636  0.00838 **
year           0.4261     0.1613   2.642  0.00825 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 17.828  on 4  degrees of freedom
Residual deviance: 10.031  on 3  degrees of freedom
AIC: 28.034

Number of Fisher Scoring iterations: 5

Karabiner said:
And if the testing was maybe considered only after the fact (i.e. after some seemingly interesting pattern was noticed in some data), a "statistically significant" result would be very dubious.

This is important to consider. The hypothesis may have been suggested by the data (and not the other way around, as is usually assumed in statistical testing).
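
For readers without R, the Poisson trend fit above can be cross-checked with a short Python sketch that uses only the standard library. It fits the same log-linear model by Fisher scoring and then extrapolates the fitted trend to 2019. The function name `fit_poisson_trend` and the centering of the year variable are choices made for this illustration, not part of the original R program, and the extrapolation assumes the 2014-2018 trend simply continues.

```python
import math

# April-June case counts for the five baseline years, as given in the thread
years = [2014, 2015, 2016, 2017, 2018]
cases = [2, 5, 0, 6, 11]


def fit_poisson_trend(x, y, max_iter=50, tol=1e-10):
    """Fit log(mu) = a + b * (x - mean(x)) by Fisher scoring.

    Hypothetical helper for illustration. Returns (a, b, se_b, x_mean),
    where a is the intercept at the centered year.
    """
    x_mean = sum(x) / len(x)
    xc = [xi - x_mean for xi in x]
    a, b = math.log(sum(y) / len(y)), 0.0  # start from the constant-rate model

    def information(a, b):
        # Fitted means and the entries of the 2x2 Fisher information matrix
        mu = [math.exp(a + b * xi) for xi in xc]
        i00 = sum(mu)
        i01 = sum(m * xi for m, xi in zip(mu, xc))
        i11 = sum(m * xi * xi for m, xi in zip(mu, xc))
        return mu, i00, i01, i11

    for _ in range(max_iter):
        mu, i00, i01, i11 = information(a, b)
        # Score vector (gradient of the log-likelihood)
        u0 = sum(yi - m for yi, m in zip(y, mu))
        u1 = sum((yi - m) * xi for yi, m, xi in zip(y, mu, xc))
        # Solve the 2x2 system: step = information^{-1} * score
        det = i00 * i11 - i01 * i01
        step_a = (i11 * u0 - i01 * u1) / det
        step_b = (i00 * u1 - i01 * u0) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:
            break

    # Standard error of the slope from the inverse information at the solution
    _, i00, i01, i11 = information(a, b)
    se_b = math.sqrt(i00 / (i00 * i11 - i01 * i01))
    return a, b, se_b, x_mean


intercept_c, slope, slope_se, year_mean = fit_poisson_trend(years, cases)
expected_2019 = math.exp(intercept_c + slope * (2019 - year_mean))

print(f"slope per year: {slope:.4f} (SE {slope_se:.4f})")
print(f"trend-based expected count for 2019: {expected_2019:.1f}")
```

The slope and its standard error should agree with the `year` row of the R output. If the trend model is taken at face value, the expected count for 2019 comes out at around 14.5, so the 10 observed cases would sit below the fitted trend rather than above it. With only five data points and a residual deviance of 10.031 on 3 degrees of freedom, both the trend and its extrapolation should be treated with caution.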

#### EpiHealthAH

##### New Member
I have a list of about 40 different diseases with this information. I would like to state in the quarterly report which diseases we are seeing a significant increase in. I figured the small number of events would be an issue. If there is a significant increase in certain diseases we may try and find an explanation for why there is a significant increase (increase in testing, outbreak, etc).
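
As a rough illustration of the comparison being asked about, the Python sketch below computes the probability of seeing 10 or more cases if the true April-June rate were exactly the 5-year average of 4.8. This is only one possible screening rule and rests on strong assumptions: a constant rate, Poisson-distributed counts, and a baseline mean treated as known rather than estimated from five observations. The names `poisson_upper_tail`, `baseline` and `observed_2019` are made up for this example.

```python
import math
import statistics

# Baseline April-June counts (2014-2018) and the 2019 count from the thread
baseline = [2, 5, 0, 6, 11]
observed_2019 = 10

baseline_mean = statistics.mean(baseline)     # 4.8, the 5-year average
baseline_var = statistics.variance(baseline)  # sample variance of the baseline


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam). Hypothetical helper."""
    cdf_below_k = sum(
        math.exp(-lam) * lam**i / math.factorial(i) for i in range(k)
    )
    return 1.0 - cdf_below_k


# One-sided tail probability under the constant-rate assumption
p_value = poisson_upper_tail(observed_2019, baseline_mean)

print(f"baseline mean {baseline_mean:.1f}, sample variance {baseline_var:.1f}")
print(f"P(X >= {observed_2019} | rate = {baseline_mean:.1f}) = {p_value:.4f}")
```

Under these assumptions the one-sided tail probability is roughly 0.025. Two caveats apply. The baseline sample variance is far larger than the baseline mean, which suggests the counts vary more than a constant-rate Poisson model allows, so the tail probability is probably too small. And when about 40 diseases are screened at a 0.05 threshold, around two would be expected to be flagged by chance alone even if nothing had changed.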

#### EpiHealthAH

##### New Member
I have a list of about 40 different diseases with this information. I would like to state in the quarterly report which diseases we are seeing a significant increase in. I figured the small number of events would be an issue. If there is a significant increase in certain diseases we may try and find an explanation for why there is a significant increase (increase in testing, outbreak, etc).
I'm not sure if p-values are the solution. I was just stating that I have knowledge of terminology so someone wouldn't feel like they had to explain the basics.

#### ondansetron

##### TS Contributor
This leads me further down the path: why is it that you want to report significance? What do significance and a p-value mean from your perspective (i.e. practically and definitionally)?
Small numbers of events aren't necessarily an issue. But I think hearing your answers might provide insight into the practical intent, and then we can provide some suggestions.