# Compare election turnout percentage changes to detect election fraud, if there is any

#### stevevaius

##### New Member
Hi all, I have two datasets from the two rounds of a presidential election. In the first round, there were 5 candidates and close to 1000 polling stations. I have all votes for each candidate and each polling station's registered voter numbers, with absentees. The election went to a second round in which only two candidates competed, and I have the same type of data for that round. Now I would like to test whether the turnout numbers for each polling station changed in a statistically significant way in the second round. There are certainly changes in the turnout numbers, but how can they be tested statistically so that I can say something about them?

#### Karabiner

##### TS Contributor
Not sure what you precisely mean by "changed statistically" or "statistically tested".
If you have 1000 polling stations, and hundreds of thousands of voters, you can
compute descriptive statistics about turnout in elections 1 and 2. What else do you
have in mind, or which model or exact hypothesis do you want to test?
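
As a rough illustration of the descriptive step mentioned above, the sketch below computes turnout per polling station in each round and summarises the change. The station IDs and counts are made-up toy values, the tuple layout and the function name `turnout_changes` are assumptions for this example, and turnout is taken to be ballots cast divided by registered voters.

```python
from statistics import mean, median, stdev

# Hypothetical toy data, not real results:
# (station_id, registered_r1, ballots_r1, registered_r2, ballots_r2)
stations = [
    ("A", 1000, 700, 1000, 720),
    ("B", 800, 560, 800, 540),
    ("C", 1200, 900, 1200, 930),
    ("D", 500, 300, 500, 450),
    ("E", 950, 665, 950, 680),
]

def turnout_changes(rows):
    """Return {station_id: round-2 turnout minus round-1 turnout, in percentage points}."""
    changes = {}
    for station_id, reg1, ballots1, reg2, ballots2 in rows:
        turnout1 = 100.0 * ballots1 / reg1
        turnout2 = 100.0 * ballots2 / reg2
        changes[station_id] = turnout2 - turnout1
    return changes

changes = turnout_changes(stations)
values = list(changes.values())

# Basic descriptive statistics of the station-wise change.
summary = {
    "mean": mean(values),
    "median": median(values),
    "sd": stdev(values),
    "min": min(values),
    "max": max(values),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The change is expressed in percentage points rather than as a relative percent change, which avoids dividing by a small round-1 turnout; either definition could be used, as long as it is applied consistently.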

With kind regards

Karabiner

#### stevevaius

##### New Member
Thank you again for your reply. Between the two election rounds there are changes, such as in voter turnout numbers and in the votes for each candidate (instead of 5 candidates, there were two candidates in the second round). Which statistical test is appropriate (paired-samples t-test, independent-samples z-test, etc.) to test, for each polling station, whether the percentage changes in voter turnout between the two rounds are normal or not? (I would assume normality with a mean difference of 0 between the percentage changes, but I am not sure whether I can assume this.)

#### Karabiner

##### TS Contributor
Do you mean that you want to test whether the polling station-wise distribution of the changes follows
some kind of assumption, or violates that assumption, and/or whether there are polling stations with
conspicuous values? So, if your unit of observation is the single polling station, and your dependent
variable(s) are numerical as in "% change of ... [whatever]" or "frequency change of... [whatever]",
then you could start with visualization: histograms, boxplots, Q-Q-plots, P-P-plots. The main question
is whether distributional assumptions (like normality) are plausible for this kind of data (personally, I do
not know that - but worldwide, studies on election fraud do exist, and might present examples
you can follow).
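
A minimal sketch of the numbers behind two of the plots mentioned above, using only the Python standard library: the point pairs for a normal Q-Q plot, and the stations a boxplot would draw as outliers. The input values are hypothetical, the function names `qq_points` and `boxplot_outliers` are made up for this example, and the 1.5 x IQR fence is only the usual boxplot convention, not a fraud criterion. A value flagged this way is a station worth a closer look, nothing more.

```python
from statistics import NormalDist, mean, quantiles, stdev

def qq_points(values):
    """Pair each sorted value with the normal quantile expected at its rank.

    Returns a list of (theoretical_quantile, observed_value) tuples.
    """
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    # (i - 0.5) / n is one common choice of plotting position.
    return [(fitted.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

def boxplot_outliers(values, k=1.5):
    """Return the values outside the boxplot fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical station-wise changes in turnout (percentage points).
changes = [2.0, -2.5, 2.5, 30.0, 1.6, 0.4, -1.1, 3.2]

points = qq_points(changes)
outliers = boxplot_outliers(changes)

for theoretical, observed in points:
    print(f"{theoretical:8.2f} {observed:8.2f}")
print("outside the boxplot fences:", outliers)
```

If the changes were roughly normal, the pairs would lie close to the line where the theoretical and observed values are equal; points that bend away from that line at the ends indicate heavier tails or isolated extreme stations.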

With kind regards

Karabiner