# Basic R student question

#### Eomund

##### New Member
Heya,

I'm studying International Studies in Management at a German university and have run into "R".
We're not supposed to immerse ourselves fully in the program, as it is just an excursion during this semester (and about half my grade for Market Research).
I'm supposed to do an analysis of variance with R and must admit, I'm clueless.
I successfully created a CSV, which R can read, but from there I'm absolutely lost.

The whole thing is about a questionnaire that asked how afraid different people were of the health risks that go hand in hand with the consumption of energy drinks.
We had 4 different age groups: 16-19, 20-24, 25-30 and 30+.
The answers ranged from 1 (unafraid) to 6 (very afraid).

I attached both CSV files: one with the whole list of answers people gave, and the other with a simplified table of "total afraids" and "total unafraids".

Now if anyone can help me take this step by step to the ANOVA, I'd more than appreciate it, because basically "this" is all I need to do in "R" for my whole studies.

EDIT: It doesn't actually matter which of the two files gets me to the result. Of course, a solution for the "bigger" data frame that counts an answer of 4 or more as "afraid" and 3 or less as "unafraid" would be more elegant.

Eomund!

#### Eomund

##### New Member
Maybe I wasn't specific enough. My problem is more that I don't know how to make "R" recognize my variables.
How can I make R understand that there are 4 age groups, and that a value of 3 or below means unafraid and a value of 4 or more means afraid?

How can I make it understand that there are X people afraid and Y people unafraid?

I've been reading up on ANOVA, but this is simply beyond what I can comprehend at the moment. I'll keep reading, but if you could point me in the right direction or tell me which functions I should look up in detail, I'd be forever indebted to you *cry*.

Edit: Don't get me wrong, I normally read my way through these things to come up with solutions, I've just not touched this program before :/
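
For what it's worth, the "make R recognize my variables" part could be sketched roughly as below. This is only an illustration: the data frame `answers`, its column names `AgeGroup` and `Score`, and the six rows of values are all made up here, and in practice the data frame would come from `read.csv()` on the real file. `factor()` tells R that a column is a grouping variable, and `ifelse()` turns the 1-6 score into afraid/unafraid.

```r
# Hypothetical raw answers, one row per respondent (made-up values;
# with a real file this would be something like read.csv("yourfile.csv"))
answers <- data.frame(
  AgeGroup = c("16-19", "20-24", "20-24", "25-30", "30+", "16-19"),
  Score    = c(2, 5, 3, 4, 1, 6)
)

# Tell R that AgeGroup is a grouping variable with 4 levels
answers$AgeGroup <- factor(answers$AgeGroup,
                           levels = c("16-19", "20-24", "25-30", "30+"))

# Scores of 4 to 6 count as afraid (1), scores of 1 to 3 as unafraid (0)
answers$Afraid <- ifelse(answers$Score >= 4, 1, 0)

# Cross-tabulate: how many unafraid (0) and afraid (1) per age group
counts <- table(answers$AgeGroup, answers$Afraid)
print(counts)
```

`str(answers)` is a quick way to check how R currently sees each column (number, text or factor).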

#### jhartsho

##### New Member
The problem is the way your variables are set up in the table. You're comparing a single column to a single column. For some reason I can't attach my csv file so I'll type out how I did it:

| Age | Status | Count |
|-----|--------|-------|
| 16  | Afraid | 2     |
| 20  | Afraid | 12    |
| 25  | Afraid | 4     |
| 30  | Afraid | 0     |
| 16  | Un     | 1     |
| 20  | Un     | 23    |
| 25  | Un     | 6     |
| 30  | Un     | 2     |

I hope it's easy to see that those are 3 separate columns (the Status one is just so I can easily see which counts belong to which). R can recognize your age groups as long as Age is treated as a factor rather than a plain number, so all you have to do is write your code:

Code:
age.aov <- aov(Count ~ factor(Age), data = table)  # factor() treats the Age codes as groups, not as one continuous number
summary(age.aov)
My concern is with your analysis. You are working on a scale of 1-6 where the answers have a meaningful order. That should really be analyzed as a multinomial or, since the answers are ordered, an ordinal logistic regression. I don't know if your prof specifically asked for it to be analyzed as an ANOVA but, in this case, that really wouldn't be the appropriate method. As a side note, the data table for such a logistic regression has to be set up differently from what I gave you above. Hopefully this helps.

Edit: typo.
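
A small sketch of why it matters whether Age is stored as a number or as a factor, using the counts typed out above. The names `tab`, `num.aov` and `grp.aov` are just picked for this illustration. With Age as a number, `aov()` fits a single slope (1 degree of freedom); with `factor(Age)` it compares the four groups (3 degrees of freedom).

```r
# The counts from the table in this post
tab <- data.frame(
  Age    = rep(c(16, 20, 25, 30), times = 2),
  Status = rep(c("Afraid", "Un"), each = 4),
  Count  = c(2, 12, 4, 0, 1, 23, 6, 2)
)

# Age as a plain number: one slope, so 1 degree of freedom for Age
num.aov <- aov(Count ~ Age, data = tab)

# Age as a factor: four groups, so 3 degrees of freedom for Age
grp.aov <- aov(Count ~ factor(Age), data = tab)

summary(num.aov)
summary(grp.aov)
```

The Df column of the two summaries shows the difference directly.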

#### Eomund

##### New Member
I changed the table to this:

https://www.dropbox.com/s/35mvay0zgwxbdd6/Energy drink.csv

and went with the following approach (successful):

model_eomund <- aov(Afraid ~ Age, data = data)
summary(model_eomund)
summary(glm(Afraid ~ Age, data = data, family = binomial))

Swapped to 1=afraid, 0=unafraid

Evaluated the thing with plot(data)
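
For anyone following along without the CSV, the 1 = afraid / 0 = unafraid approach could look roughly like this. It is a sketch on made-up data: the data frame `energy` and its 24 values are hypothetical and are not the survey results. It fits the same kind of `glm(..., family = binomial)` model and then asks for the predicted share of "afraid" per age group.

```r
# Hypothetical person-level data, 6 respondents per age group (made-up values)
energy <- data.frame(
  Age    = factor(rep(c("16-19", "20-24", "25-30", "30+"), each = 6)),
  Afraid = c(1, 0, 0, 1, 0, 1,
             0, 0, 1, 0, 0, 1,
             1, 1, 0, 1, 0, 1,
             0, 1, 1, 1, 0, 1)
)

# Logistic regression: probability of being afraid by age group
fit <- glm(Afraid ~ Age, data = energy, family = binomial)
summary(fit)

# Likelihood-ratio test for an overall age-group effect
anova(fit, test = "Chisq")

# Predicted probability of being afraid in each age group
groups <- data.frame(Age = factor(levels(energy$Age), levels = levels(energy$Age)))
pred <- predict(fit, newdata = groups, type = "response")
print(round(pred, 2))
```

Because Age is the only predictor and it is a factor, the predicted probabilities are simply the share of afraid respondents in each group.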

#### jhartsho

##### New Member
Awesome! Glad you got around to the glm.