Correlation between training events

#1
We offer 5 different training programs (A, B, C, D and E), and students often take more than one. For those who take multiple programs, how are the programs related? For example, 75% of students who took E also took A and C, or no student took D unless they also took C.

I would like to do this in R. I guess it is not regression, but maybe some form of probability.

What I am trying to determine is this: if we want to increase sales of course C, which courses help drive that? (I understand there are other factors, but this model is based solely on related courses.)
 

Karabiner

TS Contributor
#2
Are you looking for correlation coefficients for binary variables (e.g. "A yes/no" vs. "E yes/no")?
In that case, there are several options: the phi coefficient, Cramér's V, the contingency coefficient C,
and Goodman and Kruskal's lambda.
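
As a rough illustration of the first option, here is a minimal R sketch of the phi coefficient for two programs. The `enroll` data frame is made-up toy data (one row per student, 1 = took the program), not real enrollment figures. It relies on the fact that, for two 0/1 variables, phi equals the ordinary Pearson correlation.

```r
# Toy enrollment data (made up for illustration): one row per student,
# 1 = took the program, 0 = did not.
enroll <- data.frame(
  A = c(1, 1, 1, 0, 1, 0, 1, 1),
  C = c(1, 1, 0, 0, 1, 1, 1, 0),
  E = c(1, 1, 0, 0, 1, 0, 0, 0)
)

# 2x2 contingency table for A vs. E
tab <- table(A = enroll$A, E = enroll$E)

# For two 0/1 variables, the phi coefficient equals the Pearson correlation
phi_AE <- cor(enroll$A, enroll$E)

# The same value from the cell counts:
# (n11 * n00 - n10 * n01) / sqrt(product of the four marginal totals)
phi_manual <- (tab["1", "1"] * tab["0", "0"] - tab["1", "0"] * tab["0", "1"]) /
  sqrt(prod(rowSums(tab), colSums(tab)))

# Pairwise phi coefficients for all programs at once
phi_matrix <- cor(enroll)
print(round(phi_matrix, 3))
```

Note that phi is symmetric: it says how strongly A and E go together, not whether taking A leads to taking E.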

With kind regards

Karabiner
 

hlsmith

Less is more. Stay pure. Stay poor.
#3
@Karabiner - I am guessing they are not interested in correlation coefficients; that is just what they thought they needed. This sounds like a market basket problem, given that you don't care about the ordering of the classes. Ideally, I would want some confidence intervals on these metrics, but I am not familiar enough with association rules to know how to add them.

https://www.r-bloggers.com/association-rules-and-market-basket-analysis-with-r/

http://r-statistics.co/Association-Mining-With-R.html
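
To make the market basket idea concrete for the "what drives course C" question, here is a minimal base-R sketch that scores each single-program rule {X} -> {C} by support, confidence and lift. The `enroll` data are made-up toy values, and the variable names are hypothetical. The interval on confidence is only one possible choice and is an assumption here: it treats confidence as a simple binomial proportion (students who took C among those who took X) and uses the exact interval from `binom.test()`.

```r
# Toy enrollment data (made up for illustration): 1 = took the program.
enroll <- data.frame(
  A = c(1, 1, 1, 0, 1, 0, 1, 1),
  B = c(0, 1, 0, 1, 0, 0, 1, 0),
  C = c(1, 1, 0, 0, 1, 1, 1, 0),
  D = c(0, 1, 0, 0, 1, 0, 0, 0),
  E = c(1, 1, 0, 0, 1, 0, 0, 0)
)

target <- "C"
others <- setdiff(names(enroll), target)
took_c <- enroll[[target]] == 1

# For each single-program rule {X} -> {C}:
#   support    = P(X and C)
#   confidence = P(C | X)
#   lift       = P(C | X) / P(C)
rules <- do.call(rbind, lapply(others, function(x) {
  took_x <- enroll[[x]] == 1
  n_x    <- sum(took_x)
  n_both <- sum(took_x & took_c)
  conf   <- n_both / n_x
  # Exact binomial 95% interval for the confidence (an assumed choice)
  ci     <- binom.test(n_both, n_x)$conf.int
  data.frame(
    rule       = paste0("{", x, "} -> {", target, "}"),
    support    = n_both / nrow(enroll),
    confidence = conf,
    conf_low   = ci[1],
    conf_high  = ci[2],
    lift       = conf / mean(took_c)
  )
}))

# Rules with the highest lift first
print(rules[order(-rules$lift), ])
```

A lift above 1 means students who took X took C more often than students overall. With few students per rule the intervals will be wide, so the ranking should be read with caution. For rules with several programs on the left-hand side, the `arules` package (its `apriori()` function) is the usual tool.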