Thread: Probability of association in binary variables (allergy dataset)


    I am an MD analyzing a series of patients who have undergone allergy testing. For each patient (20000 in total), 40 allergens were tested, and each result is either positive or negative. I first calculated the overall frequency of positive results for each allergen. Next, to understand the association between allergens, I looked at every pair of allergens and counted the number of patients in whom both members of the pair were positive.

    I would like to quantify the extent to which the results of each pair of allergens are associated in excess of that predicted by chance. For example, if I have allergens A and B with overall frequencies of positive results 0.1 and 0.05 in the population, and I have observed 100 out of 20000 patients positive for both, what is the probability that this is a true association?

    I am aware that a binomial distribution would be used for a single tested variable, but am not sure what to use for a pair of variables. Many thanks in advance for any help.

    Re: Probability of association in binary variables (allergy dataset)

    A crude analysis would be 2x2 tables (780 in total, one per pair of the 40 allergens) with a chi-squared or Fisher's exact test.
    Determining the appropriate significance level could be difficult, though, because
    of multiple testing.
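
    For what it's worth, here is a rough sketch of the 2x2-table idea applied to the numbers in the question (A positive in 2000 patients, B positive in 1000, both positive in 100, out of 20000). It is only an illustration under my own assumptions: the helper names two_by_two and chi2_test_2x2 are made up for this post, the test is a Pearson chi-squared without continuity correction, and the Bonferroni threshold with a family-wise alpha of 0.05 is just one simple way to handle the 780 comparisons, not necessarily the best one.

```python
import math

def two_by_two(n_total, n_a, n_b, n_both):
    """Build the 2x2 table [[A+B+, A+B-], [A-B+, A-B-]] from raw counts."""
    return [[n_both, n_a - n_both],
            [n_b - n_both, n_total - n_a - n_b + n_both]]

def chi2_test_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table.

    No continuity correction is applied (an assumption of this sketch).
    Returns the statistic and its p-value on 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            # Count expected in this cell if A and B were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Number of allergen pairs and a Bonferroni-adjusted threshold
# (0.05 is an assumed family-wise alpha, not something from the thread).
n_pairs = math.comb(40, 2)
alpha_bonferroni = 0.05 / n_pairs

# Numbers from the question: 10% of 20000 positive for A, 5% for B, 100 for both.
table = two_by_two(n_total=20000, n_a=2000, n_b=1000, n_both=100)
chi2, p_value = chi2_test_2x2(table)
odds_ratio = (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])

print("pairs:", n_pairs, "Bonferroni threshold:", alpha_bonferroni)
print("table:", table)
print("chi2:", chi2, "p:", p_value, "odds ratio:", odds_ratio)
print("significant after Bonferroni:", p_value < alpha_bonferroni)
```

    Note that in this particular example the expected number of double positives under independence is 20000 x 0.1 x 0.05 = 100, which is exactly what was observed, so the table shows no association at all (odds ratio 1). With small cell counts Fisher's exact test would be the safer choice than the chi-squared approximation.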

    Just my 2pence


