Hi all,
I have a few questions regarding statistical power and sample size. (1) Do I have to worry about sample size in a multiple logistic regression if I am using all the individuals in a population (a census) and not a sample? Let’s say that I want to see how many tourists in a resort report a complaint. This would be my dependent variable (complaint Yes/No). I have 7 independent variables, including age, sex, ethnicity, past complaint (yes/no), etc. Again, I am including all guests in a period of time (not a sample). The problem is that a very small proportion of people report a complaint (the DV): 271 reported a complaint and 32,469 did not (so fewer than 1% reported a complaint). I am concerned that some cells (combinations of categories of my independent and dependent variables) will not contain any people, since so few people said Yes for my DV. For example, the Asian category may have only 2 people, neither of whom reported a complaint. (2) Would this affect the regression and the p-values? We expected people with previous complaints to be more likely to complain, but in my regression analysis this is not statistically significant, even though the OR is 1.6. (3) Can this be affected by the low N? (4) Should I report and consider p-values at all, given that I don’t have sampling error? I would appreciate any help/ideas!
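In case it helps, here is a quick sketch of the arithmetic behind those counts. The events-per-variable figure assumes each of the 7 independent variables uses a single parameter, which is my simplification. Multi-category variables such as ethnicity would use more parameters, so the real figure would be lower.

```python
# Counts taken from the post above.
complaints = 271
no_complaints = 32_469

# Assumption: one model parameter per independent variable.
n_predictors = 7

total = complaints + no_complaints
complaint_rate = complaints / total

# Events per variable: cases of the rarer outcome per predictor.
events_per_variable = min(complaints, no_complaints) / n_predictors

print(f"total guests: {total}")
print(f"complaint rate: {complaint_rate:.2%}")
print(f"events per variable: {events_per_variable:.1f}")
```

So the number of complaints, rather than the total number of guests, is what limits how much the model can estimate.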
Thank you!