I am building a binary (0/1) logistic regression model where the response rate is very low: about 0.6% (77,000 responders out of 12,000,000 records). One suggestion was to artificially raise the response rate to 2.4% by excluding some of the 0s, and then build the model on a sample drawn from that reduced dataset. However, I then need to make sure that the final average response rate still comes out at the original 0.6% rather than 2.4%. Is there any way I can do this?
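To make the setup concrete, here is a rough sketch of the arithmetic as I understand it. It works out how many 0s would have to be kept to reach 2.4%, and then applies the standard intercept (prior) correction for a logistic model fitted on data where only the 0s were thinned at random. The function names are my own, and I am assuming the 0s are dropped purely at random; this is not a tested recipe.

```python
import math

def downsample_plan(n_pos, n_total, target_rate):
    """Number and fraction of 0s to keep so that 1s make up target_rate."""
    n_neg = n_total - n_pos
    n_neg_keep = round(n_pos * (1 - target_rate) / target_rate)
    return n_neg_keep, n_neg_keep / n_neg

def intercept_offset(true_rate, sample_rate):
    """Amount to subtract from the fitted intercept (or from each logit)."""
    return math.log(((1 - true_rate) / true_rate)
                    * (sample_rate / (1 - sample_rate)))

def adjust_probability(p, true_rate, sample_rate):
    """Map a probability scored on the thinned data back to the original scale."""
    logit = math.log(p / (1 - p)) - intercept_offset(true_rate, sample_rate)
    return 1 / (1 + math.exp(-logit))

n_pos, n_total = 77_000, 12_000_000
true_rate = n_pos / n_total            # roughly 0.6%
n_neg_keep, keep_frac = downsample_plan(n_pos, n_total, 0.024)
sample_rate = n_pos / (n_pos + n_neg_keep)

# A score equal to the thinned base rate should map back to the true base rate.
p_adj = adjust_probability(sample_rate, true_rate, sample_rate)
```

With these numbers, about 3.13 million of the 11.9 million 0s (roughly 26%) would be kept. Only the intercept is affected by this kind of thinning, so the slope coefficients would be left as fitted.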
Thanks in advance.