Binary logistic regression

#1
Dataset:
-one column of binary data (0 or 1)
-18 columns of variables whose values range roughly from -200 to 200

Aim:
To create a model that predicts the dependent variable from the independent variables, for use on future data that is not available yet.

Note:
The model doesn't need to give a prediction (0 or 1) every time. It's fine if the model says "I don't know". That would be better than giving a wrong prediction of the dependent variable.

Does anybody have an idea of how to do this? If I use binary logistic regression, I don't see how to include the "I don't know" part.
 

hlsmith

#2
Well, you don't need to use the 0.5 probability cutoff for outcome classification. You can try different cutoffs and compare the resulting confusion matrices. You could also split the predicted probabilities into 3 groups: yes, no, maybe.
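
A minimal sketch of both suggestions, in plain Python. It assumes you already have predicted probabilities from a fitted logistic regression. The function names, the example values and the 0.3/0.7 cutoffs are made up for illustration and are not taken from the posted dataset.

```python
def confusion_matrix(y_true, probs, cutoff):
    """Count (tp, fp, tn, fn) when predicting 1 for prob >= cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, probs):
        pred = 1 if p >= cutoff else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def three_way(probs, low=0.3, high=0.7):
    """Map each probability to 'yes', 'no' or 'maybe'.

    low and high are hypothetical cutoffs; anything strictly
    between them is treated as "I don't know".
    """
    labels = []
    for p in probs:
        if p >= high:
            labels.append("yes")
        elif p <= low:
            labels.append("no")
        else:
            labels.append("maybe")
    return labels


# Made-up outcomes and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1, 0, 1]
probs = [0.10, 0.45, 0.55, 0.90, 0.65, 0.35]

# Compare confusion matrices at two different cutoffs.
for cutoff in (0.5, 0.6):
    print(cutoff, confusion_matrix(y_true, probs, cutoff))

# Three-way split with an abstain band between low and high.
print(three_way(probs))
```

Widening the gap between the two cutoffs makes the model say "maybe" more often, in exchange for fewer wrong yes/no answers.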
 
#3
Thanks for your reply.

If I don't need to use the 0.5 probability cutoff for outcome classification, what should I use instead?

In the sample, the 0s and 1s are split 50/50. Actually, I can add the file here.
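
One possible way to pick the cutoffs, sketched in plain Python: with balanced classes, a band centred on 0.5 is a natural starting point, so you can widen that band until the cases the model still answers reach a target accuracy. The function names, the candidate margins, the target accuracy and the example values below are all assumptions for illustration. In practice the margin should be chosen on held-out data, not on the data used to fit the model.

```python
def coverage_and_accuracy(y_true, probs, margin):
    """Abstain when |p - 0.5| < margin.

    Returns (coverage, accuracy), where coverage is the share of
    cases that get a 0/1 answer and accuracy is computed only on
    those answered cases (None if nothing was answered).
    """
    answered = 0
    correct = 0
    for y, p in zip(y_true, probs):
        if abs(p - 0.5) < margin:
            continue  # "I don't know"
        answered += 1
        pred = 1 if p > 0.5 else 0
        if pred == y:
            correct += 1
    coverage = answered / len(y_true)
    accuracy = correct / answered if answered else None
    return coverage, accuracy


def smallest_margin(y_true, probs, target_accuracy, margins):
    """Return (margin, coverage, accuracy) for the narrowest abstain
    band that reaches target_accuracy, or None if none does."""
    for m in sorted(margins):
        coverage, accuracy = coverage_and_accuracy(y_true, probs, m)
        if accuracy is not None and accuracy >= target_accuracy:
            return m, coverage, accuracy
    return None


# Made-up held-out outcomes and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
probs = [0.10, 0.45, 0.55, 0.90, 0.65, 0.35, 0.80, 0.20]

print(smallest_margin(y_true, probs, 0.95, [0.0, 0.1, 0.2]))
```

The trade-off is explicit in the output: a wider band gives higher accuracy on the answered cases but lower coverage, so the target accuracy decides how often the model is allowed to abstain.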

Dataset:
-one column of binary data (0 or 1)
-36 columns of variables whose values range roughly from -200 to 200

I've uploaded the data here: https://ufile.io/vonpf