I have data like this. I need to find the coefficients of 7 predictors for the model that best predicts Dependent, which can only be 1 or 0.
Code:
A     B   C   D    E     F    G   Dependent
1000  20  -4  150  -567  -83  10  1
-400  35  3   78   341   45   -9  0
When "c" stands for coefficient;
Dependent is correctly predicted 1 if cA+cB+cC+cD+cE+cF+cG is positive regardless of the actual number.
Dependent is correctly predicted 0 if cA+cB+cC+cD+cE+cF+cG is negative regardless of the actual number.
I need to get the optimum coefficients for the best model at predicting dependent.
Considering your response is binary, you shall be looking at a logistic regression.
I only worked with linear regressions whether it's penalized or not, weighted or not.
When it comes to logistic regression, I'm completely clueless. What I'm really struggling with is how to make the regression care only about whether cA+cB+cC+cD+cE+cF+cG is positive or not, regardless of its value. The response should be 1 for positive and 0 for negative, and the fit should give the optimum coefficients for that.
So are you saying you're against learning Logistic regression? Because it's much more appropriate when your response is binary.
Nope. I'm completely open to learning. I just couldn't find how to make it work for my case. If you show me one example that's resembling what I'm trying to achieve, I can get it. For example would you be able to calculate the coefficients for A B C D E F G which I gave in the OP?
I believe I didn't explain myself clearly enough. To make it simpler:
1. I have 7 metrics that predict the outcome of basketball games (Win or Lose)
2. If a given metric's value is positive and the response is 1 (Win) or if that metric's value is negative and the response is 0 (Loss), that metric succeeded at predicting the outcome. Otherwise it did not. Metric values are not important as long as they predict the outcome accurately.
3. I want to make a blend of those 7 metrics that's best at predicting the outcome.
When I worked with score differential, I easily calculated it via any type of linear regression. However, I'm struggling here because the response depends on whether the metric's value is >0 or <0 and not on the value itself.
Well, I'm ashamed to admit I don't really know how it works but simple logistic regression without doing anything else achieved what I wanted to achieve. In R I simply used;
fit <- glm(Dependent~A+B+C+D+E+F+G,data=mydata,family=binomial())
summary(fit)
to get the coefficients, and the blend (with intercept) eventually improved prediction by 3.2% compared to the best single metric. Not much, but it pretty much confirms it did what I wanted it to do.
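For anyone following along, here is a rough sketch of how the fitted blend can be turned into sign-based Win/Loss predictions and compared against each metric on its own. The data below is simulated purely for illustration (the values, the sample size, and the helper names eta, pred_sign, blend_acc and single_acc are assumptions, not anything from the real data set); only the glm() call matches the one above.

```r
# Simulated stand-in for the real data (hypothetical values, not from the thread).
set.seed(1)
n <- 500
metrics <- c("A", "B", "C", "D", "E", "F", "G")
mydata <- as.data.frame(matrix(rnorm(n * 7), ncol = 7))
names(mydata) <- metrics

# Assumed for illustration: the outcome follows a noisy blend of some metrics.
score <- with(mydata, 0.8 * A + 0.5 * B + 0.3 * C + 0.2 * D + rnorm(n))
mydata$Dependent <- as.integer(score > 0)

# Same model as in the post.
fit <- glm(Dependent ~ A + B + C + D + E + F + G, data = mydata, family = binomial())

# Linear predictor: intercept + cA*A + cB*B + ... + cG*G
eta <- predict(fit, type = "link")
# Fitted probability of Dependent == 1
p <- predict(fit, type = "response")

# Predict 1 when the blend is positive, 0 otherwise.
pred_sign <- as.integer(eta > 0)
# Thresholding the probability at 0.5 gives the same rule.
pred_prob <- as.integer(p > 0.5)
same_rule <- all(pred_sign == pred_prob)

# In-sample accuracy of the blend.
blend_acc <- mean(pred_sign == mydata$Dependent)

# In-sample accuracy of each metric used alone through its sign.
single_acc <- sapply(mydata[, metrics],
                     function(x) mean(as.integer(x > 0) == mydata$Dependent))

print(same_rule)
print(blend_acc)
print(round(single_acc, 3))
```

These are in-sample accuracies, so a fairer comparison would probably score the blend on games that were held out of the fit.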
Edit: I still don't understand how the regression figured out that the response depends on whether the independent variables' values are greater than 0 or not, rather than on the actual values themselves.
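One way to read this, as a sketch of the standard logistic model rather than anything specific to this data set: the regression does not know about the sign rule at all. It models the probability of a win as a smooth increasing function of the blend, and cutting that probability at 0.5 happens to be the same as checking the sign of the blend. Here c_0 is the intercept and x_A, ..., x_G are the metric values (notation assumed for illustration):

```latex
\begin{align*}
\eta &= c_0 + c_A x_A + c_B x_B + c_C x_C + c_D x_D + c_E x_E + c_F x_F + c_G x_G \\
P(\text{Dependent} = 1) &= \frac{1}{1 + e^{-\eta}} \\
P(\text{Dependent} = 1) > \tfrac{1}{2}
  &\iff e^{-\eta} < 1
  \iff \eta > 0
\end{align*}
```

So the magnitudes still matter when the coefficients are fitted (a large positive metric pushes the probability closer to 1), but the final Win/Loss call only uses the sign of the blend including the intercept.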
Last edited by permaximum; 09-09-2016 at 03:17 PM.