Force a regression coefficient to be negative

#1
I have a regression problem using variable X to predict Y. That is, Y = c + A*X + error.


For the regression problem, A must be negative for the result to be meaningful. However, because of unknown noise or unknown factors, the regression sometimes produces a positive estimate of the coefficient A. I am struggling to find a statistical way to force the coefficient A to be negative. Do you know of any way to do this?


One approach I am considering is this: if the estimate of A comes out positive, I remove the one point that is most influential in pushing A positive, then run the regression again. By doing this iteratively, after removing a few data points, the estimate of A becomes negative. Is there any statistical method in the research literature that supports removing a few data points like this to force the regression coefficient A into a range we prefer (such as negative)? I appreciate your answer.
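To make this concrete, here is a rough sketch of the procedure I have in mind, assuming numpy and statsmodels, and using overall Cook's distance as a stand-in for "the point most influential in making A positive":

```python
import numpy as np
import statsmodels.api as sm

def drop_until_negative_slope(x, y, max_drops=10):
    """Refit OLS, dropping the highest-Cook's-distance point each round,
    until the slope estimate A is negative (or max_drops is reached)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    for _ in range(max_drops):
        res = sm.OLS(y, sm.add_constant(x)).fit()
        if res.params[1] < 0:           # params = [c, A]; stop once A < 0
            break
        cooks_d = res.get_influence().cooks_distance[0]
        keep = np.arange(len(x)) != np.argmax(cooks_d)
        x, y = x[keep], y[keep]         # remove the most influential point
    return res, x, y
```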
 

rogojel

TS Contributor
#3
hi,
I think you are making a mistake by mixing regression with physical meaning. The coefficients just give the line/plane that best fits the cloud of measured points you have; trying to give them a physical interpretation is overstretching the model.

E.g., I had well-performing linear models of the weight of a cube with the three dimensions as the IVs. Every child knows that in reality the weight is proportional to the product of the three dimensions; nevertheless, a linear combination worked perfectly well as a model for my data.
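A quick numerical illustration of this point (hypothetical data, numpy only): over a narrow range of dimensions the product l*w*h is nearly linear, so a purely linear model fits almost perfectly even though the "true" physics is multiplicative.

```python
import numpy as np

rng = np.random.default_rng(0)
dims = rng.uniform(8.0, 12.0, size=(200, 3))              # l, w, h around 10
weight = 0.8 * dims.prod(axis=1) + rng.normal(0, 5, 200)  # true model: c*l*w*h + noise

X = np.column_stack([np.ones(len(dims)), dims])           # linear model with intercept
coef, *_ = np.linalg.lstsq(X, weight, rcond=None)
resid = weight - X @ coef
r2 = 1 - (resid ** 2).sum() / ((weight - weight.mean()) ** 2).sum()
print(f"R^2 of the purely linear fit: {r2:.3f}")          # close to 1 on this range
```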

Regards
 

noetsi

Fortran must die
#4
I am not sure that finding a statistical way to make the numbers negative makes any sense. Substantively, not statistically, you know that positive numbers make no sense, so you simply disregard them. It is often noted that results with high p-values are dangerous to interpret anyway, and I suspect that the nonsensical numbers in your results have very high p-values.
 

CowboyBear

Super Moderator
#5
I have a regression problem using variable X to predict Y. That is, Y = c + A*X + error.


For the regression problem, A must be negative for the result to be meaningful.
I'm not 100% sure what you're saying here. Do you have some substantive knowledge that allows you to be sure that A must be negative? What is it that you're actually modelling here?

One approach I am considering is this: if the estimate of A comes out positive, I remove the one point that is most influential in pushing A positive, then run the regression again. By doing this iteratively, after removing a few data points, the estimate of A becomes negative. Is there any statistical method in the research literature that supports removing a few data points like this to force the regression coefficient A into a range we prefer (such as negative)? I appreciate your answer.
No, don't do this. Deleting data to get the result you want is not a sound way to address this problem.

If you really know that the true parameter A can only be negative, a sensible way to incorporate this knowledge into your model is to estimate a Bayesian regression model with a prior distribution on A that places zero probability on A being positive: e.g., a uniform distribution on (-∞, 0], the negative of a folded-normal distribution, or something else along these lines.
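A minimal sketch of one such prior, the negative of a half-normal (a special case of the folded normal) on A, assuming PyMC; the data here are made up purely for illustration:

```python
import numpy as np
import pymc as pm

# Hypothetical data standing in for the real X and Y.
rng = np.random.default_rng(42)
X = rng.normal(size=100)
Y = 2.0 - 1.5 * X + rng.normal(scale=0.5, size=100)

with pm.Model() as model:
    c = pm.Normal("c", mu=0, sigma=10)        # intercept
    A_mag = pm.HalfNormal("A_mag", sigma=5)   # |A|, always positive
    A = pm.Deterministic("A", -A_mag)         # A is negative by construction
    sigma = pm.HalfNormal("sigma", sigma=5)   # residual scale
    pm.Normal("Y_obs", mu=c + A * X, sigma=sigma, observed=Y)
    trace = pm.sample(1000, tune=1000, random_seed=42)

print(float(trace.posterior["A"].mean()))     # posterior mean of A, guaranteed < 0
```

Because the sampler only ever sees the positive magnitude |A|, every posterior draw of A is negative by construction; no data deletion is needed.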