# Cope with distributions that may take negative numbers

#### davidvansan

##### New Member
Hi to everyone,

I would like to know how to cope with distributions that may take negative values when that is physically impossible. For instance, the lifetime distribution of a piece of equipment obviously cannot be negative, but the distribution could sometimes give negative numbers. What can I do to avoid that?

David

#### Dason

I don't quite understand your question. Can you elaborate a little bit more?

#### davidvansan

##### New Member
I mean, what precautions should I take to deal with distributions that might take negative numbers, when these negative numbers do not make sense?

#### Dason

Maybe if you provide a concrete example and what you think is causing an issue it would be easier to address the issue. But if you're modeling data that must be positive then maybe it would be a good idea to just use a distribution that only has support on the positive numbers?

#### davidvansan

##### New Member
Let's say that the lifetime of a piece of equipment is normally distributed with a mean of 5 months and a standard deviation of 1.5 months. Sometimes it can take negative values, which is impossible. Could I do something to avoid this situation? Or what should I do with this data?

#### Dason

Then it's not actually normally distributed. But if the normal provides a suitable approximation for that case then I don't really see the issue. Sure it's possible for a N(5, 1.5^2) to go negative but it's a very low probability. Once again you haven't really said *why* this is an issue for you. Are you just bothered by the possibility that a normal random variable can be negative and your variable has to be positive?
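To put a number on "very low probability", here is a minimal sketch that evaluates the normal CDF at zero for the N(5, 1.5^2) example. The helper name `normal_cdf` is made up for illustration; it only relies on `math.erf` from the standard library.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd^2), written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Probability that a lifetime modeled as N(5, 1.5^2) comes out negative.
p_negative = normal_cdf(0.0, mean=5.0, sd=1.5)
print(f"P(X < 0) = {p_negative:.6f}")
```

Zero sits about 3.33 standard deviations below the mean, so this comes out at roughly 4 in 10,000.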


#### GretaGarbo

##### Human
You could use a gamma distribution or lognormal distribution. Then it can not take any negative values. And for larger means it will kind of look like a normal distribution. It will look kind of symmetric, although strictly it is not.
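As a rough sketch of that suggestion, the parameters of a gamma or lognormal can be chosen by moment matching so that they have the same mean (5) and standard deviation (1.5) as the normal in the example. Moment matching is an assumption made here for illustration, not the only way to pick the parameters. The draws use `random.gammavariate` and `random.lognormvariate` from the standard library.

```python
import math
import random

mean, sd = 5.0, 1.5

# Gamma: mean = shape * scale, variance = shape * scale**2.
shape = (mean / sd) ** 2
scale = sd ** 2 / mean

# Lognormal: mu_log and sigma_log describe the normal distribution of log(X).
sigma_log = math.sqrt(math.log(1.0 + (sd / mean) ** 2))
mu_log = math.log(mean) - 0.5 * sigma_log ** 2

rng = random.Random(0)  # fixed seed so the sketch is reproducible
gamma_draws = [rng.gammavariate(shape, scale) for _ in range(10_000)]
lognormal_draws = [rng.lognormvariate(mu_log, sigma_log) for _ in range(10_000)]

# Both distributions have support on the positive numbers only.
print(min(gamma_draws) > 0, min(lognormal_draws) > 0)
```

Both sets of draws stay strictly positive while keeping the mean and spread of the original example.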

#### hlsmith

##### Less is more. Stay pure. Stay poor.
This seems more like a data quality issue. Are you getting impossible data values? If so, you should check your protocols and the functions related to data generation.

Otherwise refer to GG and Dason's posts if you are just theorizing.

#### davidvansan

##### New Member
As you are saying, I'm just theorizing, and I would like to avoid negative numbers or take some precautions against them, for example in a normal distribution.

#### Dason

You still haven't actually told us what the actual concern is. Are you generating random numbers from the distribution and if you get a negative then everything blows up? Are you just worried because theoretically a normal can take a negative value, so using any method/model that assumes a normal distribution for the (conditional) response allows for the possibility of a negative result, and you know that can't happen? If so, why does what the model theoretically allows matter so much to you if it only allows it with an almost non-existent probability?

The usual concern would be if you were using a method that *requires* the response be positive but in your data every now and then you might have a negative (or zeros) either due to impure data or something else weird going on. That is a different issue entirely. But I think that isn't what you're worried about and you're jumping the gun on being concerned over the theory that the model theoretically allows for negative values even though you know they shouldn't/can't happen.

#### davidvansan

##### New Member
> The usual concern would be if you were using a method that *requires* the response be positive but in your data every now and then you might have a negative (or zeros) either due to impure data or something else weird going on.

That's my problem.

#### Dason

> Maybe if you provide a concrete example

So like are you using a Poisson regression and sometimes you have negative counts or something? Please provide some actual details on what you're doing. I've been trying to get actual details because, to be blunt, it's still not 100% clear to me what your issue is; the way things have been worded makes it a little ambiguous. If you provide a concrete example, that will clear things up and then we can move on from there.

#### davidvansan

##### New Member
> the lifetime of a piece of equipment is normally distributed with a mean of 5 months and a standard deviation of 1.5 months.

Can I truncate the distribution just to avoid those negative numbers?
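For reference, truncating at zero could look something like the sketch below, which uses simple rejection sampling: draw from the normal and redraw whenever the value is not positive. The function name `sample_truncated_normal` and its parameters are hypothetical, chosen only for this illustration.

```python
import random

def sample_truncated_normal(mean, sd, lower=0.0, rng=random):
    """Draw from Normal(mean, sd^2) conditioned on the value exceeding `lower`.

    Rejection sampling: keep drawing until a value lands above the bound.
    """
    while True:
        x = rng.gauss(mean, sd)
        if x > lower:
            return x

rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = [sample_truncated_normal(5.0, 1.5, rng=rng) for _ in range(10_000)]
print(min(draws) > 0)
```

Note that the truncated distribution is no longer exactly the original normal: dropping the negative tail shifts the mean slightly upward, although with a mean of 5 and a standard deviation of 1.5 the shift is tiny because so little probability lies below zero.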

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Well I am back again, you need to figure out why you have negative numbers and prevent them from happening. You could almost treat them like missing data and see if they are MAR (Missing (effed-up) at random) or MCAR, so you can drop them.

What type of analyses are you running? Negative values won't prevent you from using the standard normal, etc., at least for many distributions. You would just want to know why they are effed-up and whether, if left in the dataset, they are leverage points/outliers messing up your model.
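A first pass at that kind of check might be as simple as the sketch below: flag the impossible values, look at where they occur, and only then decide whether to drop them. The `lifetimes` list is made-up illustrative data, not anything from this thread.

```python
# Hypothetical lifetimes in months; the non-positive entries are the suspect ones.
lifetimes = [4.2, 6.1, -0.3, 5.0, 0.0, 3.8, 7.4]

# Record the position and value of every physically impossible observation.
invalid = [(i, x) for i, x in enumerate(lifetimes) if x <= 0]

# Keep only the observations that could actually have happened.
valid = [x for x in lifetimes if x > 0]

print(f"{len(invalid)} of {len(lifetimes)} values are non-positive: {invalid}")
```

Keeping the positions of the flagged values makes it easier to trace them back to whatever step of the data generation produced them before deciding to drop anything.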

P.S., I support Dason in that your example is too crude.