I have a data series that results from a long run of 2,950 discrete events. Roughly half of the events ended with a value of exactly -1; the other half are continuously distributed between 1 and positive infinity. There are no values between -1 and 1. A histogram would look something like this:

Code:

```
9
8 x
7 x
6 x
5 x
4 x x
3 x x
2 x x x
1 x x x x x
-1 0 1 2 3 4 5 6 7
```
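In case it helps to have something concrete to poke at, here is how I'd simulate data with that shape in Python/NumPy. The Pareto tail is purely a stand-in assumption for the continuous part; my real data just lives on [1, ∞):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2950

# Roughly half the events end at exactly -1.
losses = np.full(n // 2, -1.0)  # 1475 values

# The other half are continuous on [1, inf); a shifted Pareto is a
# hypothetical stand-in for whatever the true distribution is.
wins = 1.0 + rng.pareto(3.0, size=n - n // 2)

data = rng.permutation(np.concatenate([losses, wins]))
```

Nothing falls strictly between -1 and 1, matching the gap in the histogram.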

It was suggested to me that I partition the data set into groups of 50 random data points and sum the values within each partition. That would (should?) result in an approximately normal distribution of sums, so that I can apply standard descriptive statistics and come up with an expected value (the mean of the sums) and a confidence interval based on the distribution of the sums.
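Here is the procedure as I understand it, sketched in Python/NumPy (the simulated data with a Pareto tail is just a stand-in for my real series):

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in for the real series: half -1, half heavy-tailed above 1.
data = np.concatenate([np.full(1475, -1.0), 1.0 + rng.pareto(3.0, 1475)])

k = 50                      # suggested group size
n_groups = len(data) // k   # 2950 / 50 = 59 disjoint groups

# Shuffle, split into 59 disjoint groups of 50, and sum each group.
sums = rng.permutation(data)[: n_groups * k].reshape(n_groups, k).sum(axis=1)

mean_sum = sums.mean()
# Normal-approximation 95% interval for the expected group sum:
half_width = 1.96 * sums.std(ddof=1) / np.sqrt(n_groups)
ci = (mean_sum - half_width, mean_sum + half_width)
```

Since the 59 groups of 50 use every point exactly once, the group sums add back up to the total of the series.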

My issue is that I can't come up with a defensible reason why this will work. I don't remember anything from statistics about summing (or averaging, I suppose, since the group sizes are equal) subsets of a non-normal distribution in order to create a normal distribution. I'd also need to know whether 50 is the right group size, whether to partition the data into disjoint groups of 50 or to repeatedly draw 50 random points from the series, and, in the latter case, how many times I should do that.
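For the "repeatedly draw 50 random points" reading, the only version I know how to write down is drawing with replacement, which I believe amounts to bootstrapping the sum of 50 events (again with a Pareto tail standing in for my real data, and the resample count chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(2)
data = np.concatenate([np.full(1475, -1.0), 1.0 + rng.pareto(3.0, 1475)])

# Repeatedly draw 50 points with replacement and sum them; 10,000
# resamples is an arbitrary but common choice.
n_resamples = 10_000
sums = np.array([rng.choice(data, size=50, replace=True).sum()
                 for _ in range(n_resamples)])

# Percentile interval for the sum of 50 events; no normality assumed.
lo, hi = np.percentile(sums, [2.5, 97.5])
```

Is that the right way to think about the "random selection" variant, or am I conflating two different procedures?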

Any help would be greatly appreciated. Am I approaching this the wrong way? Am I using the wrong language to describe what's happening here?

Much obliged,

John