
Thread: Which probability distribution do I use for this, and how?

  1. #1

    Which probability distribution do I use for this, and how?

    Hi all,

    I'm not a statistician, although I have a reasonable education in mathematics.

    This isn't a homework question (I'm 37 lol), it's just something I have been working on as a personal project lately.

    The following graph represents the % payout of an online video slot game. Each data point is a group of 25 spins on this slot, for which I recorded the return to player (RTP). I then charted the results, grouped into 10% bands using a PivotTable. The data set contains 150 data points.



    As you can see, it has begun to resemble some kind of probability distribution. Initially I thought of a Poisson distribution. However, my attempts to fit a true Poisson curve to this data have failed.

    So I would like to request some help to:
    a) find out which type of probability distribution this resembles; and
    b) help me estimate the parameters that will enable me to plot the theoretical probability curve onto this experimental data set.

    To my mind, the curve I need will not be symmetrical. As you can see from the chart, there is a significant one-sided tail to the distribution. There is no tail on the left-hand side because there is a hard threshold at zero RTP.

    Any help appreciated.

    Thanks
    Dan

  2. #2

    Can't you use the normal distribution, by the central limit theorem, since your sample size is big enough? 150 data points, right?

  3. #3
    rogojel (TS Contributor)

    Hi,
    this looks like a good candidate for a logistic or log-logistic distribution. You could run a distribution identification on it to estimate the parameters. How to do that will depend on what kind of software you have.

    regards
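
    As a rough illustration of what a distribution identification does, the Python sketch below fits a few candidate distributions by maximum likelihood and ranks them by AIC (lower is better). It is only a sketch under assumptions: the 150 values are simulated rather than taken from the spreadsheet in this thread, the helper names are made up for the example, and the candidates are limited to three whose estimates have closed forms. A log-logistic fit would need a numerical optimiser, so it is not included.

```python
import math
import random


def fit_normal(xs):
    """Closed-form MLE for a normal distribution; returns (params, log-likelihood)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n  # the MLE divides by n, not n - 1
    loglik = -0.5 * n * (math.log(2 * math.pi * var) + 1)
    return {"mu": mu, "sigma": math.sqrt(var)}, loglik


def fit_lognormal(xs):
    """Fit a normal to ln(x); requires every x > 0."""
    logs = [math.log(x) for x in xs]
    params, loglik_logs = fit_normal(logs)
    # Change of variables: the density of x is the density of ln(x) divided by x.
    return params, loglik_logs - sum(logs)


def fit_exponential(xs):
    """Closed-form MLE for an exponential distribution (one parameter, the mean)."""
    n = len(xs)
    mean = sum(xs) / n
    loglik = -n * (math.log(mean) + 1)
    return {"mean": mean}, loglik


def compare_fits(xs):
    """Return {name: {"params": ..., "aic": ...}} for each candidate."""
    fits = {
        "normal": fit_normal(xs),
        "lognormal": fit_lognormal(xs),
        "exponential": fit_exponential(xs),
    }
    table = {}
    for name, (params, loglik) in fits.items():
        k = len(params)  # number of fitted parameters
        table[name] = {"params": params, "aic": 2 * k - 2 * loglik}
    return table


# Synthetic stand-in for the 150 RTP values (ASSUMPTION: not the real data).
rng = random.Random(1)
sample = [rng.lognormvariate(math.log(80), 0.6) for _ in range(150)]

results = compare_fits(sample)
for name, fit in sorted(results.items(), key=lambda kv: kv[1]["aic"]):
    print(f"{name:12s} AIC = {fit['aic']:.1f}  params = {fit['params']}")
```

    With real data you would replace `sample` with the RTP column. Any group with an RTP of exactly zero would have to be handled separately, since the lognormal fit takes logs.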

  4. #4

    I don't have any software other than Excel, unfortunately. I'm aware some software is free, like R, but I don't know how to use it.

    I just had a quick look at lognormal and that appears to fit the overall shape I'm expecting. Is that what was meant by logistic?

  5. #5
    rogojel (TS Contributor)

    Hi,
    yes, I mixed them up. I meant lognormal.

  6. #6

    Perhaps you might want to post the raw data in Excel and we could have a fiddle with it.

  7. #7

    Originally posted by katxt:
    "Perhaps you might want to post the raw data in Excel and we could have a fiddle with it."

    That would be cool, thanks. I have tried to attach the file; hopefully it works.

    The file contains two sheets. The first is for Starburst, which is a 'low variance' slot in casino terminology, and contains 150 data points. The second sheet is still being compiled and is for a different slot classed as 'medium variance'. Although it is still in development, it displays a different shape of curve.

    The key data is in the column labelled 'RTP'; the rest of the columns hold the data collected to calculate the RTP for each 25-spin group.

  8. #8

    For the Starburst sheet, if you eliminate points 3 and 11, the data is fitted very well by a lognormal. The Twin Spin data is lognormal in the middle, but not so much in the tails.
    The two diagrams are normal probability plots of the logged data.
    I think you would be lucky to find a two-parameter distribution that fits better.
    Possibly there is a three-parameter distribution which would fit the odd points in the tails.
    And, as John von Neumann famously said, "With four parameters I can fit an elephant."
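
    A rough Python sketch of how a lognormal fit like this could be turned into expected counts per 10% band, so that the theoretical curve can be charted on the same 0 to 400% RTP axis as the observed counts. Treat it as a sketch under assumptions: the 150 values are simulated, not the spreadsheet data; the function names are invented for the example; and it uses natural logs, whereas the plots use LOG(), which only rescales the fitted parameters.

```python
import math
import random
from statistics import NormalDist, mean, pstdev


def fit_lognormal(values):
    """Estimate the lognormal by fitting a normal to ln(value); zeros are skipped."""
    logs = [math.log(v) for v in values if v > 0]
    return NormalDist(mean(logs), pstdev(logs))


def expected_band_counts(log_dist, n, band_width=10, upper=400):
    """Expected number of points in each band: n * (CDF(upper edge) - CDF(lower edge))."""
    counts = []
    for lo in range(0, upper, band_width):
        hi = lo + band_width
        p_lo = log_dist.cdf(math.log(lo)) if lo > 0 else 0.0  # nothing below zero RTP
        p_hi = log_dist.cdf(math.log(hi))
        counts.append((lo, hi, n * (p_hi - p_lo)))
    return counts


def observed_band_counts(values, band_width=10, upper=400):
    """Count how many values fall in each band, like the pivot table grouping."""
    counts = [0] * (upper // band_width)
    for v in values:
        idx = int(v // band_width)
        if 0 <= idx < len(counts):
            counts[idx] += 1
    return counts


# Synthetic stand-in for the 150 RTP percentages (ASSUMPTION: not the real data).
rng = random.Random(3)
sample = [rng.lognormvariate(math.log(80), 0.6) for _ in range(150)]

log_dist = fit_lognormal(sample)
expected = expected_band_counts(log_dist, len(sample))
observed = observed_band_counts(sample)

for (lo, hi, exp_count), obs_count in zip(expected, observed):
    if lo < 150:  # print only the first bands to keep the output short
        print(f"{lo:3d}-{hi:3d}%  observed {obs_count:3d}  expected {exp_count:5.1f}")
```

    The same calculation should be possible in Excel by taking the average and standard deviation of LN(RTP) and differencing a lognormal cumulative function at the band edges.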
    [Attached images: Starburst.JPG and Twin.JPG, the two normal probability plots of the logged data]

  9. The Following User Says Thank You to katxt For This Useful Post:

    danlightbulb (05-30-2017)

  10. #9

    That's really cool, thanks!

    Would you mind explaining what the axes represent in those charts please, as they have moved away from my 0-400% RTP. I would want to be able to plot the theoretical curve on the same axis as my sample data, so that it is easily interpreted visually.

  11. #10

    The vertical axis is the LOG() of the RTP, i.e. the base-10 logarithm. LOG(400) is about 2.6.
    The horizontal axis is the standard normal, which goes from -3 to 3 because just about all standard normal values lie between these two.
    For each value in the set you are checking, its quantile (position in the whole set) is calculated and plotted against the corresponding quantile of the normal distribution. If the set is normal (more or less), the quantiles will match and the graph will be straight. Usually, of course, you are just checking a data set for normality, not the log of the set for lognormality.
    Google "normal probability plot" and look at images. The same idea can be used to check for distributions other than normal.
    Raw Excel can't do normal probability plots, but I have a simple Excel sheet which can, if you are interested.
    kat
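
    The construction described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the data are simulated rather than the thread's spreadsheet, the function names are made up, and the plotting position (i - 0.5)/n is one common convention, not necessarily the one used for the attached plots.

```python
import math
import random
from statistics import NormalDist


def normal_probability_points(values):
    """Return (standard normal quantile, log10 of value) pairs, smallest value first."""
    logs = sorted(math.log10(v) for v in values if v > 0)  # log of zero is undefined
    n = len(logs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, y in enumerate(logs, start=1):
        p = (i - 0.5) / n  # position of this value within the whole set
        points.append((std_normal.inv_cdf(p), y))
    return points


def straightness(points):
    """Pearson correlation of the plotted points; close to 1 means a straight line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic stand-in for the 150 RTP percentages (ASSUMPTION: not the real data).
rng = random.Random(2)
sample = [rng.lognormvariate(math.log(80), 0.6) for _ in range(150)]

points = normal_probability_points(sample)
print(f"horizontal axis runs from {points[0][0]:.2f} to {points[-1][0]:.2f}")
print(f"correlation of the plotted points: {straightness(points):.3f}")
```

    Plotting the first element of each pair on the horizontal axis and the second on the vertical axis gives the normal probability plot of the logged data.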
