# Thread: What model - combining data

1. ## What model - combining data

Hi,

I have attached a data file that shows 4 ways of describing the same thing. Each way is a small table of 5 numbers. If I explain what is being shown, I hope someone can advise me on what modelling process or steps I could follow to make it more accurate.
At the moment it is something I look at visually and, because I seem to understand it, I come up with my own view of what will happen. Is there a model / technique that can beat my own judgement?

This data shows one horse race, and each row is a horse. If you jump to columns U, V, W, X, Y, each column describes what position the horse will take up in a race: pos1 is nearer the front and pos5 is nearer the back.
I could do a Monte Carlo simulation, which would turn these numbers into better probabilities.
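A minimal sketch of the Monte Carlo idea, assuming each horse's 5 numbers can be used as weights for an independent draw of its running position. The horse names and figures are made up for illustration and are not taken from the attached file, and treating the horses as independent is a simplification. It estimates how often each horse ends up as the front-most runner.

```python
import random

# Made-up figures for three horses (pos1..pos5); not taken from the attached file.
position_pct = {
    "horse1": [50, 30, 10, 5, 5],
    "horse2": [20, 40, 20, 10, 10],
    "horse3": [5, 10, 20, 30, 35],
}

def simulate_front_runner(position_pct, n_runs=10_000, seed=0):
    """Estimate how often each horse is the front-most runner.

    Assumes each horse's position is drawn independently from its own row,
    which ignores any interaction between runners.
    """
    rng = random.Random(seed)
    bands = [1, 2, 3, 4, 5]
    front_counts = {horse: 0 for horse in position_pct}
    for _ in range(n_runs):
        # Draw one position band per horse, weighted by its row.
        draw = {h: rng.choices(bands, weights=w)[0] for h, w in position_pct.items()}
        front = min(draw.values())
        # Several horses can share the front band; split ties at random.
        tied = [h for h, band in draw.items() if band == front]
        front_counts[rng.choice(tied)] += 1
    return {h: count / n_runs for h, count in front_counts.items()}

front_prob = simulate_front_runner(position_pct)
for horse, p in front_prob.items():
    print(f"{horse}: {p:.3f}")
```

The same loop could record any other event of interest (for example, how many horses share pos1) by changing what is counted inside it.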

Columns P, Q, R, S, T show the position the horse runs in when it has a better run, i.e. a better race.

In this race the better-run positions seem fairly consistent with where each horse normally runs, so I don't need to change much. Sometimes, though, I see a horse that seems to want to run in a different position than the first set of figures shows, and I wondered how I could mix / combine these two tables to give me more accurate numbers than just using the first set.
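One simple way to combine two such tables is a weighted average of the two rows after scaling each to sum to 1. This is only a sketch: the function names, the example rows and the weight of 0.7 are assumptions for illustration, and in practice the weight would need to be chosen by checking it against past results.

```python
def normalise(row):
    """Scale a row of non-negative numbers so that it sums to 1."""
    total = sum(row)
    if total <= 0:
        raise ValueError("row must contain at least one positive number")
    return [x / total for x in row]

def blend(usual_row, better_run_row, weight_usual=0.7):
    """Weighted average of two position profiles (weights sum to 1)."""
    usual = normalise(usual_row)
    better = normalise(better_run_row)
    return [weight_usual * u + (1 - weight_usual) * b
            for u, b in zip(usual, better)]

# Made-up rows: where the horse usually runs vs. where it runs on a better run.
usual = [30, 20, 10, 30, 10]
better_run = [50, 50, 0, 0, 0]

combined = blend(usual, better_run, weight_usual=0.7)
print([round(p, 3) for p in combined])
```

A weight of 1.0 reproduces the first table and 0.0 reproduces the second, so the weight directly expresses how much the better-run figures are trusted.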

Going another step into the numbers, columns E, F, G, H, I show the same figures but for the jockey only. Horse1 has a jockey who complements what the horse wants to do, but horse2's jockey is actually better suited to running in position 2 than position 1. I wondered what model / technique or what steps I could use to see how table E, F, G, H, I affects the data in U, V, W, X, Y.

How do I present these figures to a model, and how do I add the result to the data? Do I produce 1 result column, with 100 being pos1, 0 being pos5, etc.? Or do I produce 1 model to predict the pos1 probability and then do the same for each position?

What steps do I take? At the moment I can read the files and interpret them, but I would like it to be an automatic process.

Can anyone suggest something / some steps, or something I can read along the same lines? Is there someone who might want to mentor me through it? I am happy to pay someone online who can open this modelling world up.

Sports data is really interesting

Rgds
Models

2. ## Re: What model - combining data

Hi,

Is there no one who can push me in one direction? If I used regression on the main 5 positions, how would I designate the result? Do I give the result as 5 numbers, 1,0,0,0,0, because this result was position 1?

Would I use a simple scale of 1 to 5, where 1 is position 1 and 2 is position 2? Would this be enough definition?

Should I use 0 to 100, where 100 is position 1, 80 is position 2, etc.?

I'm looking for someone to just let me know the best way of showing the result, and then I would try regression first?
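The three ways of writing down the result can be put side by side as below. The function names are made up for this sketch. A single numeric column (1 to 5, or 0 to 100) tells a regression that the gaps between positions are equal, whereas a class label or a 1,0,0,0,0 row treats the positions as separate categories, which is what a model that outputs one probability per position would normally expect.

```python
def one_hot(position, n_positions=5):
    """Result as one column per position, e.g. position 1 -> [1, 0, 0, 0, 0]."""
    if not 1 <= position <= n_positions:
        raise ValueError("position out of range")
    return [1 if p == position else 0 for p in range(1, n_positions + 1)]

def class_label(position):
    """Result as a single category index (0 for position 1, 4 for position 5)."""
    return position - 1

def linear_score(position, n_positions=5):
    """Result on a 0-100 scale: 100 for position 1 down to 0 for the last position."""
    return 100 * (n_positions - position) / (n_positions - 1)

for pos in range(1, 6):
    print(pos, one_hot(pos), class_label(pos), linear_score(pos))
```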

Rgds
Models

3. ## Re: What model - combining data

Hi,

Let's see if I can explain what I'm going to do, and hopefully someone can think of a better way or something different from regression.

I've got 5 buckets of different sizes, at different distances from lots of people throwing a ball.

The buckets are all connected so the ball has to land in one of the buckets. If you aim for one near the back you don't have much chance of hitting the front bucket and vice versa.

I therefore have results for, say, 100 people.

Person 1 was very random and his results show 20,20,20,20,20...

Suppose we then did another test with a different throwing distance.

I want to use both sets of results to predict which bucket someone will throw a ball into if the distance is now between the first two distances.

I was going to have a result column for each bucket and do one regression on the ball going into the first bucket, then a regression on it going into the 2nd bucket, and so on?
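An alternative to five separate regressions is a single multinomial (softmax) logistic regression, which outputs five probabilities that always sum to 1. The sketch below is a plain-Python version fitted by gradient descent on synthetic data; the data generator, learning rate and number of epochs are assumptions chosen only to make the example run, not a recommendation for the real data.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, bias, x):
    """Probability of each bucket for one row of features."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)

def fit_softmax(X, y, n_classes=5, lr=1.0, epochs=150):
    """Fit by batch gradient descent on the mean log loss."""
    n, n_features = len(X), len(X[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    bias = [0.0] * n_classes
    for _ in range(epochs):
        grad_w = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            p = predict(weights, bias, x)
            for k in range(n_classes):
                err = p[k] - (1.0 if k == label else 0.0)
                grad_b[k] += err
                for j in range(n_features):
                    grad_w[k][j] += err * x[j]
        for k in range(n_classes):
            bias[k] -= lr * grad_b[k] / n
            for j in range(n_features):
                weights[k][j] -= lr * grad_w[k][j] / n
    return weights, bias

def make_toy_data(n_rows=300, seed=1):
    """Synthetic throwers: a random 5-number profile, then one throw drawn from it."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_rows):
        raw = [rng.expovariate(1.0) for _ in range(5)]
        total = sum(raw)
        profile = [r / total for r in raw]
        X.append(profile)
        y.append(rng.choices(range(5), weights=profile)[0])
    return X, y

X, y = make_toy_data()
weights, bias = fit_softmax(X, y)
probs = predict(weights, bias, [1.0, 0.0, 0.0, 0.0, 0.0])
print([round(p, 3) for p in probs])
```

Here the label is the bucket index (0 to 4) rather than a 100,0,0,0,0 row; the two carry the same information.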

So my data actually looks like the following:

Instead of 2 tables I actually have 4 tables of data.

A row looks like 30,20,0,10,10, 20,20,30,10,10, 50,50,0,0,0, 30,20,10,30,10. If this person threw the ball and it hit bucket 1, I was going to record the result with one column per bucket, so it would look like 100,0,0,0,0.

So I have 10,000 rows of 4 x 5 sets of numbers, and I can put down the actual result of each throw; a bucket 2 result would be 0,100,0,0,0.

I was going to run a regression on one set, then add a second set (so 10 numbers) and see whether it adds information or noise, and then work through each combination, running them one at a time.

I'm expecting one set to add too much noise, another to carry little information, and possibly 2 sets which together are better than any one of them. Would I then get a formula to use alongside the data to give me a better prediction of probability?
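To judge whether an added set brings information or noise, one common check is to score each model's predicted probabilities on throws that were not used to fit it and compare the scores. The sketch below uses log loss (lower is better); the predictions and outcomes are made up purely to show the calculation, and the two "models" are hypothetical.

```python
import math

def log_loss(predicted, outcomes, eps=1e-12):
    """Mean negative log of the probability given to the bucket that was hit."""
    if len(predicted) != len(outcomes):
        raise ValueError("need one prediction per outcome")
    total = 0.0
    for probs, bucket in zip(predicted, outcomes):
        # eps guards against log(0) when a model gave the hit bucket zero probability.
        total -= math.log(max(probs[bucket], eps))
    return total / len(outcomes)

# Made-up held-out throws: index of the bucket that was actually hit (0..4).
outcomes = [0, 1, 0, 3]

# Hypothetical predictions from a model using one set of 5 numbers...
model_one_set = [
    [0.4, 0.3, 0.1, 0.1, 0.1],
    [0.4, 0.3, 0.1, 0.1, 0.1],
    [0.4, 0.3, 0.1, 0.1, 0.1],
    [0.4, 0.3, 0.1, 0.1, 0.1],
]
# ...and from a model using two sets (10 numbers).
model_two_sets = [
    [0.5, 0.2, 0.1, 0.1, 0.1],
    [0.3, 0.4, 0.1, 0.1, 0.1],
    [0.5, 0.2, 0.1, 0.1, 0.1],
    [0.2, 0.2, 0.1, 0.4, 0.1],
]
# Baseline that knows nothing: 20% for every bucket.
uniform = [[0.2] * 5 for _ in outcomes]

print("uniform  :", round(log_loss(uniform, outcomes), 4))
print("one set  :", round(log_loss(model_one_set, outcomes), 4))
print("two sets :", round(log_loss(model_two_sets, outcomes), 4))
```

A set that only adds noise would tend to show a held-out score no better than the model without it, even if the fit on the training rows improves.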

any help appreciated.

models

I wondered if anyone knew of a better method?
