I have a general understanding of the difference between a population (the set of entities under study) and a sample (a subset selected from the population). However, I've been doing some work in PPC (Pay-Per-Click) and AdWords recently, and can't seem to grasp the population/sample difference in that context.
For example, let's say there are two Google AdWords ads. Users click an ad and are taken to a form, which they can fill out if they choose to. I therefore have data on the number of clicks and the number of forms filled out. The question I'm trying to answer is which ad was more effective at getting clicks and forms filled out.
Ad1 - 300 clicks, 105 forms filled out
Ad2 - 320 clicks, 100 forms filled out
Initially, I thought that my sample was the two ads (Ad1 and Ad2), but that wouldn't be right, as I'm really examining the number of clicks and forms filled out. So it would seem that the population I'm examining is the clicks and forms associated with the two ads (Ad1 and Ad2), and my sample would be the number of clicks and the number of forms filled out. Is that right or wrong? If so, would clicks and forms be considered two separate samples taken from the same population? Or is my population the same as my sample in this case?
I think you don't actually care about clicks at all: you want your ad to result in a filled-out form; if it results in a click but no filled-out form, it was not effective. I also think you are missing a number that you do care about: the number of impressions. If ad A resulted in 5 filled-out forms from 50 impressions, it is better than ad B that resulted in 10 filled-out forms from 200 impressions.
Basically, the model is that each ad corresponds to a Bernoulli distribution, with a particular probability that an impression results in a successfully filled-out form. Your sample is the N impressions you observed, of which N1 resulted in a filled-out form; the population is all the impressions the ad could generate. Your best estimate of the probability of success in the underlying population is p = N1/N, and your error bar (one standard error) on that estimate is sqrt(p(1-p) / N). So, in my two examples, p_A = 0.100 +/- 0.042 and p_B = 0.050 +/- 0.015. I can't do that computation from your data because you don't give me the number of impressions for each ad.
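If it helps, here is a minimal Python sketch of that estimate, applied to my two made-up examples (not your data). The function name `success_rate` is my own choice for illustration, and the error bar is the simple normal-approximation standard error given above, which can be rough when the counts are small.

```python
import math

def success_rate(successes, trials):
    """Return (p, standard_error) for a Bernoulli success probability.

    p is the observed fraction successes / trials, and the standard
    error is sqrt(p * (1 - p) / trials).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    se = math.sqrt(p * (1 - p) / trials)
    return p, se

# Illustrative numbers only: (filled-out forms, impressions) per ad.
examples = {"A": (5, 50), "B": (10, 200)}

for name, (forms, impressions) in examples.items():
    p, se = success_rate(forms, impressions)
    print(f"p_{name} = {p:.3f} +/- {se:.3f}")
# prints:
# p_A = 0.100 +/- 0.042
# p_B = 0.050 +/- 0.015
```

Once you have the impression counts for Ad1 and Ad2, you could call `success_rate(105, impressions_ad1)` and `success_rate(100, impressions_ad2)` in the same way, where `impressions_ad1` and `impressions_ad2` stand for the numbers missing from your question.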