Number of responses above a threshold after considering survey error

#1
I am interested in how many people in my survey earn between $50,000 and $54,999.99. I constructed a paper survey that asks for income in $5,000 bands.

200 people respond that they earn between $45,000 and $49,999.99
20 people respond that they earn between $50,000 and $54,999.99
No one in my survey earns more than this.

I also know (from other work) that there is a 10% chance that any person gets their income wrong by one band – i.e. it will be $5,000 too high or too low.

Am I justified in saying that the number of people in the $50,000-$54,999.99 bracket is more than the 20 actually measured in my survey? Should this, in fact, be 29?
My logic is as follows. If 10% are getting the band wrong, then 20 of the 200 who told me they earn between $45,000 and $49,999.99 actually earn $5,000 more or less than this, and (assuming they are equally likely to report a band too high or too low) 10 of these actually earn between $50,000 and $54,999.99. However, only 2 of the 20 who stated that they earn between $50,000 and $54,999.99 will have got the band wrong, and only 1 of these actually belongs in the $45,000-$49,999.99 band. There should, therefore, be 20 + 10 - 1 = 29 people in the $50,000-$54,999.99 band.
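
The arithmetic above can be written out as a short Python sketch. It only reproduces the reasoning as stated (the 10% error rate applied to the reported counts, with misreports split evenly between one band up and one band down); the function and parameter names are hypothetical, and this is not a validated survey-correction method.

```python
def adjusted_upper_band(lower_reported, upper_reported, error_rate=0.10):
    """Adjust the upper-band count using the reasoning in the post.

    Assumptions (taken from the post, not established facts):
    - error_rate of each band's respondents reported the wrong band
    - wrong reports are split evenly between one band too high and one too low
    """
    # Lower-band respondents assumed to really belong one band up.
    moved_up = lower_reported * error_rate / 2
    # Upper-band respondents assumed to really belong one band down.
    moved_down = upper_reported * error_rate / 2
    return upper_reported + moved_up - moved_down


if __name__ == "__main__":
    # Counts from the survey described above: 200 lower band, 20 upper band.
    print(adjusted_upper_band(200, 20))
```

With the counts from the survey (200 and 20) this gives the 20 + 10 - 1 figure from the post.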

I'd appreciate thoughts on whether this logic is correct. Can anyone point me to a book or paper where this issue is discussed? It must be quite common in survey work.

Thanks.
Jake
 

hlsmith

Not a robit
#2
I haven't had any experience with this, but if 10% could be high or low, wouldn't it be anywhere from 18 to 38 people?

The lower end is 20 x 90% = 18, assuming none from the band below trickle in.
The upper end is (20 x 90%) + (200 x 10%) = 38, assuming all of the misreporting lower-band people trickle in.
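
A minimal sketch of the two bounds above, assuming the same 10% error rate and the same reported counts; the function name is hypothetical and the code does nothing beyond the arithmetic in this reply.

```python
def upper_band_bounds(lower_reported, upper_reported, error_rate=0.10):
    """Return (low, high) bounds for the upper band as described in the reply.

    low:  upper-band respondents who reported correctly, with nobody
          moving in from the band below
    high: the same group plus every misreporting lower-band respondent
    """
    kept = upper_reported * (1 - error_rate)
    low = kept
    high = kept + lower_reported * error_rate
    return low, high


if __name__ == "__main__":
    low, high = upper_band_bounds(200, 20)
    # Round for display so floating-point noise does not clutter the output.
    print(round(low), round(high))
```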
 
#3
Yes, I guess that's correct. The phenomenon I am interested in still stands: the actual number of people with a salary over the threshold ($50,000) is likely to be HIGHER than I have measured once I take into account survey error and the 'pressure' introduced by the shape of the distribution around this threshold. I'd really appreciate it if anyone could tell me where this might be discussed.
 

hlsmith

Not a robit
#4
I haven't seen this anywhere, but it is similar to sensitivity analysis. Sensitivity analysis is a set of procedures conducted at the end of a study that attempt to account for measurement or selection bias, along with confounding. There is a book by Fox, Lash, and Fink that discusses how to correct some statistics for these threats.


http://www.springer.com/us/book/9780387879604