My aim is to evaluate the performance of the parametric and non-parametric bootstrap under specific circumstances, and in particular their ability to capture the population mean within the confidence intervals they produce. From here on, "coverage" means the ability of the confidence interval to capture the population mean. To achieve this, I have written some macros (VBA scripts) which, in combination with functions in the spreadsheets, calculate the bootstrap confidence intervals. Let me walk through the steps of this procedure so you get a clearer view of what is happening; I will close the post with my questions.

The "circumstances" in question are an investigation of the bootstrap's performance for the Normal, Lognormal, Gamma, Negative Binomial and Poisson distributions, for sample sizes of 5, 20 and 50. Everything I describe from this point on applies only to the non-parametric bootstrap (as this is where my "problem" lies), unless I state otherwise. To simplify the explanation, let's take the case where we are testing coverage for the Gamma distribution. The original samples (the initial samples from which the bootstrap samples are drawn) come from a Gamma distribution with parameters a and b that I define myself, so we know the population mean. This is accomplished with Excel's inverse Gamma function (GAMMA.INV), passing the chosen a and b together with RAND() for the probability argument. I then copy these numbers to a different column as fixed values (via Offset in VBA) so that I can manipulate them afterwards, as described next. Since we are dealing with the non-parametric bootstrap, in a different column I use the VLOOKUP function, which with a small modification acts as the Empirical Distribution Function (EDF), through which the bootstrap samples are drawn from the original samples.
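To make this sampling step concrete, here is a rough Python sketch of what the spreadsheet does at this stage. The parameter values a = 2 and b = 3 are illustrative, not the ones from my actual runs: the inverse-CDF call mimics GAMMA.INV(RAND(), a, b), and drawing with replacement from the original sample is what the VLOOKUP trick accomplishes.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(42)
a, b = 2.0, 3.0   # illustrative Gamma shape and scale
n = 20            # original sample size

# Inverse-CDF sampling: the analogue of GAMMA.INV(RAND(), a, b) in Excel.
u = rng.random(n)
original = gamma.ppf(u, a, scale=b)

# Non-parametric resampling from the EDF: the VLOOKUP + RAND() lookup is
# equivalent to drawing original values uniformly with replacement.
boot_sample = rng.choice(original, size=n, replace=True)
print(boot_sample.mean())
```

Every bootstrap observation is, by construction, one of the original values, which is exactly the behaviour the VLOOKUP formula produces.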
Note that inside the VLOOKUP function I also use RAND(), which is needed for the EDF resampling. In this way, 1000 bootstrap samples are created, each of the same size as the original sample (I chose 1000 because the literature indicates it as the "default" number for producing accurate confidence intervals). I then calculate the mean of each bootstrap sample and store these means in a column. I am specifically interested in assessing the performance of the Percentile bootstrap confidence intervals, so I apply PERCENTILE.EXC at 0.025 and 0.975 to the column of bootstrap means; this gives me the bootstrap confidence interval. A simple IF function then checks whether the population mean is captured by that interval. Everything described so far constitutes a single bootstrap iteration. I repeat this procedure at least 100 times (VBA does it for me, of course). The number of times the confidence interval captures the population mean, divided by 100 (or by whatever number of iterations was run), gives the overall coverage percentage, and thus the performance of the bootstrap process.

Now let's get to the essence of the post. As far as I understand, the greater the sample size, the better the coverage should be; so for the sample size of 50, even if I did not reach 95% coverage, I should at least see better coverage than for sizes 5 and 20. This is not happening. Indicatively, in the Gamma case the coverage is approximately 65% for size 5, 75% for size 20, and then 55% for size 50. The strange part is that this pattern repeats in every other distribution I examine, not only the Gamma. I should mention that the parametric bootstrap runs gave the expected coverage results, agreeing with the theory: the larger the sample size, the better the coverage. For the non-parametric results, though, the pattern contradicts the theory, or at least what I have read. I am confident that my VBA scripts and spreadsheet functions are correct. The fact that the parametric results come out right suggests that my models (spreadsheet functions and VBA scripts) are correct, since the two implementations are nearly identical; the only differences are the use of the EDF in the non-parametric case versus estimating the population parameters to build an estimated distribution in the parametric case. So my initial suspicion that one of the Excel functions I use (RAND(), VLOOKUP, PERCENTILE.EXC) is somehow distorting the results seems to be ruled out by the previous sentence. However, I have read that RAND() can start generating the same values after a great many draws, and, to be honest, the non-parametric case calls RAND() almost twice as often. Still, this does not convince me: the gap between the 55% coverage at sample size 50 and the 95% it should be is too large to be explained by recurring RAND() values.
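On the RAND() recycling worry, a rough count of the draws one full non-parametric coverage run consumes (assuming one RAND() call per original observation and one per bootstrap observation, which matches my setup) shows the scale involved:

```python
# Rough count of RAND() draws for one full coverage run at n = 50,
# assuming one draw per original observation and one per bootstrap
# observation (the VLOOKUP lookups).
n, B, iters = 50, 1000, 100
draws = iters * (n + B * n)
print(draws)
```

That works out to about five million draws, still vastly below the period of the generators Excel reportedly uses (Wichmann-Hill in older versions, the Mersenne Twister from Excel 2010 on), so recycling alone does not look like a plausible explanation for a 40-point coverage gap.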

Further to that, when I checked all my results myself (I store the information from every bootstrap iteration in a separate sheet), I realised that, while my conclusions are probably incorrect, they make some sense: the confidence intervals did not cover the population mean because their widths were significantly smaller, which is of course reasonable since the variance is smaller at larger sample sizes. Still, this does not convince me either, because a larger sample size should definitely approximate the population mean better. It feels like a vicious circle in my mind.

While searching forums for a solution, I came across this statement from a user about the non-parametric bootstrap, which I quote: "It makes fewer assumptions and provides less information than the parametric bootstrap. It is less precise when the parametric assumption is true but more accurate when it is false." I thought this might be the answer to my problem: the original samples are drawn from specific distributions, so the parametric assumption they follow is true by construction. It sounds reasonable, but is there any paper stating this, or proving it mathematically somehow?

But the struggle is not over yet. Even if we assume that statement is correct, I tested the same coverage procedure in Maple, which has a built-in bootstrap function, and it gave coverage of approximately 95% for sample size 50. That negates the previous assertion, doesn't it?

To sum up: is there any chance that the Excel functions I am using are somehow distorting the results? Is there any chance that what the other user said about the inefficiency of the non-parametric bootstrap when the parametric assumption holds is correct, and is there a paper proving it? But then, how can that be true when Maple produced the expected coverage percentage? I honestly want to believe that one of these explanations is correct, but I do not have enough evidence.

I have tried to be as accurate as I could, and I would be glad to provide more details if needed. Thank you in advance for your time. I appreciate any response; this issue has been troubling me for weeks and I have not found a sound answer.