Model Cross-Validation Issue: Comparing models w/ and w/o error

Hi folks: I'm trying to apply cross-validation of models (e.g. the paper "On Cross-Validation of Bayesian Models" by Alqallaf and Gustafson) in the context of count data and have run into an interesting problem.

My understanding of how such cross-validation should work is that one draws replicated y values (y.rep) from the posterior predictive distribution and then compares these replications with the actually observed y's (y.obs). I'm interested in fit, namely the sum of squared deviations (SSE) between y.rep and y.obs.

If I understand the generation of y.rep correctly, this includes, if it is present, any random error. That is, let's say I have the following model, with X & B as matrices and normal error:

y ~ poisson(lambda)
lambda=exp(XB + Normal(0,sd))

Then, my understanding is I would draw all parameters (B, sd) from the Bayesian model (on training data) and, for each draw, would calculate XB (X from validation data) and add to it a randomly drawn normal error of std deviation sd.
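
For concreteness, the procedure just described might be sketched as follows in Python. This is only an illustration under assumptions: `beta_draws` and `sd_draws` stand in for posterior draws of B and sd from whatever sampler was fit to the training data (here they are faked with random numbers), and `draw_y_rep` is a hypothetical helper name, not something taken from the paper.

```python
import numpy as np

def draw_y_rep(X_val, beta_draws, sd_draws, rng, with_error=True):
    """Return one replicated y vector per posterior draw, shape (S, n).

    X_val:      (n, p) validation design matrix
    beta_draws: (S, p) posterior draws of B   (assumed input)
    sd_draws:   (S,)   posterior draws of sd  (assumed input)
    """
    eta = beta_draws @ X_val.T                 # (S, n) linear predictor XB
    if with_error:
        # fresh Normal(0, sd) error for every draw and every observation
        eta = eta + rng.normal(0.0, sd_draws[:, None], size=eta.shape)
    return rng.poisson(np.exp(eta))            # Poisson noise on top

# Fake "posterior draws", for illustration only
rng = np.random.default_rng(0)
X_val = np.column_stack([np.ones(50), rng.normal(size=50)])
beta_draws = rng.normal([0.5, 0.3], 0.05, size=(200, 2))
sd_draws = np.abs(rng.normal(0.8, 0.05, size=200))

y_rep_1 = draw_y_rep(X_val, beta_draws, sd_draws, rng, with_error=True)
y_rep_2 = draw_y_rep(X_val, beta_draws, sd_draws, rng, with_error=False)
```

With the error term switched on, the replications are visibly more dispersed than with it switched off, even though both use the same draws of B.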

Now, the problem. Let's say I compare the above model to the same model, but without the normal error:

y ~ poisson(lambda)
lambda=exp(XB)

Because I'm adding noise to y.rep in Model 1, isn't it at a disadvantage relative to Model 2? Model 2's y.reps directly reflect the best mean values, while Model 1 adds noise to its best means. I've actually tried this, and the disadvantage appears to be huge.

So, should I instead be comparing the *expected value* of exp(XB + Normal(0,sd)) in Model 1 with exp(XB) in Model 2, even though the former is not a draw from the posterior predictive distribution? And when I do model checking by comparing the distribution of y.rep with y.obs (say the .95 quantile or % of data at zero), should I still use the posterior predictive distribution for each?
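
As a side note on that expected value: for a fixed XB and sd, exp(XB + Normal(0,sd)) is lognormal, so its mean is exp(XB + sd^2/2) rather than exp(XB). A minimal Python check of this, with arbitrary illustrative values for `eta` (standing for one element of XB) and `sd`, and a hypothetical helper name `expected_lambda`:

```python
import numpy as np

def expected_lambda(eta, sd):
    """Mean of exp(eta + e) with e ~ Normal(0, sd): the lognormal mean."""
    return np.exp(eta + 0.5 * sd**2)

rng = np.random.default_rng(1)
eta, sd = 1.0, 0.8                      # arbitrary values, not from real data

# Brute-force average over many simulated errors
mc = np.exp(eta + rng.normal(0.0, sd, size=1_000_000)).mean()
closed = expected_lambda(eta, sd)

print(mc, closed, np.exp(eta))          # the first two agree; exp(eta) is smaller
```

This is the expectation conditional on a single draw of (B, sd); averaging it over the posterior draws would give the posterior predictive mean.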

Well, maybe I can begin to answer my own question; hopefully someone will find this helpful. I set up some Monte Carlo data so I'd know what the ground truth was. The data were generated from the Poisson-with-error model, with parameters similar to what I was seeing in my real data.

I then calculated a cross-validated mean squared error (msqe) for two estimation models: Poisson with error and Poisson without error. For the model with error, I calculated the msqe of exp(XB + error), where error is a random normal draw whose standard deviation comes from the posterior draws of sd. This is what one would imagine should be used based on Alqallaf and Gustafson. For the same model, I also calculated the msqe using the *expected value* of exp(XB + Normal(0,sd)).

What I find is that the first msqe for the Poisson model with error (calculated from exp(XB + error)) is far larger, significantly so, than the msqe of the Poisson model without error. Thus, this comparison suggests that the correct model is Poisson without error, even though the correct model actually is Poisson with error.

The msqe from the expected value of exp(XB + Normal(0,sd)) is slightly smaller than, but not significantly different from, the msqe of the Poisson model without error. Cross-validation on squared error doesn't seem to be a good way to distinguish between these models.
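
One possible reading of the gap, offered as a sketch rather than a verified account of my results: when y.rep is drawn independently of y.obs, the expected squared deviation splits into the squared error of the predictive mean plus the variance of y.rep itself, so a model with a wider predictive distribution pays for that width even when the width is correct. The Python simulation below illustrates this with the true parameters standing in for posterior draws (no model is actually fit), and with arbitrary values for the coefficients and sd.

```python
import numpy as np

rng = np.random.default_rng(2)
n, sd = 400_000, 0.5                      # arbitrary illustrative values
x = rng.normal(size=n)
eta = 0.5 + 0.3 * x                       # assumed "true" XB

def draw_counts(eta, sd, rng):
    """Counts from y ~ poisson(exp(eta + Normal(0, sd)))."""
    return rng.poisson(np.exp(eta + rng.normal(0.0, sd, size=eta.shape)))

y_obs = draw_counts(eta, sd, rng)         # plays the role of held-out data
y_rep = draw_counts(eta, sd, rng)         # independent replication, same model

mu = np.exp(eta + 0.5 * sd**2)                    # mean of y given x
var_rep = mu + mu**2 * (np.exp(sd**2) - 1.0)      # variance of y given x

mse_rep = np.mean((y_rep - y_obs) ** 2)   # squared error using noisy y.rep
mse_mean = np.mean((mu - y_obs) ** 2)     # squared error using the mean only

# mse_rep should be close to mse_mean + average variance of y.rep
print(mse_rep, mse_mean + var_rep.mean(), mse_mean)
```

Here the data-generating model is used for both y.obs and y.rep, yet the msqe based on y.rep is roughly double the msqe based on the mean, which is in line with the pattern described above.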