What is a good approach for sampling the data?

CK25

Hello

Aim: Sample values of Y using the available data.

Let's say I have a function Y that depends on two variables, a and b.
1. The data for a are simulated.
2. The data for b come from experiments.
3. Y is always larger than 1.

[Images: Y_a_b.jpg, a_b.jpg]
In these images, the data come from a journal paper, and the Y values are computed from an analytical model given in the same paper. As you can see, the data are sparse at high values of a and b.
[Image: Y_b.jpg]
Initially, I binned the data into segments and fitted a Gaussian distribution to the binned data, which looks like this. The colour plot shows the Gaussian fit.
[Image: a_b_var.jpg]
After binning, I computed the variance for each bin.
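
For concreteness, the binning and per-bin variance step could look roughly like the sketch below. It assumes NumPy and bins Y along a single variable only; the function name `binned_variance`, the bin edges, and the toy numbers are all hypothetical and are not the journal data.

```python
import numpy as np

def binned_variance(x, y, edges):
    """Per-bin count, mean and sample variance of y, with bins defined on x."""
    idx = np.digitize(x, edges) - 1           # bin index of each point (0-based)
    n_bins = len(edges) - 1
    counts = np.zeros(n_bins, dtype=int)
    means = np.full(n_bins, np.nan)
    variances = np.full(n_bins, np.nan)
    for k in range(n_bins):
        vals = y[idx == k]
        counts[k] = vals.size
        if vals.size >= 1:
            means[k] = vals.mean()
        if vals.size >= 2:                    # sample variance needs >= 2 points
            variances[k] = vals.var(ddof=1)
    return counts, means, variances

# Toy values for illustration only (not the real data).
x = np.array([0.1, 0.2, 0.3, 1.1, 1.2, 2.5])
y = np.array([1.0, 2.0, 3.0, 2.0, 4.0, 5.0])
edges = np.array([0.0, 1.0, 2.0, 3.0])

counts, means, variances = binned_variance(x, y, edges)
print(counts, means, variances)
```

Returning the counts alongside the variances makes it easy to see which bins hold only a handful of points, which is where the variance estimate is least reliable. A bin with a single point gets a variance of NaN rather than zero.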

Using the analytical model as the mean and the binned variance for a and b, I can then sample Y from my own simulated a and experimental b. As you might expect, high values of a and b correspond to a large variance, so the sampled Y can be negative. I therefore tried sampling Y from a truncated Gaussian with the condition Y > 1. However, the result is still not as good as I expected.
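
The truncated sampling described above could be sketched as follows. This is a minimal illustration using simple rejection sampling with NumPy, not necessarily how the original sampling was implemented. The function name `sample_truncated_normal` and the example means and standard deviations are hypothetical. It relies on the model mean being above 1, so that at least half of the plain Gaussian draws are accepted.

```python
import numpy as np

def sample_truncated_normal(mean, sd, lower=1.0, rng=None, max_iter=1000):
    """Draw one sample per (mean, sd) pair from a Gaussian restricted to values > lower."""
    rng = np.random.default_rng() if rng is None else rng
    mean = np.atleast_1d(np.asarray(mean, dtype=float))
    sd = np.broadcast_to(np.asarray(sd, dtype=float), mean.shape)
    out = rng.normal(mean, sd)
    for _ in range(max_iter):
        bad = out <= lower                    # draws that violate Y > lower
        if not bad.any():
            return out
        out[bad] = rng.normal(mean[bad], sd[bad])   # redraw only the rejected ones
    raise RuntimeError("rejection sampling did not converge")

# Hypothetical model means and per-bin standard deviations.
rng = np.random.default_rng(0)
model_mean = np.array([1.5, 2.0, 3.0])
bin_sd = np.array([0.2, 1.0, 4.0])
samples = sample_truncated_normal(model_mean, bin_sd, rng=rng)

# Many draws for the high-variance case, to inspect the effect of truncation.
many = sample_truncated_normal(np.full(10000, 3.0), 4.0, rng=rng)
print(samples, many.mean())
```

One thing this makes visible: cutting off the lower tail raises the mean of the sampled values above the model mean, and the shift grows with the variance. That may be part of why the truncated result looks off in the sparse, high-variance region.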

Are there any suggestions for how I could sample Y given how sparse the available data are?

Thank you.