Well, simulated data is never the real data itself (hence its name, "simulated"). But a good simulation study needs the computer to generate data that looks reasonably close to data you'd find in your area of expertise, or data from idealized cases where you have extra control over its characteristics (non-normality, missing data, etc.), so that you can see what effect those characteristics have on the statistical methods.
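
To make the "extra control" point concrete, here is a minimal sketch in Python (standard library only). Everything in it is an illustrative assumption rather than anything from your article: the sample size, the two distributions, and the helper names `simulate_sample` and `median_bias` are hypothetical. It switches skewness on and off and checks how the sample median behaves as an estimate of the true mean.

```python
import random
import statistics

TRUE_MEAN = 1.0  # both distributions below are set up to have this mean


def simulate_sample(n, skewed, rng):
    """Draw n values with mean 1: exponential if skewed, normal otherwise."""
    if skewed:
        return [rng.expovariate(1.0) for _ in range(n)]
    return [rng.gauss(TRUE_MEAN, 1.0) for _ in range(n)]


def median_bias(n, skewed, n_reps, seed):
    """Average of (sample median - true mean) over n_reps simulated samples."""
    rng = random.Random(seed)
    diffs = [
        statistics.median(simulate_sample(n, skewed, rng)) - TRUE_MEAN
        for _ in range(n_reps)
    ]
    return statistics.fmean(diffs)


# Same method, same sample size; only the shape of the data changes.
bias_normal = median_bias(n=50, skewed=False, n_reps=2000, seed=1)
bias_skewed = median_bias(n=50, skewed=True, n_reps=2000, seed=1)
print(f"bias with normal data: {bias_normal:.3f}")
print(f"bias with skewed data: {bias_skewed:.3f}")
```

With symmetric data the median tracks the mean closely; with skewed data it sits clearly below it. Because you chose the skewness yourself, you know the difference comes from that characteristic and nothing else.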

In the case of your article, there's this line:

*...data similar to the observed data were simulated with the expected values of counts given by Eq. (4) and with the Poisson error inflated by the factor 2.75 using the zero inflated count model where an observed count is either the value zero with probability p or a random value from a Poisson distribution with probability 1 − p.*

This implies that they are using parameters estimated from the real data as the population parameters in their simulation study.
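
A rough Python sketch of the zero-inflated mechanism described in that quote follows. The values `lam_hat` and `p_hat` are hypothetical stand-ins for estimates taken from real data, and the helper names are mine, not the authors'. The sketch also does not reproduce the 2.75 error inflation, since the quoted line does not say how it was applied.

```python
import math
import random


def poisson_draw(lam, rng):
    """One Poisson(lam) draw using Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def zip_draw(lam, p_zero, rng):
    """Zero with probability p_zero, otherwise a Poisson(lam) count."""
    if rng.random() < p_zero:
        return 0
    return poisson_draw(lam, rng)


rng = random.Random(2024)
lam_hat = 3.0   # hypothetical: expected count estimated from the observed data
p_hat = 0.25    # hypothetical: estimated probability of a structural zero

counts = [zip_draw(lam_hat, p_hat, rng) for _ in range(5000)]
mean_count = sum(counts) / len(counts)
share_zero = counts.count(0) / len(counts)
print(f"mean count: {mean_count:.2f}, share of zeros: {share_zero:.3f}")
```

Under these assumed values the simulated mean should be near (1 − p)·λ = 2.25, and the share of zeros should exceed what a plain Poisson with the same λ would give.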

Each dataset is *created* by the computer with the population and distributional characteristics described above. Once the distribution from which the data will be sampled is defined in the computer, you can ask it for any number of random draws, and those draws become the datasets to analyze. So each dataset is "new" in the sense that it is sampled afresh from the distribution defined by the authors.
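
In code, the idea looks roughly like the Python sketch below. The Poisson mean, the number of datasets, and the sample size are all made-up values, and the "analysis" is just the sample mean, standing in for whatever method the study actually evaluates.

```python
import math
import random
import statistics


def poisson_draw(lam, rng):
    """One Poisson(lam) draw using Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_dataset(n_obs, lam, rng):
    """One new dataset: n_obs independent draws from the defined distribution."""
    return [poisson_draw(lam, rng) for _ in range(n_obs)]


rng = random.Random(54321)
true_lam = 2.0      # hypothetical population parameter
n_datasets = 500    # hypothetical number of simulated datasets
n_obs = 100         # hypothetical size of each dataset

estimates = []
for _ in range(n_datasets):
    data = simulate_dataset(n_obs, true_lam, rng)   # a fresh sample every time
    estimates.append(statistics.fmean(data))        # analyze it (here: just the mean)

avg_estimate = statistics.fmean(estimates)
spread = statistics.stdev(estimates)
print(f"average estimate: {avg_estimate:.3f}, spread across datasets: {spread:.3f}")
```

Every pass through the loop produces a different dataset from the same population, and comparing the estimates against the known `true_lam` is what lets you judge the method.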

I'm not sure I follow what you're asking here. Do you mean which method was used to *analyze* each dataset? Or which method was used to *generate* each dataset?

Your response is very insightful, thanks. But I want to ask: how should I then simulate the independent variables?

Suppose I have a model and I use its coefficients to simulate the outcome. How should I then simulate the independent variables, given that I don't know how they are distributed?

I was writing the same thing in SAS to simulate it. This is what I am talking about (you don't have to follow the code, just check the comments):

    %let N = 287;
    %let nCont = 4;

    data SimReg1(keep=Y lambda x:);
       call streaminit(54321);
       array x[&nCont];
       /* Coefficients taken from some fitted model; beta[0] is the intercept */
       array beta[0:&nCont] _temporary_ (1.6362 -0.6134 -0.4914 -0.1328 0.0324);
       do i = 1 to &N;
          /* How should I simulate this part? The draws below are just something I was */
          /* trying, but I certainly don't know how these variables are distributed.   */
          x[1] = rand("Bernoulli", 0.5);
          x[2] = rand("Bernoulli", 0.5);
          x[3] = ceil(3 * rand("Uniform"));
          x[4] = rand("Bernoulli", 0.5);
          /* Linear predictor on the log scale */
          eta = beta[0];
          do j = 1 to &nCont;
             eta = eta + beta[j] * x[j];
          end;
          lambda = exp(eta);
          Y = rand("Poisson", lambda);   /* This simulates the dependent variable in counts */
          output;
       end;
    run;