> My sample size is very high (5,000 observations) compared to the number of independent variables (around 10-15). So I thought that it's better to take a random sample with a smaller number of observations.

Why do you think that a smaller sample is better?

> What could make it a bit more complicated:
>
> - for 3 independent variables 30-60% of the data is missing

It looks like you had better leave them out of the analysis.

> - the variables (dependent and independent) are skewed.

That is not necessarily a problem.

> - I expect some multicollinearity

So you leave some correlated independent variables out, and/or you provide a large sample size in order to reduce the inflated standard errors associated with multicollinearity.
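
To illustrate the point about sample size, here is a minimal simulation sketch in Python with NumPy. Everything in it is an assumption made for illustration and not taken from your data: two predictors with correlation 0.9, true coefficients of 2, standard normal noise, and the hypothetical helper name `ols_standard_errors`. With a correlation of 0.9 the variance inflation factor is 1 / (1 - 0.9^2), about 5.3, so the slope standard errors are inflated by a factor of roughly 2.3 compared with uncorrelated predictors. Keeping all 5,000 observations instead of drawing a subsample of 500 shrinks them again by about sqrt(10).

```python
import numpy as np


def ols_standard_errors(n, rho, seed=0):
    """Simulate y = 1 + 2*x1 + 2*x2 + noise with corr(x1, x2) = rho
    and return the OLS coefficient standard errors (intercept, x1, x2)."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    x = rng.multivariate_normal(np.zeros(2), cov, size=n)
    y = 1.0 + 2.0 * x[:, 0] + 2.0 * x[:, 1] + rng.normal(size=n)

    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Classical OLS covariance estimate: sigma^2 * (X'X)^-1.
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)
    return np.sqrt(np.diag(cov_beta))


# Illustrative settings only: a subsample of 500 versus the full 5,000.
se_small = ols_standard_errors(n=500, rho=0.9)
se_large = ols_standard_errors(n=5000, rho=0.9)
se_uncorrelated = ols_standard_errors(n=5000, rho=0.0)

print("SE of x1, n=500,  rho=0.9:", se_small[1])
print("SE of x1, n=5000, rho=0.9:", se_large[1])
print("SE of x1, n=5000, rho=0.0:", se_uncorrelated[1])
```

The comparison of the first two lines shows what a subsample costs you under multicollinearity; the third line shows what is left of the inflation once the predictors are uncorrelated.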

With kind regards

K.