I have time series data of about 150 samples and 8 variables. It is used to generate an interaction network whose exact structure is not yet known.

I can propose a model in two ways:
First approach: use 100 samples as the training (derivation) set and 50 as the test (validation) set. This results in fitting 8 nonlinear regression models (one per variable), which in turn yields an interaction network involving all the variables.
Second approach: use the whole dataset for modeling. This also gives me an interaction network; however, some of the interactions differ from those obtained with the first approach.
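To make the comparison concrete, here is a minimal sketch of approach one, under several assumptions not stated above: a lag-1 autoregressive setup, simulated data standing in for the real measurements, random-forest regressions standing in for the unspecified nonlinear models, and an arbitrary importance threshold for drawing edges.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Simulated stand-in for the real data: 150 time points, 8 variables,
# with one known nonlinear interaction planted: variable 1 -> variable 0.
X = rng.normal(size=(150, 8))
X[1:, 0] += 1.5 * np.tanh(X[:-1, 1])

# Lag-1 design: predict each variable at time t from all variables at t-1.
lagged, target = X[:-1], X[1:]

# Approach one: first 100 points for training, the rest for validation.
train, test = slice(0, 100), slice(100, None)

importances = np.zeros((8, 8))
scores = []
for j in range(8):  # one nonlinear regression per variable
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(lagged[train], target[train, j])
    scores.append(model.score(lagged[test], target[test, j]))  # held-out R^2
    importances[j] = model.feature_importances_

# Crude interaction network: keep edges whose importance exceeds a
# threshold (0.2 here is purely illustrative).
edges = [(i, j) for j in range(8) for i in range(8) if importances[j, i] > 0.2]
print(edges)
```

Approach two would be the same loop fitted on all 149 lagged pairs, with no held-out score; the question is then whether the edge set it produces is more trustworthy than the validated one above.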

Which one is the better way?
I think approach one seems promising because it has good validation results. However, in doing so we lose part of the data that could otherwise be used during model development.

I think approach one could be most useful for simulated data, where the true structure is known. Since the exact interactions are not known here, wouldn't it be better to use the second approach, so that most of the information in the data is used?

It would be great if anyone could direct me to related research articles or case studies.