Defining cases in Statistica

#1
Does anyone know how to define training, testing and validation datasets in Statistica Neural Networks and TimeSeries Forecasting? I want to split the data in temporal order: first the training set (400 cases), then the testing set (200 cases), and finally the validation set. I don't want to split the data randomly (the option Statistica offers). Any help is much appreciated.
 

JenB

New Member
#2
On the Sampling tab of the SANN Data Selection dialog, select the "Subset variable" option as your sampling method. You will then need to specify a spreadsheet variable and the codes that identify which cases belong to each sample (Training, Testing, or Validation).

This subset variable can easily be added with a spreadsheet formula that implements the needed selection logic (e.g., if case <= 400, set to training, else if <= 600, testing, else …). In the formula, v0 refers to the case number:

=iif(v0<=400, "TRAIN", iif(v0 <= 600, "TEST", "VALIDATE"))
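If you want to check the cut-offs before running the network, the same logic can be sketched outside Statistica. The Python below is only an illustration, not Statistica code: the function name subset_label, its parameters, and the total of 700 cases are assumptions made for the example (the question does not say how many cases there are in total).

```python
def subset_label(case_number, n_train=400, n_test=200):
    """Return the subset code for a 1-based case number.

    Mirrors the nested iif formula: the first n_train cases are
    training, the next n_test cases are testing, the rest validation.
    """
    if case_number <= n_train:
        return "TRAIN"
    if case_number <= n_train + n_test:
        return "TEST"
    return "VALIDATE"


# Hypothetical data set of 700 cases, kept in their original temporal order.
labels = [subset_label(i) for i in range(1, 701)]

# Count how many cases fall into each subset.
counts = {code: labels.count(code) for code in ("TRAIN", "TEST", "VALIDATE")}
print(counts)
```

Because the labels depend only on the case number, the spreadsheet has to be sorted by time before the subset variable is computed.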