dofold
Application for the automatic construction of multiple folds of data.
The application dofold partitions the data into k folds, where k is a user-defined integer, according to the cross-validation technique. The data are subdivided into k folds, and from them k pairs of data sets are generated: each test set contains one fold of data and the corresponding training set contains the remaining k-1 folds. The procedure is repeated k times, each time using a different fold as the test set. The partition is performed through a random extraction of the data.
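The partitioning described above can be sketched in a few lines of Python. This is only an illustration of the k-fold technique, not dofold's actual implementation: the function name k_fold_split, the seed parameter, and the way leftover samples are spread across folds are all assumptions made for the example.

```python
import random


def k_fold_split(samples, k=10, seed=None):
    """Randomly partition samples into k folds and return k (train, test) pairs.

    Hypothetical helper for illustration; dofold itself may assign samples
    to folds differently.
    """
    if not 1 < k <= len(samples):
        raise ValueError("k must satisfy 1 < k <= number of samples")

    # Random extraction: shuffle a copy so the caller's data is untouched.
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)

    # Fold i takes every k-th sample starting at offset i,
    # so fold sizes differ by at most one.
    folds = [shuffled[i::k] for i in range(k)]

    pairs = []
    for i in range(k):
        test = folds[i]
        # The training set is the union of the remaining k-1 folds.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        pairs.append((train, test))
    return pairs
```

Each sample appears in exactly one test set, and every training set is disjoint from its paired test set, which is the property the cross-validation technique relies on.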
Usage: dofold datafile [options]
datafile (string) : data file that must be subdivided into folds.
Options:
-nf unsigned : number of folds (default = 10)
-na unsigned : number of attributes (default = 2)
-name string : name of the fold files (without suffix). If omitted, datafile is used instead.
Examples:
- Generating 10 folds from the dataset data:
dofold data
It generates 20 data sets: 10 training sets data.f1.train,...,data.f10.train and 10 test sets data.f1.test,...,data.f10.test, assuming that data is a dataset with two attributes.
- Generating 5 folds from the dataset data:
dofold data -nf 5
It generates 10 data sets: 5 training sets data.f1.train,...,data.f5.train and 5 test sets data.f1.test,...,data.f5.test.
- Generating 5 folds from the dataset newdata with samples having 12 attributes:
dofold newdata -nf 5 -na 12
It generates 10 data sets: 5 training sets newdata.f1.train,...,newdata.f5.train and 5 test sets newdata.f1.test,...,newdata.f5.test.
- Generating 10 folds from the dataset data and storing them in files named mydata:
dofold data -name mydata
It generates 20 data sets: 10 training sets mydata.f1.train,...,mydata.f10.train and 10 test sets mydata.f1.test,...,mydata.f10.test.
Output:
The application outputs 2k files: k training sets namefile.f1.train, ..., namefile.fk.train and k test sets namefile.f1.test, ..., namefile.fk.test, where k is the number of folds and namefile is the name given with -name (or datafile if -name is omitted).
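A script that consumes the folds needs the list of file names to read. The following sketch builds that list from the naming pattern stated above. The helper fold_file_names is hypothetical and is not part of dofold; it only reproduces the documented pattern.

```python
def fold_file_names(name, k=10):
    """Return (train_files, test_files) following the documented pattern
    name.f<i>.train / name.f<i>.test for i = 1..k.

    Hypothetical helper for illustration; it does not run dofold or
    check that the files exist.
    """
    train_files = [f"{name}.f{i}.train" for i in range(1, k + 1)]
    test_files = [f"{name}.f{i}.test" for i in range(1, k + 1)]
    return train_files, test_files
```

For instance, fold_file_names("mydata") yields the 20 names of the last example, from mydata.f1.train to mydata.f10.test.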