class Patgen
Class for generating synthetic data sets - HTML documentation is under construction.
Patgen ()
create_ran_spheric (char* namefile, unsigned n_class, unsigned n_pattern, unsigned dim, double rangemin, double rangemax, double sigma)
create_ran_variable (char* namefile, unsigned n_class, unsigned patmin, unsigned patmax, unsigned dim, double rangemin, double rangemax, double sigmin, double sigmax)
create_reading_info (char* infof, char* dataf)
create_reading_info_num_pattern (char* infof, char* dataf, unsigned patmin, unsigned patmax)
seed
make_rand_vector (unsigned dim, double rangemin, double rangemax, vect& centrum)
make_pattern (vect& centrum, double sigma, unsigned dim, vect& pattern)
make_pattern (vect& centrum, vect& sigma, unsigned dim, vect& pattern)
save_pattern (ofstream& fdata, vect& patt, unsigned k, unsigned dim)
do_header_info (ofstream& finfo, unsigned v_num, char* filename, unsigned n_class, double rmin, double rmax, unsigned dim)
do_info_record (ofstream& finfo, vect& centrum, unsigned k, double sigma, unsigned dim, unsigned n_pattern)
do_info_record_sigma (ofstream& finfo, vect& centrum, unsigned k, vect& sigma, unsigned dim, unsigned n_pattern)
do_pattern_total (ofstream& finfo, unsigned pat_tot)
read_header_info (ifstream& finfo, unsigned& v_num, unsigned& n_class, double& rmin, double& rmax, unsigned& dim)
read_class_record (ifstream& finfo, unsigned& target, vect& centrum, vect& sigma, unsigned& num_pattern)
Class for generating synthetic data sets. It generates synthetic data for training and test sets. The data are stored in files with the suffix .data appended; files with the suffix .info are also generated.
Data are generated in clusters that can be associated with classes; the number of clusters, the number of classes and the dimension of the examples can all be set by the user. Each cluster is drawn from a multivariate Gaussian distribution, and its center and covariance matrix can be determined by the user in different ways.
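As a rough illustration of how one spherical Gaussian cluster might be produced, the sketch below draws a center and then scatters a pattern around it with the same standard deviation in every dimension. It is not the Patgen implementation: the helper names `random_center` and `spherical_pattern` are hypothetical, `std::vector<double>` stands in for the library's `vect` type, and drawing the center uniformly from the data range is an assumption suggested by the `rangemin` and `rangemax` parameters.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not a Patgen member): draws a cluster center with
// each coordinate uniform in [rangemin, rangemax). The uniform choice is
// an assumption, not something the documentation states.
std::vector<double> random_center(std::size_t dim, double rangemin,
                                  double rangemax, std::mt19937& rng) {
    std::uniform_real_distribution<double> coord(rangemin, rangemax);
    std::vector<double> center(dim);
    for (double& c : center) {
        c = coord(rng);
    }
    return center;
}

// Hypothetical helper (not a Patgen member): generates one pattern by adding
// independent zero-mean Gaussian noise with standard deviation sigma to each
// coordinate of the center. sigma must be strictly positive.
std::vector<double> spherical_pattern(const std::vector<double>& center,
                                      double sigma, std::mt19937& rng) {
    std::normal_distribution<double> noise(0.0, sigma);
    std::vector<double> pattern(center.size());
    for (std::size_t i = 0; i < center.size(); ++i) {
        pattern[i] = center[i] + noise(rng);
    }
    return pattern;
}
```

Calling `spherical_pattern` repeatedly with the same center yields the examples of one cluster. A cluster with a different standard deviation per dimension would use one `std::normal_distribution` per coordinate instead.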
Format of the file .data:
feature1, feature2, ... , featureN, class
Features are double, class is unsigned. Each row corresponds to an N-dimensional example with its associated class.
Format of the file .info:
An .info file is composed of a header, containing general information about the data, and a cluster record for each cluster of data.
Header format: --> one for each file
Version: (unsigned) (currently fixed to 1)
File_name: (string)
Class_number: (unsigned) --> number of classes
Data_range: min max (double double)
Pattern_dimension: dim_pattern (unsigned)
Cluster record format: --> one for each cluster
Class: class_number (unsigned)
Centrum_coord: x1 x2 ... xdim_pattern (double)
Standard_deviation: s1 s2 ... sdim_pattern (double)
Pattern_number: number_of_pattern_in_the_class (unsigned)
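The cluster record layout above could be read back with a sketch like the following. It assumes each field sits on its own line and that the values follow the first colon, with or without a space after it. `ClusterRecord`, `value_part`, `read_doubles` and `read_cluster_record` are hypothetical names introduced for illustration; the class's own `read_class_record` may work differently.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for one cluster record; not a Patgen type.
struct ClusterRecord {
    unsigned target = 0;          // Class
    std::vector<double> center;   // Centrum_coord
    std::vector<double> sigma;    // Standard_deviation
    unsigned num_pattern = 0;     // Pattern_number
};

// Reads the next line and returns the text after its first ':'.
// Returns an empty string if there is no line or no ':' in it.
std::string value_part(std::istream& in) {
    std::string line;
    if (!std::getline(in, line)) {
        return "";
    }
    std::size_t colon = line.find(':');
    return colon == std::string::npos ? "" : line.substr(colon + 1);
}

// Parses exactly dim whitespace-separated doubles from text.
// Returns false if fewer than dim values can be read.
bool read_doubles(const std::string& text, std::size_t dim,
                  std::vector<double>& out) {
    std::istringstream fields(text);
    out.assign(dim, 0.0);
    for (double& v : out) {
        if (!(fields >> v)) {
            return false;
        }
    }
    return true;
}

// Reads the four lines of one cluster record, assuming the field order
// Class, Centrum_coord, Standard_deviation, Pattern_number.
bool read_cluster_record(std::istream& in, std::size_t dim,
                         ClusterRecord& rec) {
    std::istringstream cls(value_part(in));
    if (!(cls >> rec.target)) return false;
    if (!read_doubles(value_part(in), dim, rec.center)) return false;
    if (!read_doubles(value_part(in), dim, rec.sigma)) return false;
    std::istringstream num(value_part(in));
    return static_cast<bool>(num >> rec.num_pattern);
}
```

The `dim` argument would come from the `Pattern_dimension` field of the header, which has to be read before any cluster record.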