Subfolder DATASET_I:
HPO.graph10.organ.UA.rda: object of class graphNEL (R package graph) that represents the hierarchy of terms of the HPO subontology Phenotypic Abnormality. This DAG has 2154 nodes (HPO terms) and 2641 edges (between-term relationships).
HPO.ann10.organ.UA.rda: annotation table on which the transitive closure of the annotations was performed. Rows correspond to Entrez Gene IDs and columns to HPO terms (subontology Phenotypic Abnormality). If T represents the annotation table, i a gene and j an HPO term, T[i,j]=1 means that gene i is annotated with term j, and T[i,j]=0 means that gene i is not annotated with term j. All HPO terms with fewer than 10 annotations have been pruned. Size: 19430 X 2154.
Scores.eav.score.p1.a2.M.hpo.ann.organ.all.10.rda: flat scores matrix representing the likelihood that a given gene i belongs to a given class j: the higher the value, the higher the likelihood. Rows correspond to Entrez Gene IDs and columns to HPO terms (subontology Phenotypic Abnormality). This flat scores matrix was obtained by running the RANKS package. Size: 19430 X 2154.
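The transitive closure and pruning mentioned above can be illustrated with a minimal sketch. It is written in Python on toy data only: the term names, gene names and the threshold of 2 are hypothetical, and the code is not the procedure used to build the .rda files. It assumes that "transitive closure" means that a gene annotated with a term is also annotated with all ancestors of that term in the DAG.

```python
# Toy DAG given as parent -> children (hypothetical term names, not real HPO IDs).
children = {
    "root": ["A", "B"],
    "A": ["C"],
    "B": ["C"],
    "C": [],
}

# Direct annotations per gene (hypothetical gene names).
direct = {
    "g1": {"C"},
    "g2": {"B"},
    "g3": set(),
}

# Invert the DAG to obtain child -> parents.
parents = {}
for parent, kids in children.items():
    for kid in kids:
        parents.setdefault(kid, []).append(parent)


def ancestors_of(term):
    """Return every ancestor of `term`, excluding the term itself."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, []))
    return seen


terms = sorted(children)  # column order of the table
genes = sorted(direct)    # row order of the table

# Transitive closure: propagate each direct annotation to all ancestors.
closed = {}
for g in genes:
    full = set(direct[g])
    for t in direct[g]:
        full |= ancestors_of(t)
    closed[g] = full

# Binary table T: T[i][j] == 1 iff gene i is annotated with term j.
T = [[1 if t in closed[g] else 0 for t in terms] for g in genes]


def prune_terms(table, term_names, min_annotations):
    """Keep only the columns (terms) with at least `min_annotations` ones."""
    counts = [sum(row[j] for row in table) for j in range(len(term_names))]
    keep = [j for j, c in enumerate(counts) if c >= min_annotations]
    return [[row[j] for j in keep] for row in table], [term_names[j] for j in keep]


# The datasets use a threshold of 10; the toy example uses 2.
T_pruned, kept_terms = prune_terms(T, terms, min_annotations=2)
```

In this toy example, gene g1 is directly annotated only with C but ends up annotated with A, B and root as well, and the terms A and C are dropped because each has a single annotated gene.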
Subfolder DATASET_II:
HPO.graph10.string.v91.rda: object of class graphNEL (R package graph) that represents the hierarchy of terms of the whole HPO ontology. This DAG has 2445 nodes (HPO terms) and 3059 edges (between-term relationships).
HPO.ann10.string.v91.rda: annotation table on which the transitive closure of the annotations was performed. Rows correspond to Entrez Gene IDs and columns to HPO terms. If T represents the annotation table, i a gene and j an HPO term, T[i,j]=1 means that gene i is annotated with term j, and T[i,j]=0 means that gene i is not annotated with term j. All HPO terms with fewer than 10 annotations have been pruned. Size: 3412 X 2445.
Scores.holdout.par.type.0.C1.W.hpo.rda: flat scores matrix representing the probability that a given gene i belongs to a given class j: the higher the value, the higher the probability. Rows correspond to Entrez Gene IDs and columns to HPO terms. This flat scores matrix was obtained by running a multicore version of LiblineaR using the doParallel and foreach R packages. Size: 3412 X 2444. NOTE: the fake root node "All" (HP:0000001) has been removed from the flat scores matrix.
test.set.index.rda: vector of integers giving the indices of the rows of the scores matrix to be used as the test set. Useful only in holdout experiments. Length: 608.
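The column mismatch between the annotation table (2445 columns) and the flat scores matrix (2444 columns), and the use of the test set indices, can be sketched as follows. This is a hedged Python illustration on toy data, not code shipped with the datasets. It assumes that the annotation table still contains the root column HP:0000001, that the stored indices are 1-based (as is usual for R vectors), and that the rows not listed in the index vector form the training set. The names TERM_A, TERM_B and both helper functions are hypothetical.

```python
ROOT = "HP:0000001"  # fake root node "All", absent from the flat scores matrix


def drop_column(matrix, col_names, name):
    """Return a copy of `matrix` and `col_names` without the column `name`."""
    j = col_names.index(name)
    new_cols = col_names[:j] + col_names[j + 1:]
    new_matrix = [row[:j] + row[j + 1:] for row in matrix]
    return new_matrix, new_cols


def split_rows(matrix, test_index_one_based):
    """Split rows into (train, test) using 1-based row indices for the test set."""
    test_pos = {i - 1 for i in test_index_one_based}  # convert to 0-based
    test = [row for k, row in enumerate(matrix) if k in test_pos]
    train = [row for k, row in enumerate(matrix) if k not in test_pos]
    return train, test


# Toy annotation table: 4 genes x 3 terms, root included (assumption).
ann_cols = [ROOT, "TERM_A", "TERM_B"]
ann = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [1, 0, 0],
]

# Toy flat scores matrix: 4 genes x 2 terms, root already removed.
score_cols = ["TERM_A", "TERM_B"]
scores = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.7, 0.6],
    [0.3, 0.4],
]

# Align the annotation table with the scores matrix by dropping the root.
ann_aligned, ann_aligned_cols = drop_column(ann, ann_cols, ROOT)

# Toy test set index (1-based): rows 2 and 4 go to the test set.
test_index = [2, 4]
train_scores, test_scores = split_rows(scores, test_index)
train_ann, test_ann = split_rows(ann_aligned, test_index)
```

After the root column is dropped, the annotation table and the scores matrix have the same column names in the same order, so that predictions and labels can be compared column by column on the test rows.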
