Abstract

Selecting predictive gene pools from thousands of gene expression values is one of the main tasks in microarray data analysis. For this purpose, multivariate techniques have proven much better than univariate techniques, in terms of predictive value and biological relevance, as they are able to capture relevant relationships and interactions between genes. An additional goal of gene-expression profiling is to find models that, besides being predictive, are also understandable, so that they can provide some insight into the underlying mechanisms. Models based on fuzzy logic might, potentially, exhibit both characteristics. However, accuracy and interpretability are usually conflicting objectives, and one must accept a trade-off between them. Indeed, the literature shows that approaches based on fuzzy logic may be divided into two groups: accurate but complex models (i.e., with many rules using many variables per rule) on the one hand, and models with only a few short rules (thus interpretable) but limited accuracy on the other. In this paper we present the application of Fuzzy CoCo, our cooperative coevolutionary fuzzy modelling approach, to deal efficiently with the accuracy-interpretability trade-off. Fuzzy CoCo is able to find very compact fuzzy models, in terms of number of rules and number of variables per rule, that still exhibit high predictive power. To validate the performance of our approach, we tested Fuzzy CoCo on four known data sets, each addressing one form of cancer: leukemia, colon, lung, and prostate. We compared our results (in terms of maximum number of rules, number of variables per rule, and accuracy) with those of other similar works (i.e., based on fuzzy logic). Our models reached similar or better accuracy while being considerably smaller.
