Abstract

The Recursive-Rule eXtraction (Re-RX) algorithm family includes the Re-RX algorithm, the Re-RX algorithm with both discrete and continuous attributes (Continuous Re-RX [1]), the Re-RX algorithm with J48graft [2], Re-RX with J48graft combined with sampling selection techniques (Sampling Re-RX with J48graft [4]), and the Re-RX algorithm with a trained neural network (Sampling Re-RX [3]). In this study, we compare the performance of the Re-RX algorithm family with that of various previous algorithms. A persistent issue in rule extraction is Pareto optimality, that is, a trade-off in which neither objective can be improved without degrading the other. In rule extraction, the trade-off is between the classification accuracy and the interpretability of the extracted rules. Our goal is to obtain a wider viable region for the Pareto-optimal curve, enabling improvements in both the accuracy and the interpretability of extracted rules. We present Pareto-optimal curves relating accuracy to the number of rules, obtained for the German and Australian datasets from 10 runs of 10-fold cross-validation of the Re-RX algorithm family, alongside those obtained using other algorithms. The Re-RX algorithm family has proven effective for extracting concise and interpretable rules from medical [1, 2, 4] and financial [3] datasets.
