Section: New Results

MOCA-I: discovering rules and guiding decision maker in the context of partial classification in large and imbalanced datasets

Participants: Julie Jacques, Laetitia Jourdan, Clarisse Dhaenens

In this work we model and implement, as a multi-objective optimization problem, a Pittsburgh classification rule mining algorithm adapted to large and imbalanced datasets such as those encountered in hospital data. Hospital data raises problems such as class imbalance, volume, and inconsistency, and optimization approaches have to take these specificities into account. We present MOCA-I, an adaptation of MOCA dedicated to this kind of problem. We implement it as a dominance-based local search, in contrast to existing multi-objective approaches, which are based on genetic algorithms. We pair this algorithm with an original post-processing method based on the ROC curve, which helps the decision maker choose the most interesting set of rules. Our approach is currently being compared to state-of-the-art classification rule mining algorithms, both classic approaches such as C4.5 and optimization approaches; preliminary results are as good or better, with fewer parameters. Moreover, our approach has been compared to C4.5 and C4.5-CS on a real dataset (hospital data) with a larger set of attributes, where it gave the best results. The complete evaluation is still in progress.
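To illustrate what "dominance-based" means here, the following minimal Python sketch shows a Pareto dominance test and the update of a non-dominated archive, the basic building blocks of a dominance-based local search. It is not the MOCA-I implementation: the function names `dominates` and `update_archive`, the choice of two maximized objectives, and the example scores are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: every objective is to be maximized.
Objectives = Tuple[float, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def update_archive(archive: List[Objectives], candidate: Objectives) -> List[Objectives]:
    """Add candidate to a non-dominated archive and drop the entries it dominates."""
    # Reject the candidate if an archived entry dominates or duplicates it.
    if any(dominates(kept, candidate) or kept == candidate for kept in archive):
        return archive
    return [kept for kept in archive if not dominates(candidate, kept)] + [candidate]


# Hypothetical (sensitivity, confidence) scores for four candidate rule sets.
scores = [(0.60, 0.80), (0.70, 0.75), (0.55, 0.70), (0.70, 0.80)]

archive: List[Objectives] = []
for score in scores:
    archive = update_archive(archive, score)

print(archive)
```

In a full local search, the candidates would be neighbors of archived rule sets (for example, a rule set with one term added or removed) rather than a fixed list, and the search would stop once no neighbor can enter the archive.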