Research Article
Ahmad LG*, Eshlaghy AT, Poorebra
Abstract
Objective: The number and size of medical databases are increasing rapidly, but most of these data are never analyzed to uncover the valuable knowledge hidden in them. Advanced data mining techniques can be used to discover hidden patterns and relationships, and models developed with these techniques can help medical practitioners make the right decisions. The present research studied the application of data mining techniques to develop predictive models for breast cancer recurrence in patients who were followed up for two years. Method: The patients were registered in the Iranian Center for Breast Cancer (ICBC) program from 1997 to 2008. The dataset contained 1189 records, 22 predictor variables, and one outcome variable. We implemented three machine learning techniques, namely Decision Tree (DT, C4.5), Support Vector Machine (SVM), and Artificial Neural Network (ANN), to develop the predictive models. The main goal of this paper is to compare the performance of these three well-known algorithms on our data in terms of sensitivity, specificity, and accuracy. Results and Conclusion: Our analysis shows that the accuracies of DT, ANN, and SVM are 0.936, 0.947, and 0.957, respectively. The SVM classification model predicts breast cancer recurrence with the lowest error rate and the highest accuracy, while the DT model has the lowest accuracy of the three. The results were obtained using 10-fold cross-validation to measure the unbiased prediction accuracy of each model.
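To make the evaluation protocol concrete, the following is a minimal Python sketch of how sensitivity, specificity, and accuracy can be computed from predicted and true labels, and how 1189 records can be partitioned into 10 folds. It is an illustration under stated assumptions, not the code used in the study: the function names (`confusion_counts`, `metrics`, `k_fold_indices`), the coding of recurrence as label 1, and the toy labels are all hypothetical, and the sketch uses contiguous folds where a real analysis would normally shuffle or stratify the records first.

```python
from typing import Dict, List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn), treating label 1 as 'recurrence' (an assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute the three measures named in the abstract from confusion counts."""
    total = tp + tn + fp + fn
    return {
        # Share of true recurrences that the model detects.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of non-recurrences that the model correctly rules out.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # Share of all records classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
    }


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Returns a list of (train_indices, test_indices) pairs. Each record
    appears in exactly one test fold.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    # Toy labels for illustration only; these are not ICBC data.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
    print(metrics(*confusion_counts(y_true, y_pred)))

    # Fold sizes for a dataset of the size reported in the abstract.
    print([len(test) for _, test in k_fold_indices(1189, 10)])
```

In a 10-fold setup of this kind, each model would be trained on the nine training folds and scored on the held-out fold, and the per-fold metrics would then be averaged to give one figure per model.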