A Comparison between Support Vector Machine based on Kmeans Clustering (SVM-Kmeans) and Support Vector Machine based on Recursive Feature Elimination (SVM-RFE) for Breast Cancer Diagnosis

Keywords: Support Vector Machine (SVM), Recursive Feature Elimination (RFE), K-means Clustering, Chi-square, feature selection, Breast Cancer Diagnosis


In this work, we reproduce two approaches to breast cancer diagnosis: an architecture that combines the SVM classifier with the K-means clustering algorithm (SVM-KMEANS), and a method that uses SVM with Recursive Feature Elimination (SVM-RFE).

The two methods were compared in terms of accuracy on two breast cancer datasets, the Wisconsin Diagnostic Breast Cancer dataset (WDBC) and the Wisconsin Prognostic Breast Cancer dataset (WPBC), both obtained from the UCI Machine Learning Repository.

Considering only the WDBC dataset, the SVM-KMEANS method reaches its maximum accuracy (98.25%) with only the 4 best features, while the SVM-RFE method reaches the same accuracy with 30 features. We conclude that SVM-KMEANS is the better method, since it attains the same accuracy with far fewer features.
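
As an illustration of the SVM-RFE side of the comparison, the following is a minimal sketch and not the implementation evaluated in this work. It assumes scikit-learn, whose `load_breast_cancer` function ships a copy of the WDBC dataset. The linear kernel, the feature standardization, the 80/20 stratified split, the random seed, and the target of 4 retained features are all illustrative assumptions, so the accuracy it prints should not be expected to match the figures reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# scikit-learn's bundled copy of WDBC: 569 samples, 30 numeric features.
X, y = load_breast_cancer(return_X_y=True)

# Assumed evaluation protocol: a single stratified 80/20 split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# RFE ranks features by the weights of the fitted model, so it needs an
# estimator exposing coef_; a linear-kernel SVM is assumed for that reason.
# Retaining 4 features is an illustrative choice.
model = make_pipeline(
    StandardScaler(),
    RFE(SVC(kernel="linear"), n_features_to_select=4, step=1),
    SVC(kernel="linear"),
)
model.fit(X_train, y_train)

# Boolean mask over the 30 input features marking the ones RFE kept.
selected = model.named_steps["rfe"].support_
accuracy = model.score(X_test, y_test)

print("selected feature indices:", selected.nonzero()[0])
print("test accuracy:", accuracy)
```

Varying `n_features_to_select` from 1 to 30 and recording the test accuracy at each setting would give an accuracy-versus-feature-count curve of the kind this comparison relies on.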