Paper Details
Paper ID: UIJRTV6I90012
Volume: 06
Issue: 09
Pages: 117-126
Date: July 2025
ISSN: 2582-6832

Cite this
Johanes Eka Priyatma and Mikael Raditya Agung Sasmita, 2025. Comparative Analysis of Random Forest and Support Vector Machine for Classifying Pima Indians Diabetes Dataset. United International Journal for Research & Technology (UIJRT). 6(9), p117-126.
Abstract
This study explored how well two machine learning algorithms—Random Forest (RF) and Support Vector Machine (SVM)—performed in classifying the Pima Indians Diabetes Dataset, which is used to predict the likelihood of individuals developing diabetes. To ensure a fair and reliable comparison, both models were evaluated using 10-fold cross-validation. Their effectiveness was measured through key classification metrics: accuracy, precision, recall, and F1-score. The results highlighted Random Forest as the more stable and reliable model, achieving an average accuracy of 76.3% and consistently strong results across all folds. In contrast, while the SVM with a polynomial kernel delivered slightly better precision (74.57%), it fell short in terms of overall accuracy, recall, and F1-score when compared to Random Forest. Ultimately, Random Forest proved to be better at identifying true positive cases and handling variations in the data, making it a stronger candidate for classifying health-related datasets like this one. That said, with further tuning of its parameters, SVM still holds promise as a competitive alternative.

Keywords: Random Forest, Support Vector Machine, Diabetes Classification, Pima Dataset, Machine Learning.
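
The evaluation protocol named in the abstract (10-fold cross-validation scored by accuracy, precision, recall, and F1-score) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the helper names `k_fold_indices`, `binary_metrics`, and `mean_over_folds` are hypothetical, the folds are contiguous with no shuffling or stratification, and the 768-row size used in the example is the commonly cited size of the Pima dataset, treated here as an assumption. Model training is left out; in practice the predictions would come from whichever Random Forest or SVM implementation is being compared.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous (train, test) index pairs.

    Assumption: no shuffling or stratification; a real study may use both.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for labels where 1 = diabetic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def mean_over_folds(per_fold: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric across folds, as in a cross-validated report."""
    return {name: sum(m[name] for m in per_fold) / len(per_fold) for name in per_fold[0]}


if __name__ == "__main__":
    folds = k_fold_indices(768, k=10)  # 768 rows is an assumed dataset size
    print([len(test) for _, test in folds])
    # Toy labels for illustration only; these are not results from the paper.
    print(binary_metrics([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 1, 1, 0]))
```

Precision penalizes false positives and recall penalizes false negatives, so one model can lead on precision while trailing on recall and F1-score, which is the pattern the abstract reports for the polynomial-kernel SVM relative to Random Forest.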

