KNN Enhancement with Chi-Square and Manhattan Distance Formula Applied to Fake Website Detection
- Author(s): Ivan C. Capili, Richard Vincent B. Ferrer, Vivien A. Agustin, Jonathan C. Morano, and Mark Christopher R. Blanco
PAPER DETAILS
- Computer Science and Engineering
- Paper ID: UIJRTV4I70037
- Volume: 04
- Issue: 07
- Pages: 318-323
- May 2023
- ISSN: 2582-6832
Abstract
As data usage grows, measures are being taken to combat the problems that come with it. One such problem is the “Curse of Dimensionality,” which refers to the difficulties that arise with high-dimensional data. K-Nearest Neighbor (KNN) is a widely used machine learning algorithm, and this study aims to improve its performance by applying Chi-Square as an attribute reduction step and the Manhattan Distance Formula for the nearest neighbor search in the KNN process. Cross-validation was added to the algorithm to help select the value of K. The algorithm was tested on a dataset of phishing and legitimate websites and their attributes. The results showed that the added enhancements effectively improved the algorithm’s performance scores on high-dimensional data.
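The three pieces named in the abstract (Chi-Square attribute reduction, Manhattan-distance neighbor search, and cross-validated selection of K) can be sketched in plain Python as below. This is an illustrative sketch, not the authors' implementation: the function names, the toy data, the interleaved fold split, and the candidate K values are all assumptions made for the example.

```python
from collections import Counter

def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chi_square_score(column, labels):
    """Chi-square statistic of one categorical attribute against the class labels."""
    n = len(labels)
    observed = Counter(zip(column, labels))
    value_totals = Counter(column)
    label_totals = Counter(labels)
    score = 0.0
    for v, v_count in value_totals.items():
        for c, c_count in label_totals.items():
            expected = v_count * c_count / n
            score += (observed[(v, c)] - expected) ** 2 / expected
    return score

def select_features(X, y, keep):
    """Indices of the `keep` attributes with the highest chi-square scores."""
    scores = [chi_square_score([row[j] for row in X], y) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points nearest in Manhattan distance."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: manhattan(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_accuracy(X, y, k, folds):
    """Pooled accuracy of k-NN over `folds` interleaved folds (assumed split)."""
    correct = 0
    for f in range(folds):
        train = [(x, t) for i, (x, t) in enumerate(zip(X, y)) if i % folds != f]
        test = [(x, t) for i, (x, t) in enumerate(zip(X, y)) if i % folds == f]
        train_X, train_y = zip(*train)
        correct += sum(knn_predict(train_X, train_y, x, k) == t for x, t in test)
    return correct / len(X)

def choose_k(X, y, candidates, folds):
    """Candidate K with the best cross-validated accuracy (first one wins ties)."""
    return max(candidates, key=lambda k: cross_val_accuracy(X, y, k, folds))

# Toy data (hypothetical): 3 binary attributes, label 1 = phishing, 0 = legitimate.
X = [[1, 0, 1], [1, 1, 1], [1, 0, 1], [1, 1, 0],
     [0, 0, 0], [0, 1, 0], [0, 0, 0], [0, 1, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

kept = select_features(X, y, keep=2)                 # attribute reduction
X_reduced = [[row[j] for j in kept] for row in X]    # keep only selected columns
best_k = choose_k(X_reduced, y, candidates=[1, 3], folds=2)
prediction = knn_predict(X_reduced, y, [1, 1], best_k)
```

Here the middle attribute is statistically independent of the label, so its chi-square score is zero and it is dropped before any distances are computed. On real data a library implementation would be the more practical choice; the sketch only shows how the steps connect.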