KNN Enhancement with Chi-Square and Manhattan Distance Formula Applied to Fake Website Detection

CITE THIS

Ivan C. Capili, Richard Vincent B. Ferrer, Vivien A. Agustin, Jonathan C. Morano, and Mark Christopher R. Blanco, 2023. KNN Enhancement with Chi-Square and Manhattan Distance Formula Applied to Fake Website Detection. United International Journal for Research & Technology (UIJRT), 4(7), pp. 318-323.

Abstract

As data volumes grow, so do the problems that come with handling them. One such problem is the “Curse of Dimensionality,” the set of difficulties that arise when data has a large number of attributes. K-Nearest Neighbor (KNN) is a widely used machine learning algorithm, and this study aims to improve its performance on such data by applying Chi-Square as an attribute reduction step and the Manhattan Distance Formula for the nearest neighbor search. Cross-validation was added to the algorithm to guide the selection of the K value. The enhanced algorithm was tested on a dataset of phishing and legitimate websites and their attributes. The results showed that the enhancements improved the algorithm’s performance scores on high dimensional data.

Keywords: Chi-Square, Cross-validation, Curse of Dimensionality, High Dimensional Data, K-Nearest Neighbor, Manhattan Distance Formula.
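
The three enhancements named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the number of attributes kept, the candidate K values, and the fold scheme are all assumptions made for illustration, and only the Python standard library is used.

```python
from collections import Counter

def chi_square_score(column, labels):
    """Chi-square statistic of one categorical attribute against the class label."""
    n = len(labels)
    attr_counts, label_counts = Counter(column), Counter(labels)
    observed = Counter(zip(column, labels))
    score = 0.0
    for a, a_count in attr_counts.items():
        for c, c_count in label_counts.items():
            expected = a_count * c_count / n
            score += (observed[(a, c)] - expected) ** 2 / expected
    return score

def select_attributes(X, y, keep):
    """Indices of the `keep` attributes with the highest chi-square score."""
    scores = [chi_square_score([row[j] for row in X], y) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])

def manhattan(a, b):
    """Manhattan (L1) distance: sum of absolute attribute differences."""
    return sum(abs(p - q) for p, q in zip(a, b))

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training rows closest to `query`."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: manhattan(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cv_accuracy(X, y, k, folds=5):
    """Accuracy of KNN under a simple interleaved k-fold split (an assumed scheme)."""
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, len(X), folds))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [c for i, c in enumerate(y) if i not in test_idx]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return correct / len(X)

def choose_k(X, y, candidates=(1, 3, 5), folds=5):
    """Pick the candidate K with the best cross-validated accuracy."""
    return max(candidates, key=lambda k: cv_accuracy(X, y, k, folds))

# Toy data (hypothetical): 1 = phishing, 0 = legitimate; four binary attributes,
# of which the first two track the label and the last two are mostly noise.
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
X = [
    [1, 1, 0, 1], [1, 1, 1, 1], [1, 1, 0, 0], [1, 1, 1, 0], [1, 0, 0, 1],
    [0, 0, 1, 1], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 1, 1, 1],
]

kept = select_attributes(X, y, keep=2)           # attribute reduction
X_reduced = [[row[j] for j in kept] for row in X]
best_k = choose_k(X_reduced, y)                  # K chosen by cross-validation
prediction = knn_predict(X_reduced, y, [1, 1], best_k)
```

In this sketch the attributes are ranked once on the full toy set before cross-validation, which keeps the example short; a careful evaluation would usually repeat the selection inside each fold to avoid leaking test information.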
