Afaan Oromo Fake News Detection Using Natural Language Processing and Passive-Aggressive
The main objective of this study is to develop a fake news detection system for Afaan Oromo. The system applies preprocessing steps (tokenization, normalization, stop-word removal, and abbreviation resolution) followed by feature extraction with term frequency-inverse document frequency (TF-IDF), term frequency, and hashing, in order to capture the importance of a word within a news item and across the corpus. N-grams, a powerful natural language processing technique, were also used to capture semantic and syntactic sequences. All possible combinations of these feature extraction and natural language processing techniques were evaluated with a passive-aggressive classification algorithm. The passive-aggressive classifier achieved 97.2% accuracy, a classification error of 2.8%, outperforming ensemble algorithms such as gradient boosting and random forest as well as a linear classifier, multinomial Naïve Bayes. Finally, the model that combines TF-IDF feature extraction over unigrams with the passive-aggressive classifier was deployed as a web-based system using Django, a Python web framework.
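The selected configuration (unigram TF-IDF features fed to a passive-aggressive classifier) can be illustrated with a minimal, pure-Python sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the PA-I update variant with cap `C`, the function names, and the English placeholder training texts are all assumptions made for illustration, and the study's normalization, stop-word removal, and abbreviation resolution are not reproduced.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for the paper's preprocessing.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency over unigram tokens (assumed formula).
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}

def tfidf(text, idf):
    # Sparse, L2-normalised TF-IDF vector; tokens unseen in training are dropped.
    tf = Counter(tok for tok in tokenize(text) if tok in idf)
    vec = {tok: freq * idf[tok] for tok, freq in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {tok: v / norm for tok, v in vec.items()} if norm else {}

def dot(w, x):
    return sum(w.get(tok, 0.0) * v for tok, v in x.items())

def train_pa(vectors, labels, C=1.0, epochs=5):
    # Passive-aggressive (PA-I) training; labels are +1 (fake) or -1 (real).
    w = {}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            loss = max(0.0, 1.0 - y * dot(w, x))  # hinge loss
            sq_norm = sum(v * v for v in x.values())
            if loss == 0.0 or sq_norm == 0.0:
                continue  # passive: margin already satisfied, no update
            tau = min(C, loss / sq_norm)  # aggressive: smallest correcting step, capped by C
            for tok, v in x.items():
                w[tok] = w.get(tok, 0.0) + tau * y * v
    return w

def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1

# Hypothetical placeholder data (English stand-ins, not the study's corpus).
train_texts = [
    "shocking miracle cure secret",
    "secret shocking rumour exposed",
    "council approves budget report",
    "ministry publishes budget report",
]
train_labels = [1, 1, -1, -1]

idf = fit_idf(train_texts)
w = train_pa([tfidf(t, idf) for t in train_texts], train_labels)
print(predict(w, tfidf("shocking secret cure", idf)))
```

The classifier is "passive" when an example is already classified with a margin of at least 1 and "aggressive" otherwise, moving the weights just far enough to correct the example. A production system would more likely rely on an established machine learning library than on hand-written updates like these.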