Opinion Analysis and Machine Learning Modeling for Depression Detection
- Author(s): Adebisi A. Baale, Olawumi R. Olasunkanmi, Felicia E. Adelodun and Adeniyi A. Adigun
PAPER DETAILS
- Computer Science and Engineering
- Paper ID: UIJRTV2I40005
- Volume: 02
- Issue: 04
- Pages: 38-43
- February 2021
- ISSN: 2582-6832
Abstract
Many people express opinions on social media sites when they suffer from mental disorders such as depression, anxiety, and tension caused by pressure, the external environment, and other factors. Such posts, shared via Twitter, Facebook, and Instagram, can be used to identify a person’s state of mind. The underlying ideation, which is rarely noticed until after a tragic consequence, is often expressed earlier, overtly or covertly, in social media posts. This research therefore implemented four (4) classifiers, Logistic Regression (LR), Naive Bayes (NB), Random Forest (RF), and Decision Tree (DT), on two text-feature extraction techniques: Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BOW). We split the Sentiment140 dataset, downloaded from Kaggle, into 75% training and 25% testing data to predict depression in the tweet dataset. The TF-IDF models produced the highest accuracy with DT (99%) and RF (99%), while BOW matched that performance with LR (99%). However, to mitigate the erroneous classification of depressive individuals as neutral, the Receiver Operating Characteristic / Area Under Curve (ROC_AUC) scores of the classifiers were also obtained; RF and DT produced the highest ROC_AUC score (99%). Overall, the tree-based models performed better on the test data used in this research to classify and predict depression in the tweet dataset.
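To make the two feature extraction techniques and the ROC_AUC metric named in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation. The three example tweets, the whitespace tokenizer, the helper names (`tokenize`, `bag_of_words`, `tf_idf`, `roc_auc`), and the particular TF-IDF variant (length-normalized term frequency times an unsmoothed logarithmic IDF) are all assumptions made for illustration; the abstract does not state which formula or library was used.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stop-word removal.
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count vectors): one raw term-count row per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf = count / doc length
    and idf = ln(N / document frequency). Other variants exist."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    vectors = []
    for row in counts:
        total = sum(row)
        vectors.append([(c / total) * w if total else 0.0
                        for c, w in zip(row, idf)])
    return vocab, vectors


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy tweets (invented for illustration, not from Sentiment140).
tweets = [
    "i feel so hopeless today",
    "great day with friends",
    "i feel great",
]

bow_vocab, bow_vectors = bag_of_words(tweets)
tfidf_vocab, tfidf_vectors = tf_idf(tweets)

# Hypothetical classifier scores for four tweets (1 = depressive, 0 = neutral).
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

In this sketch BOW keeps raw counts, so "feel" and "hopeless" weigh the same in the first tweet, whereas TF-IDF down-weights "feel" because it also appears in another tweet. The pairwise ROC_AUC definition shows why the metric complements accuracy: it measures how well depressive tweets are ranked above neutral ones regardless of the decision threshold.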