Enhancing Spam Comment Detection on Social Media With Emoji Feature and Post-Comment Pairs Approach Using Ensemble Methods of Machine Learning

Chrismanto, Antonius Rachmat and Sari, Anny Kartika and Suyanto, Yohanes (2023) Enhancing Spam Comment Detection on Social Media With Emoji Feature and Post-Comment Pairs Approach Using Ensemble Methods of Machine Learning. IEEE Access, 11. pp. 80246-80265. ISSN 2169-3536

Abstract

Whenever a well-known public figure posts on social media, many users are prompted to comment. Unfortunately, not all comments are relevant to the post; some are spam that disrupts the overall flow of information. This research employed two strategies to address issues in text spam detection on social media. The first was to use emojis, which many studies discard even though many social media users rely on them to convey their intentions. The second was to use stacked post-comment pairs, in contrast to the many spam detection systems that focus solely on comment-only data. The post-comment pairs are needed to determine whether a comment is relevant to the post context (not spam) or irrelevant to it (spam). This research used the SpamID-Pair dataset, derived from social media, for Indonesian spam comment detection. A comprehensive investigation showed that the emoji-text feature, the stacked post-comment pairs, and ensemble voting could boost detection performance in terms of accuracy and F1. Adding manual features also improved detection performance. In the experiments, the best stand-alone method for spam comment detection was the SVM (RBF kernel), while the soft voting ensemble achieved the best average performance.
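
The three ideas named in the abstract (keeping emojis as features, stacking the post with its comment, and soft voting) can be illustrated with a minimal sketch. It is not the authors' implementation: the emoji code-point ranges, the `[SEP]` separator, the function names, and the equal-weight averaging are all assumptions made for illustration, and the paper's actual features and classifiers are not reproduced here.

```python
import re

# Assumed, incomplete emoji ranges (main pictograph blocks only); each emoji
# is kept as its own token instead of being discarded with punctuation.
TOKEN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]|\\w+")


def tokenize(text):
    """Lowercase the text and return word tokens plus individual emoji tokens."""
    return TOKEN.findall(text.lower())


def stack_pair(post, comment, sep="[SEP]"):
    """Stack a post and its comment into one sequence (separator is hypothetical)."""
    return tokenize(post) + [sep] + tokenize(comment)


def soft_vote(prob_lists):
    """Average per-class probabilities from several classifiers.

    prob_lists holds one probability list per classifier, all in the same
    class order. Returns (index of the highest average, list of averages).
    """
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / len(prob_lists) for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg
```

For instance, `stack_pair("New album out", "Cek bio \U0001F525")` yields a single token list in which the comment's emoji survives as a token, and `soft_vote` applied to the class probabilities of three hypothetical classifiers returns the class with the highest mean probability.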

Item Type: Article
Uncontrolled Keywords: Spam detection, emoji feature, ensemble method, post-comment pair, social media
Subjects: Q Science > Q Science (General)
Depositing User: Rita Yulianti Yulianti
Date Deposited: 17 Apr 2024 04:15
Last Modified: 17 Apr 2024 04:15
URI: https://ir.lib.ugm.ac.id/id/eprint/466
