Publication - An Ensemble Method for Radicalization and Hate Speech Detection Online Empowered by Sentic Computing

Oscar Araque & Carlos A. Iglesias (2022). An Ensemble Method for Radicalization and Hate Speech Detection Online Empowered by Sentic Computing. Cognitive Computation, 14, 48-61.

Abstract:

The dramatic growth of the Web has motivated researchers to extract knowledge from enormous repositories and to exploit the knowledge in myriad applications. In this study, we focus on natural language processing (NLP) and, more concretely, the emerging field of affective computing to explore the automation of understanding human emotions from texts. This paper continues previous efforts to utilize and adapt affective techniques into different areas to gain new insights. This paper proposes two novel feature extraction methods that use the previous sentic computing resources AffectiveSpace and SenticNet. These methods are efficient approaches for extracting affect-aware representations from text. In addition, this paper presents a machine learning framework using an ensemble of different features to improve the overall classification performance. Following the description of this approach, we also study the effects of known feature extraction methods such as TF-IDF and SIMilarity-based sentiment projectiON (SIMON). We perform a thorough evaluation of the proposed features across five different datasets that cover radicalization and hate speech detection tasks. To compare the different approaches fairly, we conducted a statistical test that ranks the studied methods. The obtained results indicate that combining affect-aware features with the studied textual representations effectively improves performance. We also propose a criterion considering both classification performance and computational complexity to select among the different methods.

BibTeX:

@article{an-gsi-article-2021,
author = "Araque, Oscar and Iglesias, Carlos A.",
abstract = "The dramatic growth of the Web has motivated researchers to extract knowledge from enormous repositories and to exploit the knowledge in myriad applications. In this study, we focus on natural language processing (NLP) and, more concretely, the emerging field of affective computing to explore the automation of understanding human emotions from texts. This paper continues previous efforts to utilize and adapt affective techniques into different areas to gain new insights. This paper proposes two novel feature extraction methods that use the previous sentic computing resources AffectiveSpace and SenticNet. These methods are efficient approaches for extracting affect-aware representations from text. In addition, this paper presents a machine learning framework using an ensemble of different features to improve the overall classification performance. Following the description of this approach, we also study the effects of known feature extraction methods such as TF-IDF and SIMilarity-based sentiment projectiON (SIMON). We perform a thorough evaluation of the proposed features across five different datasets that cover radicalization and hate speech detection tasks. To compare the different approaches fairly, we conducted a statistical test that ranks the studied methods. The obtained results indicate that combining affect-aware features with the studied textual representations effectively improves performance. We also propose a criterion considering both classification performance and computational complexity to select among the different methods.",
comments = "JCR 2019 Q1 4.307,
SJR 2020 Q1 0.83,
Scopus 2020 Q1 8.6",
doi = "10.1007/s12559-021-09845-6",
issn = "1866-9956",
journal = "Cognitive Computation",
month = "February",
pages = "48--61",
title = "{A}n {E}nsemble {M}ethod for {R}adicalization and {H}ate {S}peech {D}etection {O}nline {E}mpowered by {S}entic {C}omputing",
url = "https://link.springer.com/article/10.1007/s12559-021-09845-6",
volume = "14",
year = "2022",
}