Design of a Prototype of a Big Data Analysis System of Online Radicalism based on Semantic and Deep Learning technologies

Rodrigo Barbado. (2018). Design of a Prototype of a Big Data Analysis System of Online Radicalism based on Semantic and Deep Learning technologies. Master's Thesis (TFM). Universidad Politécnica de Madrid, ETSIT, Madrid.

The rise of Islamic terrorism in the West in recent years has forced states to adopt preventive measures to avoid or mitigate new tragedies. This rise has been accompanied by a new form of promotion by terrorist organizations, which use the Internet to spread propaganda and recruit followers around the world. A clear example is the Barcelona attack of August 2017, after which it was discovered that terrorists who took part in the attack had previously shared hateful messages on their social networks. This work is part of the Trivalent project (Terrorism pReventIon Via rAdicaLisation countEr-NarraTive), which seeks a preventive solution to this problem by studying the narratives used by terrorist organizations in order to develop effective counter-narratives that prevent radicalization processes. This thesis is divided into two parts. The first describes a system for monitoring Internet media such as newspaper websites, social networks, and jihadist propaganda magazines published online. The system has three main stages: data ingestion; analysis of the data and its semantic enrichment following Linked Data guidelines; and storage of the extracted information, accompanied by a visualization layer. It also allows narratives to be annotated on the extracted texts, which supports further operations such as automatic narrative detection. The second part presents a system that classifies tweets according to whether or not they contain a radical component. The classifier was developed using Natural Language Processing and Deep Learning technologies, and it was implemented with the Keras library running on top of TensorFlow.