Design and Development of a machine learning system for Personality Classification based on stylometric features

Sergio López López. (2020). Design and Development of a machine learning system for Personality Classification based on stylometric features. Final Career Project (TFG). Universidad Politécnica de Madrid, ETSI Telecomunicación.

Over the last few years, the use of digital devices with Internet access, such as tablets and smartphones, has increased considerably. This has led to greater use of the Internet and of social networks, especially among the younger population. Social networks are platforms that allow people to exchange information and that promote the dissemination of personal data and content, which companies can use for sociological studies. In this way, users' personalities can be characterized through their activity.

Knowledge of a user's personality is difficult to obtain, but its usefulness is well established. Personality determines how a person is likely to behave, what preferences they have, and what attitudes and aptitudes they show. In short, it determines an individual's characteristic thoughts and behavior. This is of great importance to companies, for example when recruiting new employees, as personality has been shown to influence subsequent job performance.

This final degree project arises from the combination of social networks and the personality of their users. Its objective is to predict people's personality from data collected from one of the most popular social networks, Twitter. This will be achieved using Machine Learning and Big Data techniques. To that end, four different classifiers will be created which, combined with each other, will provide users' personality profiles automatically and easily from their Twitter profiles. In the process, different features will be extracted using supervised machine learning tools and natural language processing (NLP) techniques. In the last phase, a web application will be developed that reveals a user's personality by means of the trained models.

In this way, the human resources departments of companies will be able to improve their decisions about the potential of applicants without needing to carry out surveys or spend more time than necessary.
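
The pipeline described above (features extracted from a user's tweets, then four classifiers whose outputs are combined into one profile) can be illustrated with a minimal sketch. This is not the project's implementation: the feature set, the four dimension names, and the threshold rules standing in for trained models are all hypothetical, chosen only to show the shape of the approach.

```python
import re
from typing import Callable, Dict, List

Features = Dict[str, float]
Classifier = Callable[[Features], str]


def stylometric_features(tweets: List[str]) -> Features:
    """Compute a few simple writing-style statistics over a user's tweets.

    The feature set is illustrative; the project's actual features are
    not listed in the abstract.
    """
    text = " ".join(tweets)
    words = re.findall(r"[A-Za-z']+", text)
    n_chars = max(len(text), 1)    # avoid division by zero on empty input
    n_words = max(len(words), 1)
    n_tweets = max(len(tweets), 1)
    return {
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "type_token_ratio": len({w.lower() for w in words}) / n_words,
        "punctuation_ratio": sum(c in ".,;:!?" for c in text) / n_chars,
        "uppercase_ratio": sum(c.isupper() for c in text) / n_chars,
        "hashtags_per_tweet": text.count("#") / n_tweets,
        "mentions_per_tweet": text.count("@") / n_tweets,
    }


def personality_profile(features: Features,
                        classifiers: Dict[str, Classifier]) -> Dict[str, str]:
    """Run one classifier per dimension and combine the labels into a profile."""
    return {dimension: clf(features) for dimension, clf in classifiers.items()}


# Hypothetical stand-ins for four trained models: each maps the feature
# dictionary to one of two labels for its (placeholder) dimension.
toy_classifiers: Dict[str, Classifier] = {
    "dimension_1": lambda f: "high" if f["mentions_per_tweet"] > 0.5 else "low",
    "dimension_2": lambda f: "high" if f["type_token_ratio"] > 0.7 else "low",
    "dimension_3": lambda f: "high" if f["punctuation_ratio"] > 0.05 else "low",
    "dimension_4": lambda f: "high" if f["avg_word_length"] > 4.0 else "low",
}

example_features = stylometric_features(["Hi there!", "Hi #ml"])
example_profile = personality_profile(example_features, toy_classifiers)
```

In a real system each entry of `toy_classifiers` would be replaced by a model trained on labeled Twitter data, and the web application would call a function like `personality_profile` on the features computed for the requested user.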