The appearance of new Deep Learning applications for Sentiment Analysis has motivated many researchers, mainly because of their automatic feature extraction and representation capabilities, as well as their better performance compared to earlier feature-based techniques. These traditional surface approaches rely on complex, manually extracted features, and this extraction process is a fundamental issue in feature-driven methods. However, these long-established approaches can yield strong baselines on their own, and their predictive capabilities can be used in conjunction with the emerging Deep Learning methods. In this paper we seek to improve the performance of these new Deep Learning techniques by integrating them with more traditional surface approaches based on manually extracted features. The contributions of this paper are the following. First, we develop a Deep Learning based sentiment classifier using the Word2Vec model and a linear machine learning algorithm. This classifier serves as a baseline against which we compare subsequent results. Second, we propose two ensemble techniques that aggregate our baseline classifier with other surface classifiers widely used in the field of Sentiment Analysis. Third, we propose two models for combining surface and deep features in order to merge information from several sources. Fourth, we introduce a taxonomy for classifying the different models we propose, as well as those found in the literature. Fifth, we conduct several reproducible experiments to compare the performance of these models with the Deep Learning baseline. For this, we employ four public datasets extracted from the microblogging domain. Finally, the experiments confirm that the proposed models outperform our original baseline in terms of F1-Score, with improvements ranging from 0.21 to 3.62 %.