LABOR: Learning and Building on Reviews
(HR Analytics)
Hybrid NLP Models for Sentiment Prediction
Abstract
Employee management is one of the most important functions within an organization. Recent studies have shown that employee perceptions of culture, current management, opportunities for growth, and other intangible factors are correlated with a company's financial well-being. It is therefore important that managers develop the skills needed to connect effectively with employees. This finding has also given human resource management a new dimension, shifting it from a reactive role (e.g., resolving employee complaints) to a more proactive one (providing insights that help create policies that prevent or minimize employee dissatisfaction). In this paper, we explore models that predict employee sentiment from the text of employee reviews gathered from Glassdoor.com. The models we have developed accurately predict employee sentiment and give insights into what drives these ratings, offering organizations a new way to better understand their employees through internal quarterly reviews or employee comments on in-house networks.
Methodology
This study explores hybrid NLP models for prediction: models that use both word embeddings and other numerical features to make a classification. The main reason this is impractical without neural networks (specifically embedding models) is that a conventional bag-of-words approach creates sparse, high-dimensional vector representations of text. With such a sparse representation, a handful of additional features appended to the matrix would have little to no impact.
To achieve this, the vector of numerical features is combined with the word vectors produced by the word embeddings. The numerical features are the topic probabilities obtained from topic modeling with Latent Dirichlet Allocation (LDA). A neural network is then trained to predict whether a job review is negative, neutral, or positive.
The implementation of our models relies on two Python libraries. Topic modeling with LDA was done with the gensim library, which offers extensive functionality for more recent NLP methods. The word embedding and the stacked GRU were implemented with TensorFlow Keras.
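One way such a hybrid network might be wired in TensorFlow Keras is sketched below: a token sequence passes through an embedding layer and two stacked GRU layers, and the resulting vector is concatenated with the LDA topic probabilities before a three-class softmax. All sizes (VOCAB_SIZE, MAX_LEN, EMBED_DIM, NUM_TOPICS, the GRU and dense widths) and the point at which the two inputs are joined are assumptions for illustration; the original architecture may differ.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# All sizes below are assumed for illustration only.
VOCAB_SIZE = 1000   # number of distinct token ids
MAX_LEN = 20        # padded review length in tokens
EMBED_DIM = 16      # word-embedding dimension
NUM_TOPICS = 3      # length of the LDA topic-probability vector
NUM_CLASSES = 3     # negative, neutral, positive


def build_hybrid_model():
    """Build a two-input classifier: token ids plus LDA topic probabilities."""
    tokens = keras.Input(shape=(MAX_LEN,), dtype="int32", name="tokens")
    topics = keras.Input(shape=(NUM_TOPICS,), name="lda_topics")

    # Text branch: dense word embeddings followed by two stacked GRU layers.
    x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(tokens)
    x = layers.GRU(16, return_sequences=True)(x)
    x = layers.GRU(16)(x)

    # Hybrid step (assumed placement): join the text vector with the
    # numerical topic features before the classification layers.
    merged = layers.Concatenate()([x, topics])
    hidden = layers.Dense(16, activation="relu")(merged)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(hidden)

    model = keras.Model(inputs=[tokens, topics], outputs=outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_hybrid_model()

# Random stand-in batch, only to show the expected input shapes.
token_batch = np.random.randint(0, VOCAB_SIZE, size=(4, MAX_LEN))
topic_batch = np.random.dirichlet(np.ones(NUM_TOPICS), size=4).astype("float32")
probabilities = model.predict([token_batch, topic_batch], verbose=0)
print(probabilities.shape)
```

Because the GRU output is a dense, low-dimensional vector, the few topic features make up a meaningful share of the merged input, which is the motivation given above for using embeddings rather than a bag-of-words representation.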
AIM MSDS Deep Learning Project Presentation (December 12, 2019)


Learning Team 4
Crisanto E. Chua, Armand Louis A. De Leon, Jeddahlyn V. Gacera, Ria Ysabelle L. Flora