Speech understanding and dialog systems have become a global trend and, with advances in artificial intelligence and deep learning techniques, have drawn attention from both the academic and business communities. Domain prediction, intent detection, and entity extraction (slot filling) are the core components of intelligent conversational systems. To predict domains, intents, and entities, various traditional machine learning algorithms, such as Bayesian algorithms, Support Vector Machines, and Artificial Neural Networks, have been used alongside recent Deep Neural Network techniques. Most dialog systems process user input in a step-wise or pipelined manner: the domain is detected first, and then the intent and entities are inferred from the semantic frames of the detected domain. The pipeline approach, however, has many disadvantages, such as downstream error propagation, the need to build multiple predictive models, the requirement for large user-annotated datasets for each domain, and a lack of information sharing among domain, intent, and entity predictions.
To address these issues, this study proposes a jointly predictive single deep neural network (DNN) framework based on Long Short-Term Memory (LSTM) that requires only small user-annotated datasets, and also investigates the value added by incorporating unlabeled data from user chat logs into multi-domain spoken language understanding (SLU), or natural language understanding (NLU), systems.
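A joint framework of this kind can be illustrated with a minimal sketch: a shared LSTM encoder feeds three task heads, so domain, intent, and entity (slot) predictions share one representation instead of three pipelined models. The sketch below uses PyTorch with hypothetical vocabulary and label sizes; the paper's actual architecture and hyperparameters are not specified here.

```python
import torch
import torch.nn as nn

class JointNLUModel(nn.Module):
    """Sketch of a joint domain/intent/slot model over a shared BiLSTM.

    All sizes are illustrative assumptions, not the study's settings.
    """
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128,
                 n_domains=3, n_intents=10, n_slots=20):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        # Sentence-level heads share the final hidden state.
        self.domain_head = nn.Linear(2 * hidden_dim, n_domains)
        self.intent_head = nn.Linear(2 * hidden_dim, n_intents)
        # Token-level head for slot filling uses the per-step outputs.
        self.slot_head = nn.Linear(2 * hidden_dim, n_slots)

    def forward(self, token_ids):
        emb = self.embedding(token_ids)             # (B, T, E)
        outputs, (h_n, _) = self.lstm(emb)          # outputs: (B, T, 2H)
        # Concatenate the last forward and backward hidden states.
        sent = torch.cat([h_n[0], h_n[1]], dim=-1)  # (B, 2H)
        return (self.domain_head(sent),
                self.intent_head(sent),
                self.slot_head(outputs))

model = JointNLUModel()
x = torch.randint(0, 1000, (2, 7))  # batch of 2 utterances, 7 tokens each
domain_logits, intent_logits, slot_logits = model(x)
```

Training would sum a cross-entropy loss per head, so gradients from all three tasks update the shared encoder, which is what enables information sharing among domain, intent, and entity.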
Further, it extends the previous literature by benchmarking LSTM against current best practices for domain, intent, and entity detection, and it extends the literature on semi-supervised learning and its application and implementation in the fields of NLU and SLU. Using multi-domain datasets and open conversational system datasets in English, the study validates a framework that explains and demonstrates how unlabeled data from various sources can be incorporated into a jointly trained multi-domain NLU model to improve performance with only a small number of user-annotated examples.
A systematic experimental analysis with open annotated and unannotated utterances for the proposed joint frameworks, namely the semi-supervised multi-domain joint model (SEMI-MDJM), the multi-domain joint model with adversarial learning (MDJM-ADV), and the semi-supervised multi-domain joint model with adversarial learning (SEMI-MDJM-ADV), shows improvements over the predictive performance of the multi-domain joint base model (MDJM). In particular, the SEMI-MDJM-ADV and MDJM-ADV models significantly improve the prediction of domain, intent, and entity compared with MDJM.
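Adversarial learning in multi-domain models is commonly implemented with a gradient reversal layer (in the style of Ganin and Lempitsky): a domain discriminator is trained on the shared representation while the reversed gradient pushes the encoder toward domain-invariant features. Whether the study uses exactly this mechanism is an assumption; the PyTorch sketch below shows only the reversal layer itself.

```python
import torch
from torch.autograd import Function

class GradReverse(Function):
    """Gradient reversal layer: identity on the forward pass,
    negated (scaled) gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the gradient sign so the encoder is trained to
        # *confuse* the domain discriminator placed after this layer.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# Toy check: forward is identity, backward flips the gradient's sign.
x = torch.randn(4, 8, requires_grad=True)
y = grad_reverse(x, lam=0.5)
y.sum().backward()
```

In a full model, `grad_reverse` would sit between the shared LSTM encoder and a small domain classifier, so the discriminator learns to identify the domain while the shared features become domain-invariant.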