Survey of Available Datasets for Designing Task Oriented Dialogue Agents

Nitin Namdeo Pise

doi:10.1109/ICMRSISIIT46373.2020.9405898

Dialogue systems have become increasingly popular with recent advances in neural approaches and NLP applied to conversational AI. Alexa, Siri, Cortana, and Google Mini are used by many people to carry out small tasks and control home appliances hands-free. Enterprises are also deploying 24 × 7 dialogue agents in place of traditional customer support to increase user engagement and improve their processes. Dialogue systems are also integrated with robots to improve human-robot dialogue.

Conversational agents are classified into two main types: social bots (chit-chat bots) and task-oriented dialogue agents. Social bots aim to engage the user in unstructured human conversation. These agents have no fixed goal to complete and focus on carrying out open-domain conversations; examples include ALIZA and Microsoft XiaoIce.

Task-oriented dialogue agents, on the other hand, help the user accomplish specific tasks in particular domains such as restaurant booking, flight reservation, and customer support. They are widely used for controlling home appliances and carrying out simple everyday tasks. Siri, Alexa, Google Mini, and Cortana are task-oriented dialogue agents. There is increasing interest in building task-completion dialogue agents that span multiple sub-domains to accomplish a complex user goal.

With the growing acceptance of dialogue agents, there is a need for high-quality, large-scale dialogue datasets so that task-oriented dialogue agents perform well in changing environments. Neural approaches, which require very large datasets, are frequently applied to design intelligent dialogue agents. However, building intelligent task-completion dialogue systems poses the following challenges. First, many datasets are available for chit-chat bots, but they are not directly relevant to task-oriented systems. Second, the system must scale to new domains with limited in-domain data.

In this paper, we study different data collection methods, important characteristics of dialogue datasets, and their potential uses. The paper presents a survey of publicly available datasets and their applicability for designing modern task-oriented dialogue agents. © 2019 IEEE.

Journal	International Conference on Mechatronics, Remote Sensing, Information Systems, and Industrial Information Technologies, ICMRSISIIT 2019
Publisher	Institute of Electrical and Electronics Engineers Inc.
Open Access	No