Improving Conversation Modelling using Attention Based Variational Hierarchical RNN

Authors

  • Sandeep A. Thorat
  • Komal P. Jadhav

DOI:

https://doi.org/10.47839/ijc.20.1.2090

Keywords:

chatbot, conversation agent, response generation, variational hierarchical RNN, deep learning, natural language processing, attention mechanism

Abstract

Conversation modeling is one of the most important applications of natural language processing. Building a response generation model for open-domain conversation in a chatbot is one of the hardest challenges in this area. Deep neural network architectures such as sequence-to-sequence models and their hierarchical variants have brought significant improvements to conversation modeling. However, besides requiring large training corpora, these models may lose substantial information during training, and they are unable to concentrate on the important parts of a given context; both issues degrade the quality of generated responses. To tackle these issues, this work proposes a Variational Hierarchical Conversation RNN with Attention mechanism (VHCRA) for response generation. VHCRA uses a latent variable representation to avoid data degeneracy and an attention mechanism to identify the important parts of the context. The model is trained on a large benchmark dataset, the Cornell Movie Dialog corpus, which contains conversations from a wide range of movies, and is evaluated with automatic metrics such as negative log-likelihood and embedding-based metrics. The experimental results show that the proposed model improves significantly on recently proposed approaches and generates meaningful responses according to the context.
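The embedding-based metrics mentioned in the abstract compare a generated response with a reference response in word-vector space rather than by exact word overlap. As a minimal illustration (not the authors' evaluation code), the common Embedding Average variant can be sketched as the cosine similarity between the mean word embeddings of the two responses; the 3-dimensional vectors below are made up for the example, while real use would load pretrained embeddings such as word2vec or GloVe:

```python
import numpy as np

def embedding_average(vectors):
    # Collapse a sentence's word vectors into one mean vector.
    return np.mean(vectors, axis=0)

def cosine(a, b):
    # Cosine similarity between two vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def embedding_average_score(response_vecs, reference_vecs):
    # Embedding Average metric: cosine similarity between the mean
    # word embeddings of the generated and reference responses.
    return cosine(embedding_average(response_vecs),
                  embedding_average(reference_vecs))

# Toy 3-d "word embeddings" (hypothetical values for illustration only).
resp = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
ref  = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(embedding_average_score(resp, ref))  # identical sentences score close to 1.0
```

A score near 1 indicates the generated response is semantically close to the reference even if it uses different words, which is why such metrics are preferred over pure n-gram overlap for open-domain dialogue.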

Published

2021-03-29

How to Cite

Thorat, S. A., & Jadhav, K. P. (2021). Improving Conversation Modelling using Attention Based Variational Hierarchical RNN. International Journal of Computing, 20(1), 39-45. https://doi.org/10.47839/ijc.20.1.2090

Section

Articles