Abstract
We address the problem that different users have different lexical knowledge of a problem domain, so that automated dialogue systems need to adapt their generation choices online to each user's domain knowledge as they encounter it. We approach this problem using Reinforcement Learning in Markov Decision Processes (MDPs): we present a reinforcement learning framework for learning referring expression generation (REG) policies that adapt dynamically to users with different levels of domain knowledge. In contrast to related work, we also propose a new statistical user model that incorporates the lexical knowledge of different users. We evaluate the framework by showing that it allows us to learn dialogue policies that automatically adapt their choice of referring expressions online to different users, and that these policies significantly outperform hand-coded adaptive policies for this task: the learned policies are consistently between 2 and 8 turns shorter than a range of hand-coded but adaptive baseline REG policies.
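To make the setup concrete, the following is a minimal sketch of this kind of learning problem, not the authors' actual system: a tabular agent chooses between a technical ("jargon") and a descriptive referring expression for each domain object, a simulated user with a hidden lexical-knowledge profile either resolves the reference or triggers a clarification sub-dialogue, and the reward penalises dialogue length. The referents, knowledge profiles, and turn costs are all illustrative assumptions, and the update rule is a simple myopic (bandit-style) stand-in for the full MDP value backup.

```python
import random
from collections import defaultdict

# Minimal sketch, not the authors' system: a tabular agent learns whether
# to use a technical ("jargon") or a descriptive referring expression for
# each object, against a simulated user whose lexical knowledge is hidden.
# REFERENTS, KNOWLEDGE_PROFILES and the turn costs are illustrative.

REFERENTS = ["router", "ethernet_cable", "dsl_filter"]
ACTIONS = ["jargon", "descriptive"]

# Hypothetical user types: probability that the user knows each jargon term.
KNOWLEDGE_PROFILES = {
    "expert": {"router": 0.95, "ethernet_cable": 0.90, "dsl_filter": 0.80},
    "novice": {"router": 0.60, "ethernet_cable": 0.30, "dsl_filter": 0.10},
}

def user_resolves(profile, referent, action):
    """Simulated user: descriptive REs always work; jargon depends on knowledge."""
    if action == "descriptive":
        return True
    return random.random() < KNOWLEDGE_PROFILES[profile][referent]

def run_episode(q, epsilon=0.1, alpha=0.2):
    """One dialogue: refer to every object once; fewer turns is better."""
    profile = random.choice(list(KNOWLEDGE_PROFILES))  # hidden from the agent
    history = ()        # past (referent, action, success) = observable state
    total_turns = 0
    for referent in REFERENTS:
        state = (referent, history)
        if random.random() < epsilon:                  # epsilon-greedy exploration
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        success = user_resolves(profile, referent, action)
        # Assumed costs: jargon is short (1 turn) if understood, but a failed
        # reference forces a clarification sub-dialogue (3 turns); descriptive
        # expressions are always understood but longer (2 turns).
        turns = (1 if success else 3) if action == "jargon" else 2
        total_turns += turns
        # Myopic update on negative turn cost (a bandit-style stand-in for
        # the full MDP value backup used in the chapter's framework).
        q[(state, action)] += alpha * (-turns - q[(state, action)])
        history += ((referent, action, success),)
    return total_turns

q = defaultdict(float)
for _ in range(20000):
    run_episode(q)
```

In this toy setting, the trained policy typically probes with a jargon term early and then conditions later choices on whether that reference succeeded, which is the kind of online adaptation to the user's lexical knowledge that the chapter evaluates.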
Cite this chapter
Janarthanam, S., Lemon, O. (2010). Learning Adaptive Referring Expression Generation Policies for Spoken Dialogue Systems. In: Krahmer, E., Theune, M. (eds.) Empirical Methods in Natural Language Generation. Lecture Notes in Computer Science, vol. 5790. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15573-4_4