Adaptive Dialogue Coordination for Sociable Agents
DiaCoSA explores how speakers coordinate their interaction in dialogue by providing communicative feedback and by adapting their utterances to one another's. Building on empirical analyses of human dialogues, a new type of artificial ‘attentive communicator’ is developed in the form of a virtual conversational agent that actively strives to ensure understanding by being sensitive to the user's feedback, possibly even eliciting it, and by adapting its lexico-grammatical choices to the interlocutor's.
A major prerequisite for trouble-free and efficient interaction in dialogue is the ability of interlocutors to coordinate their communicative actions. One basic mechanism for this is communicative feedback: short and unobtrusive vocal signals (e.g., ‘mhm’, ‘okay’) or nonverbal behaviours (e.g., head nods, frowns, gaze) with which listeners continuously provide information back to the speaker. Another pertinent mechanism is the adaptation of one's utterances to words and constructions that either interlocutor has used before. Coordination mechanisms of this kind, and a good understanding of their combined use, should also be integral to systems that are meant to interact with their users in natural language, e.g., dialogue systems, conversational virtual agents, or social robots. Current systems, however, lack the ability to recognise feedback signals from their users and to respond by flexibly adapting their communicative actions.

DiaCoSA investigates (1) how and when humans use feedback in natural dialogue; (2) how machines, as a new kind of ‘attentive communicator’, can draw upon their users' feedback to keep track of and react to their state of understanding; (3) what role lexico-grammatical adaptations play in this; and (4) how such systems can advance spoken language interaction between humans and machines.

The project starts with an empirical study of human feedback use, based on cooperative dialogues recorded in a laboratory setting, in which vocal and nonverbal feedback signals as well as other relevant actions are annotated and analysed with respect to their occurrence, form, function, and context. Based on these data, Bayesian decision networks (probabilistic graphical models) are built to approximate the dependencies and causal relations in feedback use. An LTAG-based sentence planner is built that plans utterances incrementally and takes priming-based activations into account in its lexico-grammatical decisions. The models are implemented and put to use in the highly expressive virtual character ‘BILLIE’ (Bielefeld Life-Like Interactive agEnt), which is equipped with speech recognition, head/gaze tracking, and facial feature tracking technology. This system is applied in the domain of a personal calendar assistant and evaluated in face-to-face interactions with human users.
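The general idea behind such a decision network can be illustrated with a minimal sketch: a latent variable captures whether the user has understood the agent's last utterance, vocal and head feedback are treated as conditionally independent observations of it, and the agent selects its next move by maximising expected utility over its belief. All variable names, probabilities, utilities, and actions in the sketch are invented placeholders for illustration; the project's actual networks are derived from the annotated corpus and are considerably richer.

    # Minimal, illustrative sketch of belief update and action choice in a
    # Bayesian decision network for feedback interpretation. All numbers are
    # invented placeholders, not the project's actual model parameters.

    # Latent listener state: has the user understood the agent's last utterance?
    P_U = {"grounded": 0.7, "not_grounded": 0.3}  # assumed prior

    # Observation models: vocal feedback and head gesture given the latent state.
    P_VOCAL = {  # P(vocal signal | U)
        "grounded":     {"positive": 0.60, "negative": 0.05, "none": 0.35},
        "not_grounded": {"positive": 0.10, "negative": 0.40, "none": 0.50},
    }
    P_HEAD = {   # P(head gesture | U)
        "grounded":     {"nod": 0.50, "frown": 0.05, "none": 0.45},
        "not_grounded": {"nod": 0.10, "frown": 0.40, "none": 0.50},
    }

    def posterior_understanding(vocal, head):
        """P(U | vocal, head), assuming the observations are conditionally
        independent given U in this toy model."""
        joint = {u: P_U[u] * P_VOCAL[u][vocal] * P_HEAD[u][head] for u in P_U}
        total = sum(joint.values())
        return {u: p / total for u, p in joint.items()}

    # Decision part: expected utility of the agent's next move under its belief.
    UTILITY = {  # utility(action, state), invented for illustration
        "continue":        {"grounded": 1.0, "not_grounded": -1.0},
        "elicit_feedback": {"grounded": 0.2, "not_grounded":  0.6},
        "rephrase":        {"grounded": -0.2, "not_grounded": 0.8},
    }

    def choose_action(belief):
        expected = {a: sum(belief[u] * UTILITY[a][u] for u in belief)
                    for a in UTILITY}
        return max(expected, key=expected.get)

    belief = posterior_understanding(vocal="none", head="frown")
    print(belief)                 # belief leans towards 'not_grounded'
    print(choose_action(belief))  # -> 'rephrase'

In this toy example, an observed frown with no vocal feedback shifts the belief towards non-understanding, so expected-utility maximisation favours rephrasing over simply continuing, which is the kind of attentive behaviour the learned networks are meant to produce.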
The main expected outcome of the DiaCoSA project is a new kind of artificial ‘attentive communicator’ that keeps track of and actively strives to ensure its user's understanding. This includes the abilities to recognise feedback even while the agent itself is speaking, to actively elicit feedback when necessary, and to respond to it by adapting its subsequent communicative actions. As a prototype, a virtual conversational agent with these abilities will be developed and implemented in a calendar planning domain. Additional specific outcomes of the project are (1) a corpus of empirical data on multimodal feedback use in human-human dialogue, (2) manuals and coding guidelines for the annotation and analysis of feedback behaviour, and (3) computational techniques and models for the recognition and interpretation of vocal and nonverbal user feedback, for priming-based sentence planning, and for incremental adaptation.
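The priming-based adaptation mentioned in (3) can likewise be sketched in a simplified form: each lexical alternative carries an activation value that is boosted whenever a form is used and that decays over time, and the planner prefers the currently most activated alternative. The word pair, boost, and decay rate below are hypothetical, and the actual planner operates over LTAG structures rather than bare word lists.

    # Minimal sketch of priming-based lexical choice: activations decay over
    # time and are boosted when a form is used; the planner picks the most
    # activated alternative. All names and numbers are illustrative only.
    import math

    class PrimingLexicon:
        def __init__(self, alternatives, decay=0.1):
            self.activation = {w: 0.0 for w in alternatives}
            self.decay = decay

        def observe(self, word, boost=1.0):
            # A form was just used (by either interlocutor): raise its activation.
            if word in self.activation:
                self.activation[word] += boost

        def tick(self, dt=1.0):
            # Exponential decay of all activations as time passes.
            for w in self.activation:
                self.activation[w] *= math.exp(-self.decay * dt)

        def choose(self):
            # Prefer the currently most activated alternative.
            return max(self.activation, key=self.activation.get)

    # Two hypothetical ways to refer to the same calendar concept.
    lexicon = PrimingLexicon(["appointment", "meeting"])
    lexicon.observe("meeting")   # the user said 'meeting'
    lexicon.tick(dt=2.0)         # some time passes
    print(lexicon.choose())      # -> 'meeting'

After the user has said ‘meeting’, the sketch selects ‘meeting’ over ‘appointment’ until the boosted activation has decayed away, which reflects the lexical alignment effect the planner is meant to reproduce.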