I develop full-body embodied conversational agents and immersive environments at UTEP's Interactive Systems Group, as part of the Advanced Agent Engagement Team. I work on creating engaging interactions between humans and virtual agents, interactions that build rapport over time.
What is an Embodied Conversational Agent?
The term Embodied Conversational Agent (ECA) refers to a form of human-computer interaction in which intelligent agents live in a virtual environment and communicate through elaborate user interfaces. Graphically embodied agents can take almost any form, often human-like, and aim to unite gesture, facial expression, and speech to enable face-to-face communication with users, providing a powerful means of human-computer interaction (Cassell, 2000).
What’s the problem?
ECAs are not humans. Humans know that, and thus we interact with them as we would with a computer (which, after all, they are). My research aims to improve the naturalness and effectiveness of ECA interfaces by increasing rapport between agents and humans.
“The more human the human thinks it is being perceived, the more human the human behaves.”
For example, if the human realizes that not looking at a talking ECA on a computer screen comes across as inattentive or even rude, and the ECA reacts to it, then the human is likely to pay attention to the screen.
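As a rough illustration of the kind of rule involved, the sketch below decides when an agent should react to sustained user inattention. The gaze-sample structure, the two-second threshold, and the function names are illustrative assumptions, not the actual mechanism in our system.

```python
from dataclasses import dataclass

LOOK_AWAY_THRESHOLD = 2.0  # assumed seconds of sustained looking away before reacting


@dataclass
class GazeSample:
    timestamp: float  # seconds since the interaction started
    on_screen: bool   # whether the (hypothetical) tracker reports gaze on the screen


def should_react(samples, threshold=LOOK_AWAY_THRESHOLD):
    """Return True if the user has looked away continuously for `threshold` seconds."""
    away_start = None
    for s in samples:
        if s.on_screen:
            away_start = None            # gaze returned: reset the timer
        elif away_start is None:
            away_start = s.timestamp     # first sample of a look-away stretch
        elif s.timestamp - away_start >= threshold:
            return True                  # sustained inattention: time for the agent to react
    return False
```

A real system would feed this from a gaze tracker in a streaming loop and trigger a nonverbal or verbal reaction when it fires; here the point is only that the agent's "noticing" can come from a simple temporal rule over gaze data.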
What is rapport in the first place?
There are many definitions of rapport, and many different aspects to consider. In short, rapport is the feeling of mutual understanding, or the sensation of being “in sync” with another person. I attempt to build these feelings across a human-ECA interaction.
Here is a short video where we briefly explain what we do:
For a complete list of publications go here.
Gris, I., Novick, D., Camacho, A., Rivera, D. A., Gutierrez, M., & Rayon, A. (2014, January). Recorded Speech, Virtual Environments, and the Effectiveness of Embodied Conversational Agents. In Intelligent Virtual Agents (pp. 182-185). Springer International Publishing.
Novick, D., & Gris, I. (2014). Building Rapport between Humans and ECAs: A Pilot Study. In Human-Computer Interaction. Advanced Interaction Modalities and Techniques (pp. 472-480). Springer International Publishing.
Gris, I. (2013, December). Adaptive Virtual Rapport for Embodied Conversational Agents. In Proceedings of the 15th ACM International Conference on Multimodal Interaction (pp. 341-344). ACM.
Novick, D., & Gris, I. (2013). Grounding and Turn-Taking in Multimodal Multiparty Conversation. In Human-Computer Interaction. Interaction Modalities and Techniques (pp. 97-106). Springer Berlin Heidelberg.
Novick, D., Adoneth, G., Manuel, D., & Gris, I. (2012). When the Conversation Starts: An Empirical Analysis. In Workshop on Real-Time Conversations with Virtual Agents, held at the 12th International Conference on Intelligent Virtual Agents (IVA 2012), Santa Cruz, California.
You can click on the titles for more information.
The agent, Adriana, leads the user through a series of activities and conversations while playing a game. The game simulates a survival scenario in which you have to collaborate, cooperate, and build a relationship with the ECA to survive a week on a deserted island. The simulation is designed to maximize rapport-building opportunities and to take advantage of nonverbal behaviors in a more immersive environment, where both the user and the agent can interact with the same objects in virtual space. The storyline allows the necessary flexibility and decision making without creating a completely open environment, where tasks would otherwise be difficult to set up and evaluate. In addition, Adriana is capable of soliciting personal information and making small talk in a carefully controlled environment; she solicits information verbally by asking the user questions drawn from adaptive questionnaires.
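One way to picture how such adaptive questioning could work is a question bank gated by the current rapport level, so the agent only asks more personal questions once enough rapport has been built. The question texts, the numeric intimacy scale, and the selection rule below are all stand-ins for illustration, not Adriana's actual question bank.

```python
# Hypothetical question bank: each entry carries an assumed "intimacy" level.
QUESTIONS = [
    {"text": "What's your name?",                 "intimacy": 0},
    {"text": "Have you ever been camping?",       "intimacy": 1},
    {"text": "Who would you miss the most here?", "intimacy": 2},
]


def next_question(asked, rapport_level):
    """Pick the most personal unasked question the current rapport level allows."""
    candidates = [q for q in QUESTIONS
                  if q["text"] not in asked and q["intimacy"] <= rapport_level]
    if not candidates:
        return None  # nothing appropriate left to ask at this rapport level
    # Prefer the most intimate question we are currently allowed to ask.
    return max(candidates, key=lambda q: q["intimacy"])
```

In this sketch, raising the rapport estimate unlocks more personal small talk, which is the controlled-escalation idea behind adaptive questionnaires.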
The research we report here forms part of a longer-term project to provide embodied conversational agents (ECAs) with behaviors that enable them to build and maintain rapport with their human partners. We focus on paralinguistic behaviors, and especially nonverbal behaviors, and their role in communicating rapport. Accordingly, this study piloted the investigation of how to signal increased familiarity over repeated interactions as a component of rapport. We studied the effect of differences in the amplitude of nonverbal behaviors by an ECA interacting with a human across two conversational sessions. Our main question was whether subjects would perceive the agent in the increased-familiarity condition as having higher rapport in the second session.
This is an exploratory study of what happens when a person enters a room where people are conversing, based on an analysis of 61 episodes drawn from the UTEP-ICT cross-cultural multiparty multimodal dialog corpus. We examine the reliability of coding of gaze and stance, and we develop a model for room entry that accounts for observed differences in the behaviors of conversants, expressed as a state-transition model. Our model includes factors such as conversational task, not considered in existing social-force models, which appear to affect conversants' behaviors. We then applied this model to a set of four embodied conversational agents that reacted accordingly when a person entered the room in which they were conversing. This is where we first tested our virtual environment and virtual characters.
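A state-transition model of this kind can be represented very compactly as a transition table. The states and events below are invented placeholders for illustration; the actual states and transitions in our model were derived from the coded corpus episodes.

```python
# Illustrative room-entry state machine. State names and events are assumptions,
# not the states coded in the UTEP-ICT corpus study.
TRANSITIONS = {
    ("outside",      "enters_room"):     "entering",
    ("entering",     "gaze_from_group"): "acknowledged",
    ("entering",     "no_gaze"):         "waiting",
    ("waiting",      "gaze_from_group"): "acknowledged",
    ("acknowledged", "stance_opens"):    "joined",
}


def step(state, event):
    """Advance the entry model one event; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="outside"):
    """Run a sequence of observed events through the model and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Driving four ECAs from a table like this means each agent only has to react to the newcomer's current state (e.g., shifting gaze when the model reaches "acknowledged"), rather than interpreting raw sensor data directly.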
This study explores the empirical basis for multimodal conversation control acts. Applying conversation analysis as an exploratory approach, we attempt to illuminate the control functions of paralinguistic behaviors in managing multiparty conversation. We contrast our multiparty analysis with an earlier dyadic analysis and, to the extent permitted by our small samples of the corpus, contrast (a) conversations where the conversants did or did not have an artifact, and (b) conversations in English among Americans with conversations in Spanish among Mexicans. Our analysis suggests that speakers tend not to use gaze shifts to cue nodding for grounding and that the presence of an artifact reduced listeners’ gaze at the speaker. These observations remained relatively consistent across the two languages.
We have a great immersion lab. It’s a place where we can project across a whole wall and record with some Kinects and cameras. Check it out.
Our group is composed of very talented people:
David Novick – Professor in Computer Science, Principal Investigator
Ivan Gris – Ph.D. Student in Computer Science, Postdoctoral Researcher
Diego Rivera – B.S. Student in Computer Science, Network, Modeling and Animations
Carolina Camacho – B.S. Student in Computer Science, Gesture Recognition and Mo-Cap and Voice Actor
Alex Rayon – B.S. Student in Computer Science, Scene Script Writer, Gesture Analyst and Level Designer
Past members include: Mario Gutierrez, Anuar Jauregui, Angelica Martinez, Keicha Myers, Joel Quintana, Juan G. Vicario, and Baltazar Santaella.