This section describes my academic research with virtual agents. Academic publications are available here.
Throughout my academic career I worked on building engaging interactions between humans and virtual agents that develop rapport over time. I worked on human-like agents that aim to unite gesture, facial expression, dialog models, and speech to enable face-to-face communication between users and computer-generated characters. My research focused on improving the naturalness and effectiveness of human-agent communication by creating rapport (the feeling of mutual understanding, or the sensation of being “in sync”) between users and agents.
Later, my work shifted to the development of training applications and assistive technologies for enterprise use cases, resulting in the creation of Boost Human.
Currently, I work on creating new interfaces and interaction techniques within mixed reality and all its variants, and on improving experiences through the use of powerful edge compute clusters and networks.
Below are samples and videos of projects resulting from these efforts, in reverse chronological order. None of these projects would have been possible without the work of dozens of individuals supporting their design and development.
Virtual Patients (Augmented Reality / HoloLens) 2019 – 2020
Led the development of six holographic augmented reality experiences representing medical patients with speech capabilities. Each patient is displayed through a HoloLens 1 or HoloLens 2 while the user of the training system follows procedures or diagnoses the patient. An additional hologram representing a medical professional can appear to assist users while they diagnose these patients. The virtual patients use an updated version of the Virtual Agent Interaction Framework (VAIF), an open-source platform for creating virtual humans that I developed in 2017. This project was possible thanks to the work of Alex Rayon, Laura Avila, and Julie Hinojos, among many others.
Smart Guides (Augmented Reality / HoloLens, Vuzix, ODG, Android, iOS) 2018 – 2019
Developed a guidance system that records procedures from head-mounted devices from an expert’s point of view, then categorizes them and loads them on other devices for offline access. Inexperienced users can then access these recorded clips hands-free through voice commands and follow along. Additional capabilities, such as displaying holographic blueprints, 3D models, videos, images, and diagrams, were added on a case-by-case basis. To protect proprietary information and content, the following video shows a technical demo rather than a commercial application.
Telepresence (Augmented Reality / Vuzix, ODG, Android, iOS) 2017 – 2019
Telepresence was, at the time, one of the first applications to offer augmented-reality-enhanced telecommunications. It was available for PC, Android, iOS, and some augmented reality HMDs. Telepresence allows users to share their point of view through a video stream and provides a set of augmented reality tools to draw, mark, point, and interact as if everyone were in the same room.
Virtual Training (Automotive) 2017
Developed the initial version of an automotive virtual reality training system for quality assurance. This was an interactive application for training in the manual inspection of steering wheels, and the project that launched Boost Human as a startup. Test administrators would configure a test, including items, defects, difficulty, speed, and other industry-standard regulations. Users would then grab virtual steering wheels (over 2000 different items), inspect them, and classify them. The application would return test results at the end and evaluate users across attempts while providing other statistical information. The latest version, look, and feel of this product is possible thanks to the work of Marco Antonio Lopez.
Unreleased Prototypes: Dino Guides, Home Tours, Anatomy AR (Augmented Reality / Android, iOS)
Led the development of three unreleased applications built by the Boost Human team. The latter two exist thanks to the great work of Juan Antonio Lopez.
Dino Guides allowed people to walk along a 2-mile hiking trail in the desert with visual and auditory cues that helped them find real dinosaur tracks scattered along the trail. Additional information and 3D walking dinosaurs would also appear. The prototype was created for the museum that manages the park.
Home Tours is a prototype for a home builder that loads their 3D home models onto actual construction lots, to scale, allowing users to walk inside and throughout the houses and even check the view through the windows. Development was halted after we added some controls for furnishing the spaces.
Anatomy AR was developed as a tech demo to show our education sector customers what mobile AR looks like. It is a 3D representation of the human body (muscles and bones) in AR that allows users to walk around, filter, and select body parts. Later we added a quiz option to gamify learning.
Boston (History Education) 2016 – 2017
The Boston Massacre History Experience is a multimodal interactive virtual-reality application aimed at eighth-grade students, directed by my colleague Laura Rodriguez. I co-directed this project in its early stages and later transitioned to a technical adviser role. In the application, students have conversations with seven embodied conversational agents, representing characters such as John Adams, a tea-shop owner, and a redcoat. The experience ends with a conversation with Abigail Adams, in which students explain what they learned, and a narration by John Adams about the events that followed. The application is novel technically because it provides a form of user initiative in conversation and features an unprecedented number of virtual agents, some 486 in all.
Gods of the Neon City (Hidden Agent Motives) 2016 – 2017
Gods of the Neon City is a speech-enabled virtual reality game. You follow a private investigator with apparent hidden motives. Throughout the adventure you visit futuristic cyberpunk environments that reveal a conspiracy spanning several years that caused the character’s home city to flood. During the experience you unravel the agent’s goals and motives through your dialog choices, potentially reaching conflicting goals. The original intention of the project was to measure trust-building activities between humans and virtual agents. I directed the game together with my colleague Alex Rayon, and it was developed with the help of over 20 student volunteers.
Merlin (Speech Recognition in Virtual Reality) 2015 – 2016
As you open your eyes, a young man stands in front of you. He waves his arms around, surprised and perhaps scared. He approaches you cautiously, wanting to know your name and how you got there. He awaits an answer, and you start talking through a microphone. Thus begins your new virtual-reality adventure. This storytelling adventure showcases Inmerssion’s technology, taking you and Merlin through peaceful villages, haunted forests, and ancient ruins. Explore and discover new places, befriend Merlin, learn to cast spells, and make decisions that will change the course of your story and your relationship with Merlin.
Harry Potter (IBM Watson Smart Agent Integration) 2014 – 2015
Have you ever wondered what it would be like to talk to your favorite character? Inmerssion developed a virtual Harry Potter that you can talk to! People can ask him questions about his life and adventures through speech recognition. Our goal was to create a natural interaction between a virtual agent and a human. I led the development effort to recreate Harry Potter’s knowledge base, feeding IBM’s Watson all the Harry Potter books for it to process, and to build a natural way for people to interact with it. This is great for Harry, and a great tool to promote books, video games, movies, or other performances. It also has other uses in training and simulation.
Virtual Rapport (Adventure) Project 2012 – 2015
The agent, Adriana, is an embodied conversational agent (ECA) that leads the user through a series of activities and conversations while playing a game. The game simulates a survival scenario where you have to collaborate, cooperate, and build a relationship with the ECA to survive a week on a deserted island. This simulation is built to maximize rapport-building opportunities and to take advantage of nonverbal behaviors in a more immersive environment, where both the user and the agent can interact with the same objects in virtual space. The storyline allows the necessary flexibility and decision making without creating a completely open environment, where tasks would be difficult to set up and evaluate. In addition, Adriana is capable of soliciting personal information and making small talk in a carefully controlled environment. The ECA solicits information verbally by asking the user several questions based on adaptive questionnaires.
Familiarity (Vampire) Project 2012 – 2014
The research we report here forms part of a longer-term project to provide embodied conversational agents (ECAs) with behaviors that enable them to build and maintain rapport with their human partners. We focus on paralinguistic behaviors, and especially nonverbal behaviors, and their role in communicating rapport. Accordingly, this study piloted the investigation of how to signal increased familiarity over repeated interactions as a component of rapport. We studied the effect of differences in the amplitude of nonverbal behaviors by an ECA interacting with a human across two conversational sessions. Our main question was whether subjects in the increased-familiarity condition would perceive higher rapport with the agent in the second session.
Multiparty Agents (Enter the Room) Project, 2012
This is an exploratory study of what happens when a person enters a room where people are conversing, based on an analysis of 61 episodes drawn from the UTEP-ICT cross-cultural multiparty multimodal dialog corpus. We examine the reliability of coding of gaze and stance, and we develop a model for room entry that accounts for observed differences in the behaviors of conversants, expressed as a state-transition model. Our model includes factors such as conversational task, not considered in existing social-force models, which appear to affect conversants’ behaviors. We then applied this model to a set of four embodied conversational agents that reacted accordingly when a person entered the room in which they were conversing. This project is where we first tested our virtual environment and virtual characters.
Grounding and Turn-Taking in Multimodal Multiparty Conversation, 2012
This study explores the empirical basis for multimodal conversation control acts. Applying conversation analysis as an exploratory approach, we attempt to illuminate the control functions of paralinguistic behaviors in managing multiparty conversation. We contrast our multiparty analysis with an earlier dyadic analysis and, to the extent permitted by our small samples of the corpus, contrast (a) conversations where the conversants did or did not have an artifact, and (b) conversations in English among Americans with conversations in Spanish among Mexicans. Our analysis suggests that speakers tend not to use gaze shifts to cue nodding for grounding and that the presence of an artifact reduced listeners’ gaze at the speaker. These observations remained relatively consistent across the two languages.
For most of these projects I worked at the University of Texas at El Paso (UTEP), with the Interactive Systems Group (ISG). The Immersion lab features a projection room. Subjects stand in the middle of the room and interact with virtual agents while we record them with strategically placed cameras and Kinect sensors. The projection is displayed on a 15-by-10-foot wall, and the surrounding physical space is decorated to match the virtual environment of the current experiment. Lately the team’s efforts have been redirected toward virtual reality applications.