Projects

Throughout my research career, I worked on human-like agents that combine gestures, facial expressions, dialog models, and speech to enable face-to-face communication between users and computer-generated characters. My research focused on improving the naturalness and effectiveness of human-agent communication. Later, I commercialized this research by developing training applications and assistive technologies for enterprise use cases. Now, as the deputy chief technology officer for El Paso, TX, the 22nd largest city in the US, I lead our most significant software projects.

Here are some of the largest, most fun projects I worked on, listed chronologically, starting with the most recent. Scholarly publications related to these projects are available here.

Virtual Patients (Augmented Reality / HoloLens) 2019 – 2020

Led the development of six holographic augmented reality experiences representing medical patients with speech capabilities. Each patient is displayed through a HoloLens 1 or HoloLens 2, while the user of the training system must follow procedures or diagnose the patients. An additional hologram representing a medical professional can appear to assist users while they diagnose these patients. Virtual patients utilize an updated version of the Virtual Agent Interaction Framework (VAIF), an open-source platform for creating virtual humans that I co-developed. This project was possible thanks to the work of Alex Rayon, Laura Avila, and Julie Hinojos, among many others.

Smart Guides (Augmented Reality / HoloLens, Vuzix, ODG, Android, iOS) 2018 – 2019

Developed a guidance system that recorded procedures from an expert’s point of view using head-mounted devices, then categorized the recordings and loaded them onto other devices for offline access. Inexperienced users could then access these recorded clips hands-free through voice commands. Additional capabilities, such as displaying holographic blueprints, 3D models, videos, images, and diagrams, were added on a case-by-case basis. To protect proprietary information and content, the following video shows a technical demo rather than a commercial application.

Telepresence (Augmented Reality / Vuzix, ODG, Android, iOS) 2017 – 2019

Telepresence was one of the first applications to offer augmented reality-enhanced telecommunications. It is available for PC, Android, iOS, and some augmented reality HMDs. Telepresence allows users to share their point of view through a video stream and provides augmented reality tools to draw, mark, point, and interact as if everyone were in the same room.

Virtual Training (Automotive) 2017

Developed the initial version of an automotive virtual reality training system for quality assurance. This was an interactive application for training in the manual inspection of steering wheels, and it was the project that launched Boost Human as a startup. Test administrators would configure a test, including items, defects, difficulty, speed, and other industry-standard regulations. Users would then grab virtual steering wheels (over 2000 different items), inspect them, and classify them. The application would return test results at the end and evaluate users across attempts while providing other statistical information. The latest version, look, and feel of this product are possible thanks to the work of Marco Antonio Lopez.

Unreleased Prototypes: Dino Guides, Home Tours, Anatomy AR (Augmented Reality / Android, iOS)

I led the development of three unreleased applications built by the Boost Human team. The latter two exist thanks to Juan Antonio Lopez’s great work.

Dino Guides allowed people to walk a 2-mile hiking trail in the desert, with visual and auditory cues that helped them find real dinosaur tracks scattered along the trail. Additional information and 3D walking dinosaurs would also appear. The prototype was created for the museum that manages the park.

Home Tours is a prototype for a home builder that loads the builder’s 3D home models to scale in actual construction lots, allowing users to walk inside and throughout the houses and even check the view through the windows. Development was halted after we added some controls for furnishing the spaces.

Anatomy AR was developed as a tech demo to explain what mobile AR looks like to our education sector customers. It is a 3D representation of the human body (muscles and bones) in AR. Users can walk around, filter, and select body parts. Later, we added a quiz option to gamify learning.

Boston (History Education) 2016 – 2017

The Boston Massacre History Experience is a multimodal interactive virtual-reality application for eighth-grade students, directed by my colleague Laura Rodriguez. I co-directed this project in its early stages and later transitioned to a technical adviser role. In the application, students converse with seven embodied conversational agents, representing characters such as John Adams, a tea shop owner, and a redcoat. The experience ends with a conversation with Abigail Adams, in which students explain what they learned, and a narration by John Adams about the events that followed. The application is technically novel because it provides a form of user initiative in conversation and features an unprecedented number of virtual agents.

Gods of the Neon City (Hidden Agent Motives) 2016 – 2017

Gods of the Neon City is a speech-enabled virtual reality game. You follow a private investigator with apparently hidden motives. Throughout the adventure, you visit futuristic cyberpunk environments that reveal a conspiracy spanning several years that caused the character’s home city to flood. During the experience, you unravel the agent’s goals and motives through your dialog choices, potentially reaching conflicting goals. The project’s original intention was to measure trust-building activities between humans and virtual agents. My colleague Alex Rayon and I directed the game, which was developed with the help of over 20 student volunteers.

Merlin (Speech Recognition in Virtual Reality) 2015 – 2016

As you open your eyes, you notice a young man standing before you. He waves his arms around, surprised and perhaps scared. He approaches you cautiously, wanting to know your name and how you got there. He awaits an answer, and you start talking through a microphone. Thus begins your new virtual-reality adventure. This storytelling adventure showcases Inmerssion’s technology, taking you and Merlin through peaceful villages, haunted forests, and ancient ruins. Explore and discover new places, befriend Merlin, learn to cast spells, and make decisions that will change the course of your story and your relationship with Merlin. Watch the short demonstration here.

Harry Potter (IBM Watson Smart Agent Integration) 2014 – 2015

Have you ever wondered what talking to your favorite character would be like? Inmerssion developed a virtual Harry Potter that you can talk to! Through speech recognition, people can ask him questions about his life and adventures. We aimed to create a natural interaction between a virtual agent and a human. I led the development effort to recreate Harry Potter’s knowledge base, feeding IBM’s Watson all the Harry Potter books to process and building a natural way for people to interact with it. This is great for Harry and a great tool to promote books, video games, movies, or other performances, and it has other uses for training and simulation.

Virtual Rapport (Adventure) Project 2012 – 2015

The agent, Adriana, leads the user through activities and conversations while playing a game. The game simulates a survival scenario where you must collaborate, cooperate, and build a relationship with the embodied conversational agent (ECA) to survive a week on a deserted island. This simulation is built to maximize rapport-building opportunities and take advantage of non-verbal behaviors in a more immersive environment, where both the user and the agent can interact with the same objects in virtual space. The storyline allows flexibility and decision-making without creating a completely open environment where tasks would otherwise be complex to set up and evaluate. In addition, Adriana is capable of soliciting personal information and establishing small talk in a carefully controlled environment. The ECA can solicit information verbally by asking the user several questions based on adaptive questionnaires.

Familiarity (Vampire) Project 2012 – 2014

The research we report here forms part of a longer-term project to provide embodied conversational agents (ECAs) with behaviors that enable them to build and maintain rapport with their human partners. We focus on paralinguistic behaviors, especially nonverbal behaviors, and their role in communicating rapport. Accordingly, this pilot study investigated how to signal increased familiarity over repeated interactions as a component of rapport. We studied the effect of differences in the amplitude of nonverbal behaviors by an ECA interacting with a human across two conversational sessions. Our main question was whether subjects would perceive higher rapport with the agent in the increased-familiarity condition during the second session.

Multiparty Agents (Enter the Room) Project 2012

This is an exploratory study of what happens when a person enters a room where people are conversing, based on an analysis of 61 episodes drawn from the UTEP-ICT cross-cultural multiparty multimodal dialog corpus. We examine the reliability of coding of gaze and stance, and we develop a model for room entry, expressed as a state-transition model, that accounts for observed differences in the behaviors of conversants. Our model includes factors, such as conversational task, that are not considered in existing social-force models and that appear to affect conversants’ behaviors. We then applied this model to four embodied conversational agents that reacted accordingly when a person entered the room where they conversed. This project is where we first tested our virtual environment and virtual characters.

Grounding and Turn-Taking in Multimodal Multiparty Conversation, 2012

This study explores the empirical basis for multimodal conversation control acts. Applying conversation analysis as an exploratory approach, we attempt to illuminate the control functions of paralinguistic behaviors in managing multiparty conversation. We contrast our multiparty analysis with an earlier dyadic analysis and, to the extent permitted by our small samples of the corpus, contrast (a) conversations where the conversants did or did not have an artifact and (b) conversations in English among Americans with conversations in Spanish among Mexicans. Our analysis suggests that speakers tend not to use gaze shifts to cue nodding for grounding and that the presence of an artifact reduces listeners’ gaze at the speaker. These observations remained relatively consistent across the two languages.

The Lab

For most of these projects, I worked at the University of Texas at El Paso (UTEP) with the Interactive Systems Group (ISG). The Immersion lab features a projection room. Subjects stand in the middle of the room and interact with virtual agents while we record them with strategically placed cameras and Kinect sensors. The projection is displayed on a 15-by-10-foot wall, and the surrounding physical space is decorated to match the virtual environment of the current experiment. Lately, the team’s efforts have been redirected into virtual reality applications.