Artificial Emotional Intelligence

Speaker:  Javier Gonzalez-Sanchez – Tempe, AZ, United States
Topic(s):  Artificial Intelligence, Machine Learning, Computer Vision, Natural Language Processing

Abstract

This lecture presents a pragmatic view of the machine learning workflow used to build Emotion AI. Emotion AI (artificial emotional intelligence) is the subset of artificial intelligence that studies and develops systems and devices able to recognize, interpret, process, simulate, and react to human emotions. It is an interdisciplinary field spanning computer science, psychology, and cognitive science, and it represents a step forward in human-computer interaction.

Imagine a world in which machines have emotional intelligence, i.e., they can understand the emotional state of humans and adapt their behavior to give appropriate (empathetic) responses to those emotions. Consider these examples: (1) an intelligent tutor that detects students’ affect can recognize and respond to a student’s need for support, for example by providing encouraging comments, altering the level of feedback and hints, or adjusting task difficulty; (2) a video game can become more compelling by using players’ affect as input to alter and adjust the gaming environment, such as lighting, music, colors, complexity, or level of companionship; (3) an avatar in a virtual world can mirror a human’s affective expressions and become more believable, likable, trustworthy, and enjoyable; and (4) a healthcare application can provide empathetic interventions and motivational support to assist and empower patients to improve their quality of life.

Detecting emotional information begins by capturing data about a human’s physiological state or behavior. The data gathered are analogous to the cues humans use to perceive emotions in others: a video camera might capture facial gestures and body posture, while a microphone might capture speech tones. Specialized devices can detect emotional cues by directly measuring physiological signals such as skin temperature or heart rate; others can even detect brain activity or track eye movements and pupil dilation. Meaningful patterns can then be extracted from the gathered data using machine learning techniques. The goal of these techniques is to produce labels that match the labels a human perceiver would give in the same situation. For example, if a person furrows their brow, the computer might be taught to label that expression as confusion.
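As a concrete illustration of that last step, the minimal Python sketch below trains a classifier to map extracted facial features to emotion labels using scikit-learn. The feature count, the synthetic data, and the label set are illustrative assumptions, not material from the lecture; a real pipeline would extract features (e.g., facial action-unit intensities) from video and use labels assigned by human annotators.

    # Minimal sketch: learning to map facial features to human-assigned labels.
    # ASSUMPTIONS: the features and labels below are synthetic placeholders;
    # a real system would compute features from captured video and collect
    # labels from human observers of the same recordings.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # 200 samples of 5 hypothetical facial features (e.g., brow height,
    # lip-corner distance) and the emotion label a human perceiver gave.
    X = rng.normal(size=(200, 5))
    y = rng.choice(["confused", "engaged", "bored"], size=200)

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Fit a classifier and check how often it agrees with held-out human labels.
    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
    print("agreement with human labels:", model.score(X_test, y_test))

With real annotated features in place of the random placeholders, the reported score measures how closely the model’s labels match a human perceiver’s.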

The key questions to be answered include: What data is used? What pre-processing is needed? What models work best for each type of data? How can diverse channels be fused? And what are the challenges in training and testing these models?
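One common answer to the fusion question is feature-level (early) fusion: the feature vectors from each channel are concatenated into a single vector before classification. The Python sketch below illustrates the idea with synthetic face, voice, and physiological features; the channel dimensions, values, and classifier choice are assumptions made only for illustration.

    # Minimal sketch of feature-level (early) fusion across three channels.
    # ASSUMPTIONS: channel dimensions and synthetic values are placeholders.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 100
    face = rng.normal(size=(n, 8))    # e.g., facial action-unit intensities
    voice = rng.normal(size=(n, 4))   # e.g., pitch and energy statistics
    physio = rng.normal(size=(n, 2))  # e.g., heart rate, skin temperature
    y = rng.integers(0, 3, size=n)    # emotion class per sample

    # Early fusion: concatenate per-channel features into one vector per sample.
    fused = np.concatenate([face, voice, physio], axis=1)  # shape (n, 14)

    clf = LogisticRegression(max_iter=1000).fit(fused, y)
    print(clf.predict(fused[:5]))

The main alternative, decision-level (late) fusion, trains a separate model per channel and combines their predictions, for example by voting or by averaging class probabilities; the trade-off between the two is one of the modeling questions the lecture examines.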

About this Lecture

Number of Slides:  55
Duration:  60 minutes
Languages Available:  English, Spanish
Last Updated: 

Request this Lecture

To request this particular lecture, please complete this online form.

Request a Tour

To request a tour with this speaker, please complete this online form.

All requests will be sent to ACM headquarters for review.