Getting Machines to Understand Human Emotions

This blog post was originally published at Intel’s website. It is reprinted here with the permission of Intel.

Throughout season two of the Intel on AI podcast, many of the world’s most prominent artificial intelligence (AI) experts have argued that machines can’t simply keep improving performance on large data sets if AI is to help humanity realize its true potential. AI systems will have to evolve to learn from their surroundings and from the actions of people out in the world.

State-of-the-art results are great, and I enjoy a good benchmark-busting result as much as the next person, but what I need, what we all need, is to see better machines enter our daily lives. In several podcast episodes, we’ve heard experts talk about the need for artificial intelligence to cope with the messy complexity of the real world, and thus far, we’ve mostly meant the physical complexity of our environment. In episode eleven of the podcast with host Intel Tech Evangelist Abigail Hing Wen, we explore a different sort of complexity. Rana el Kaliouby, Ph.D., cofounder and CEO of Affectiva and author of Girl Decoded: A Scientist’s Quest to Reclaim Our Humanity by Bringing Emotional Intelligence to Technology, joins the program to talk about the need to bring emotional intelligence (EQ) into AI.

“I believe that at some point in the future, the default human machine interface will be perceptual, will be conversational, will have empathy and it will mirror how humans communicate with one another.”

– Rana el Kaliouby

Academia to Industry

As Rana tells it, she went from studying computer science as an undergraduate in Cairo to spending her time interacting with machines more than humans while working on her Ph.D. at Cambridge. There she had the idea, “What if I could leverage computer vision to build technology that could read my emotions via my facial expressions?” Even before the isolating events of COVID-19, Rana was using technology to communicate with friends and family back in Egypt and found herself missing the wealth of human cues we all pick up on when talking together face-to-face.

After her Ph.D., Rana joined MIT as a post-doc with Professor Rosalind Picard, author of the book Affective Computing. Together, the pair worked on a Google Glass-like device to help autistic children read and understand nonverbal cues by providing the wearer real-time feedback about the people in front of them. In 2009, following industry interest at a demo day, the two launched the company Affectiva to bring EQ into a number of fields.

EQ Applications

Rana sees EQ being introduced into a variety of AI products and fields, from customer service (teaching voice-command systems like Amazon’s Alexa or Apple’s Siri to respond in a more pleasing and accurate fashion) to safety mechanisms in automobiles. While companies like Intel’s Mobileye are working to address the complexity of the world outside the vehicle in order to bring self-driving cars to market by 2025, Affectiva is focusing inward, developing in-cabin sensing solutions that can detect when drivers or passengers are getting drowsy, or summon help when a child has been left in the car. In my own work, I’ve had a few challenges collecting data, but nothing to compare with the challenges Affectiva has overcome. If you work on machine learning projects, listening to Rana talk about the evolution of their data collection strategy is worth your time. Affectiva has been profiled by Harvard Business School in a case study on how to expand into new business verticals, and Rana predicts the type of technology she and her team are developing will be inside vehicles within the next three to five years.

Rana sees applications for EQ in all sorts of systems that interact with people: phones, cars, media platforms, health care and so on. I had initially thought of EQ as a route to making machines less frustrating to interact with, but the potential goes far beyond a better UX. By quantifying a person’s facial and vocal biomarkers, systems can indicate levels of stress, anxiety, and depression. Such insight could allow the development of a more objective approach to mental health. Having seen the lives of close relatives blighted by chronic mental illness, I’m especially keen to see progress in this area. Perhaps the equivalent of indicators like blood sugar levels or resting heart rate can help us improve the timeliness and effectiveness of treatment.

Realizing the promise of EQ is tough, though. EQ systems need to collect broad data sets with sufficient sample sizes from cultures and communities all over the world in order to capture the full range of human emotional cues. If your travels have taken you to both the US and Europe, for example, you’ll probably have had to recalibrate for smiles (big in the US, less so over here). Body language comes with huge cultural variations. Coming from the British Isles, my biggest “culture shock” moment came when, as an intern in Italy, I walked through the office with the CEO, who (this being our first meeting) quite casually draped his arm around my shoulder and kept it there. (We in the UK embrace too, of course, if we are old friends or close family and someone has just been born, married, or buried.) In fact, both body language and facial expression vary enormously from culture to culture, something documented in fascinating detail by Lisa Feldman Barrett in her book How Emotions Are Made: The Secret Life of the Brain.

AI Safety and Misuse

Quality and safety come cheaper and easier if we design them in from the beginning. Rana notes that when she was a Ph.D. student twenty years ago, algorithms were rated on a single accuracy score. For example, if a scientist wanted to test how accurate something like a smile classifier was, they would measure performance using that one metric on a single dataset. One number can win a “state-of-the-art” claim for a research paper, but it is never good enough for a product. Perhaps your test dataset was built in LA, but you’ll also be deploying in Moscow? It is important to know not only how well your model does on average but how well it does in the worst-case scenario, just as we evaluate cars not only on ride quality but on crash-test resilience. Identifying failure modes is not only the right thing to do but also the smart thing to do, because failures are signposts that show us how to build a better product.
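To make that concrete, here is a minimal sketch of what “report the worst case, not just the average” can look like in practice. The classifier, the grouping key (collection site), and the data layout are all illustrative assumptions of mine, not Affectiva’s actual pipeline.

```python
from collections import defaultdict

def evaluate_by_group(predict, samples):
    """Report per-group accuracy, the overall average, and the worst case.

    `predict` is any callable mapping an input to a label (e.g. a
    hypothetical smile classifier); `samples` is an iterable of
    (features, true_label, group) tuples, where `group` might be the
    collection site, lighting condition, or demographic bucket.
    """
    correct, total = defaultdict(int), defaultdict(int)
    for features, label, group in samples:
        total[group] += 1
        if predict(features) == label:
            correct[group] += 1

    per_group = {g: correct[g] / total[g] for g in total}
    overall = sum(correct.values()) / sum(total.values())
    worst = min(per_group, key=per_group.get)

    print(f"overall accuracy: {overall:.3f}")
    for g, acc in sorted(per_group.items()):
        print(f"  {g}: {acc:.3f}")
    print(f"worst case: {worst} at {per_group[worst]:.3f}")
    return per_group
```

Tracking the weakest slice alongside the headline number is what turns a benchmark score into something you can responsibly ship.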

Rana is concerned about the potential for AI to be used to mechanize discrimination and manipulation. Affectiva refuses to work on surveillance, security, or lie-detection applications for this reason. Rana, like previous podcast guests Intel Vice President Sandra Rivera and MIT professor Bernhard Trout, sees AI as an amplifier of human decisions. As Uncle Ben told Peter Parker (aka Spider-Man) in the Marvel comic book: “With great power comes great responsibility.”

Further Reading

Abigail and Rana close out their conversation by talking about writing, especially the idea that reading fiction is a great way to build emotional intelligence. I would like very much to think that this is true! My own early reading skewed heavily towards non-fiction and “hard” sci-fi. (Asimov set my career trajectory before I reached my teens.) But as I reach middle age, I’ve come to appreciate the more subtle pleasures of Chekhov and Somerset Maugham, whose work relies so heavily on the internal lives of their characters. Perhaps someday we’ll see machines evaluated on their emotional intelligence through their literary analyses.

Hear more great conversations with AI leaders at: https://www.intel.com/content/www/us/en/artificial-intelligence/podcast.html

Edward Dixon
Data Scientist, Intel
