
VANESSA SANCHEZ

Sentiments on Emotion AI in Hiring

The University of Texas at Austin // 2023


Can emotion AI objectively deduce behaviors from emotion-tracking data? Users are doubtful.

AI’s challenges with transparency and explainability have become ingrained throughout all stages of the hiring process over the last decade. In this pilot study, we designed a mock interview experiment to quantify the impact of AI-driven facial emotion recognition.

 

We conducted 9 remote mock interviews and analyzed the answers using an open-source Python Facial Expression Recognition (FER) library for sentiment analysis of images and videos. We curated individualized analytics to understand the impact of AI emotion-tracking on video interviews and how such tools can be used for effective mock video interview preparation.

 

We found that while facial recognition adds complexity and stress in interview settings, emotion-tracking outputs can be used to increase self-awareness in behavioral interviews. We hope to empower people who are interviewed by AI and to encourage transparency and helpful feedback loops from AI interview-prep companies.

[Image: EAI report prototype]

"Job applicants who don't fit the benchmark data may experience encoded bias at scale..."

Research Questions
  1. How does emotion-tracking/EAI make participants feel?
  2. What information do participants want to see in their EAI results?
  3. What is the best visualization of emotion-tracking reports?

Objectives for Participants
  • Improve awareness of the increasing prevalence of AI in hiring processes and how AI works in this context
  • Enable them to identify what kind of visual feedback they find most useful
  • Invite them to express their opinions about the use of AI in hiring practices

Impact to Stakeholders
It is not uncommon for job applicants to go through several automated phases of the application process before their materials are ever viewed by a human. Job applicants who don't fit the benchmark data (physically or behaviorally) may experience encoded bias at scale, and they usually lack the agency to alter hiring practices. Hiring companies aiming to streamline their processes and eliminate human bias through these tools may unintentionally be applying discriminatory hiring practices, missing out on qualified candidates, and reducing diversity in their workforce. Creators of EAI-based facial analysis tools may lack diversity in their algorithmic training data, resulting in inaccurate and even harmful products.

Contributing UX and leadership

Role & Skills
I developed the research methodology, managed the project timeline and defined deliverables. I also helped conduct interviews and designed customized interactive reports for each participant. Skills included UX, prototyping, survey design and analysis, remote interviews, project management, presentation design, academic writing, and IRB human subjects research training.

Team
Vanessa Sanchez, HCI & Responsible AI MSc Student: Project Management, UX Research and Design
Kyle Soares, Computer Science MSc Student: Application Research and Development
Dhanny Indrakusuma, Data Science MSc Student: Data Analysis and Data Visualization
Silvia Dalben Furtado, AI in Journalism PhD Candidate: Literature Review and Critical Analysis


Tools
Figma, Google Survey, Google Docs, Google Sheets, Zoom

Timeline
3 Months

1x1 interviews with 9 participants followed by EAI reports

9 Remote Mock 1x1 Interviews on Zoom
  • Team members scheduled Zoom interviews over 1 week
  • Verbal informed consent to record was obtained at start of sessions
  • Interviewers followed a script and asked 3 behavioral questions designed to elicit neutrality (baseline), confidence, and stress:
    1. "Can you tell me about yourself?"​
    2. "Can you tell me about a time you went above and beyond?"
    3. "Can you tell me about a time you overcame a team conflict or challenge?"
  • We ended with 3 post-interview questions:
    1. "(List the 7 emotions) What emotions do you think you displayed the most?"​
    2. "How do you feel about AI analyzing your performance?"
    3. "Have you ever used a mock video interview tool to practice? Which one?"
9 Follow-up Interviews
We conducted follow-up interviews with participants to get their reactions to individualized EAI reports.​

5 out of 9 participants agreed with the EAI analysis while many were surprised and wanted to know more

Video is shown with participant's permission.

"It's not just my expression that matters; what about my voice, my body language, etc?"

Participant Insights
  • Sentiment: Overall positive response to reports
  • Ground Truth: 5/9 participants agreed with the EAI analysis while many were surprised
  • Concerns: Most participants were concerned about the use of EAI analysis in future interviews, especially if it makes the final hiring decision
  • Satisfaction: 7/9 participants would use EAI tool again for interview prep

Impact on participants
Participants expressed a range of attitudes and levels of curiosity about the EAI that was analyzing them:
  • "[The AI] made me curious and made me wonder why it's answering the way it is."
  • "I feel I have to exaggerate facial expressions to convey positive emotions. It's not natural."
  • "I wasn't really thinking about it."
  • "I'd rather start my own business than use AI to become someone I'm not."
  • "It's not just my expression that matters; what about my voice, my body language, etc.?"​

Performance against objectives
Our team met every objective we set for ourselves and went beyond them, leveraging our unique skill sets to achieve something none of us could have done alone. Overall, we collaborated well and hit our milestones.

We recognized some factors that may have impacted our results:
  • Limited access to controlled interview environments; results might be affected by camera angle or lighting
  • Difficult to manufacture realistic behaviors in participants for "fake interview"
  • Participants may have had different reactions based on who was interviewing them; although our team members had scripts to follow and were paired with strangers, some of us deviated from the script, some of us chose to be more personable, and some of us were intentionally flat.
  • The small sample size and broad range of participants prevented us from achieving saturation in this pilot study.

Reflection

Process Insights
  • By beginning with a detailed research plan, I was able to streamline the process, keep the team aligned, and deliver on goals.
  • Align on goals and protocols so everyone is rowing in the same direction and findings are defensible.
  • Reality will force plans to shift, so stay nimble, have some backup plans, and discuss as a team how this impacts the study.
  • Meet people where they're at to enable fruitful discussions, both within the team and with research participants.

Design insights
  • Provide a time bar on the video playback and the AI analysis chart so users can see the correlation
  • Users would like to connect performance data to behavioral insights and suggestions for improvement
  • Users would like to be evaluated holistically, not just by facial expression
  • Practicing with a real person via Zoom vs. practicing alone with an AI interface may yield different results
  • Benchmarking performance against other users interested participants but also caused some discomfort

Future work
  • Emotional analysis as prep tool: Can emotional analysis be used with NLP analysis for better interview preparation?
  • Bias study: Who is most affected by AI emotional analysis in interviews?
  • Connect emotions to performance: Which emotions are best for job offers?

Data visualizations embedded in an interactive report engage participants & prompt questions of benchmark data and interpretation

Ideation and iteration on prototype of an interactive report
We explored a few ideas on what information we could present to participants, the data visualization style, and the report interface. After some team brainstorming, and after seeing the data visualizations we would be using, I designed an interactive prototype of a customized report UI where we could embed the data visualizations and the participant's mock interview clips. To follow along with the session script, I also included text screens for the case study summary (HireVue 2020) and our follow-up questions.
[Image: EAI design decisions]
Design Decisions
  • Provide participants with data visualizations of their individual EAI readings overall and of how those readings compared against the other participants' in aggregate; this offers high-level points of comparison.
  • Show data visualizations in various formats of individual responses beside the video clip of that response for comparison.

  • Simplify the data visualizations from 7 emotions to 3 basic emotions (positive, negative, neutral) to create a visual that is easier to read.

  • Eliminate the CSV output option as the average person would not find value in this format.

  • Provide an interactive report format that participants can click through.

  • Personalize the report by putting the participant's name at the top.

  • Reinforce the AI element by including an illustrated robot avatar in the header, but keep it friendly.
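The decision to collapse the 7 emotions into 3 sentiment buckets can be sketched in a few lines. This is our illustration, not part of the fer library; the bucket assignments (e.g. counting "surprise" as positive) are a judgment call that a real study would need to justify:

```python
# Collapse fer's 7 per-frame emotion scores into 3 sentiment buckets.
# The bucket assignments below are an illustrative choice, not part of fer.
SENTIMENT_BUCKETS = {
    "happy": "positive",
    "surprise": "positive",   # debatable: surprise can also read as negative
    "neutral": "neutral",
    "sad": "negative",
    "fear": "negative",
    "angry": "negative",
    "disgust": "negative",
}

def collapse_scores(frame_scores: dict) -> dict:
    """Sum the 7 emotion scores into positive/neutral/negative totals."""
    totals = {"positive": 0.0, "neutral": 0.0, "negative": 0.0}
    for emotion, score in frame_scores.items():
        totals[SENTIMENT_BUCKETS[emotion]] += score
    return totals

# Example frame in fer's score format (values sum to ~1.0):
frame = {"angry": 0.05, "disgust": 0.01, "fear": 0.04, "happy": 0.55,
         "sad": 0.05, "surprise": 0.10, "neutral": 0.20}
print(collapse_scores(frame))  # the "positive" bucket dominates this frame
```

The three-bucket view trades nuance for readability, which matched what participants needed from the report.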

Multi-phase research that leveraged our team's diverse skill sets, mitigated bias, and prioritized ethics

Literature Review
The literature review was led by Silvia DalBen, PhD candidate. This step was foundational: it helped us understand what prior work had been done so we could fill a gap while building on existing research. Her review found the following:
 
In a study by Langer et al. (2016), virtual interview training reduced candidates' anxiety, improved their non-verbal behavior, and increased their chances of receiving a job offer. Meanwhile, Harwell (2019) emphasizes that HireVue could penalize non-native speakers, visibly nervous interviewees, and those who do not fit the expected model of look and speech. According to Suen, Chen & Lu (2019) and Langer et al. (2017), candidates are less in favor of asynchronous interviews because they must watch themselves answering questions during non-human interactions. Li et al. (2021) reveal that recruiters generally do not perceive AI-enabled software as a threat, but as another tool that can simplify the search process, which can be an advantage in a highly competitive hiring space.
[Image: mock interviews]
Technology Research: Facial Expression Recognition (FER)
Kyle Soares led technology research and development. He leveraged an open-source Python library used for sentiment analysis of images and videos (Source: https://pypi.org/project/fer/):
  • Multi-task cascaded convolutional neural network (MTCNN) model for face detection
  • Dataset: FER-2013 (Pierre-Luc Carrier and Aaron Courville), ICML 2013 Challenges in Representation Learning
  • Analyzes videos frame by frame for 7 emotions: Neutral, Happy, Sad, Fear, Surprise, Anger, Disgust

Before running the mock interview sessions with participants, we conducted a test analysis on video samples of ourselves to make sure the EAI software worked correctly and to collect sample data to plan our approach to the data visualizations.
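Per-frame analysis produces one set of 7 emotion scores per video frame; reducing that stream to a single dominant emotion per answer is a small aggregation step. The frames below and the helper function are illustrative stand-ins, not fer's own API:

```python
from collections import Counter

# Illustrative per-frame scores in fer's 7-emotion format; a real run
# would produce one such dict per analyzed video frame.
frames = [
    {"angry": 0.1, "disgust": 0.0, "fear": 0.1, "happy": 0.2,
     "sad": 0.1, "surprise": 0.1, "neutral": 0.4},
    {"angry": 0.0, "disgust": 0.0, "fear": 0.1, "happy": 0.5,
     "sad": 0.1, "surprise": 0.1, "neutral": 0.2},
    {"angry": 0.0, "disgust": 0.0, "fear": 0.0, "happy": 0.6,
     "sad": 0.0, "surprise": 0.1, "neutral": 0.3},
]

def dominant_emotion(frames):
    """Take each frame's top-scoring emotion, then return the most frequent."""
    tops = Counter(max(f, key=f.get) for f in frames)
    return tops.most_common(1)[0][0]

print(dominant_emotion(frames))  # prints "happy"
```

Counting each frame's winner (rather than averaging raw scores) is one of several reasonable reductions; averaging would weight borderline frames differently.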
[Image: risks to participants]
Research Ethics

I led research ethics. Three of our four team members completed the Institutional Review Board (IRB) training for social and behavioral research on human subjects, which covers codes of ethics, federal regulations, informed consent, privacy, and confidentiality. Based on those standards, we identified possible risks to our participants during the planning stage:

  • Sharing of personally identifiable information

  • Video recording of participants

  • Sharing of personal experiences / feeling vulnerable

  • Intentionally causing participants to experience anxiety during interview

  • Possible feelings of anxiety during AI feedback portion

[Image: EAI data cleaning]
Screening for Recruitment​
We created a screener survey with questions designed to obtain a relevant and equitable sampling to meet our participant quota. We had 12 questions covering interest/availability, demographic information, job seeking status and familiarity with AI technology. Out of 32 respondents, 5 did not pass the screener, leaving us with 27 viable respondents.
Data Cleaning and Sample Selections
  • We reconciled responses for one survey question that changed after the survey was released.
  • We added a column to summarize "two or more race/ethnicity" (Yes/No)
  • We prioritized sampling by 4 attributes:
    • Age (because we had very few people age 35 and older)
    • Skin tone (because this was a major point of interest in the study)
    • Gender identification (because we wanted an even distribution)
    • Familiarity with AI (because we had a good range)
  • We shortlisted 12 participants with 4 alternates
  • 1 backed out and 2 did not respond
  • 9 participants in total were interviewed

We asked 3 behavioral questions designed to elicit neutrality (baseline), confidence and stress

[Images: data visualization slides]
Data Pre-processing and Visualization of Results
Dhanny Indrakusuma led data pre-processing and visualization. The following summarizes her process: 
  • Analyzed the output data from the interviews (CSV format with timestamps and emotion analysis output)
  • Merged model's output data with information from screener survey
  • Generated 120 unique charts in total
  • Generated insights by examining how each participant's emotions differed across questions and across demographic comparisons
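The merge of the model's timestamped CSV output with the screener survey amounts to a left join on a shared participant id. A minimal stdlib sketch, with hypothetical field names since the team's actual schema isn't shown:

```python
import csv
import io

# Hypothetical per-frame model output: participant id, timestamp, top emotion.
emotion_csv = """participant,timestamp,emotion
p01,0.0,neutral
p01,0.5,happy
p01,1.0,happy
"""

# Hypothetical screener survey export keyed by the same participant id.
screener_csv = """participant,age_range,ai_familiarity
p01,25-34,high
"""

# Index the screener rows by participant id for O(1) lookups.
screener = {row["participant"]: row
            for row in csv.DictReader(io.StringIO(screener_csv))}

# Left-join each emotion row with its participant's screener answers.
merged = []
for row in csv.DictReader(io.StringIO(emotion_csv)):
    row.update(screener[row["participant"]])
    merged.append(row)

print(merged[1]["emotion"], merged[1]["ai_familiarity"])  # prints "happy high"
```

With the demographics attached to every timestamped reading, charts can be sliced per question and per demographic group, as the visualizations above were.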

© Vanessa Sanchez 2024   |   Made with 💜 + ☕ + WIX
