Lab 07: Qualitative Data Analysis

Learning Goals

Learn how to collaboratively perform a thematic analysis on interview transcripts.
Understand and implement steps for open/axial coding, codebook development, and inter-coder reliability testing.
Gain practical experience using coding as a tool for identifying patterns in human-robot interaction (HRI) research data.

Working in Groups

For this lab, you will work in groups of ~3 students. Each group will turn in ONE set of deliverables.

Lab 7 Deliverables & Submission

Lab 7 introduces the basics of thematic analysis, one kind of qualitative data analysis, following a process adapted from Richards & Hemphill (2018). You will work in groups to analyze anonymized interview transcripts (lab_07_interview_data from this Google Drive link), identify recurring themes, develop and refine a shared codebook, and apply that codebook to additional data. You will also calculate inter-coder reliability and write up your results as if you were writing a research paper.

You will be asked to submit:

The Lab 7 Qualitative Data Analysis Worksheet - the main worksheet for this lab where you will report:
- Your codebook
- Your data analysis approach + inter-rater reliability score
- A writeup of the results of your thematic analysis, similar to what we might see in the "Results" section of an HRI research paper
A Spreadsheet of Thematic Analysis/Coding on Each Anonymized Interview Transcript – include the themes coded for each transcript

To receive credit for this lab, one member of your group will need to submit your completed Lab 7 Qualitative Data Analysis Worksheet to Canvas by Friday, May 9 at 6:00pm.

Lab 7 HRI Study, Data, and Your Goal for this Lab

During this lab, you'll be analyzing interview data from the same HRI study we examined during Lab 6. In case it's helpful, here's the study overview again, so you can remind yourself about the study hypotheses, methods, and measures.

To access the Lab 7 data file (lab_07_interview_data), download it using this Google Drive link.

Your goal for this lab is to conduct a thematic analysis on the interview transcripts and report your data analysis methods and findings. The outcome of this lab will be a written report that resembles the qualitative "Results" sections of the papers we've read in class.

Steps for Thematic Analysis

Phase One: Preparing for the Analysis

This phase involves understanding the context and goals of your analysis. Your goals for this analysis include:

Collecting evidence for or against the experiment hypotheses (for review, you can look over the study overview again)
Understanding participants' perceptions of the overall experience
Understanding participants' perceptions of the robot

You are not required to pursue all three of these goals; they are meant to serve as starting points and guidance for your analysis.

Phase Two: Open and Axial Coding

Step 1: Each team member reads the first 6 transcripts (representing one pair of participants from each condition) and identifies initial themes and subthemes. Some example themes and subthemes could be:
- Example theme: overall opinion of the robot
  - Example subtheme: positive
  - Example subtheme: negative
  - Example subtheme: neutral
- Example theme: perspective on collaborating with the other human participant
  - Example subtheme: other participant was too dominant
  - Example subtheme: equal and enjoyable collaboration
  - Example subtheme: other participant was disengaged
Step 2: As a team, discuss and refine your themes iteratively. For the purposes of this lab, please select 2 themes to code, each of which can have 2-4 subthemes. Your group should review at least 12 transcripts (2 pairs of participants from each of the 3 conditions: positive-positive - PP, negative-negative - NN, positive-negative - PN) to refine your themes. Richards & Hemphill (2018) recommend reviewing around 30% of the data during this phase for a full-fledged thematic analysis.

Phase Three: Preliminary Codebook

Step 1: Create a preliminary codebook based on the discussion above (put your codebook in your Lab 7 Qualitative Data Analysis Worksheet).
Step 2: The team reviews the draft together. [Optional] You may invite an external researcher familiar with the study (but not part of the coding process) to review it as well.
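
While drafting, it can help to keep the codebook as structured data before copying it into the worksheet. The sketch below is one possible layout, not a required format: the field names (definition, include_when) and the helper check_codebook are assumptions made for illustration, and the definitions are placeholders. It reuses the example themes from Phase Two and checks this lab's limit of 2 themes with 2-4 subthemes each.

```python
from dataclasses import dataclass, field


@dataclass
class Subtheme:
    name: str
    definition: str          # what the code means (placeholder text below)
    include_when: str = ""   # hypothetical field: criteria for applying the code


@dataclass
class Theme:
    name: str
    subthemes: list = field(default_factory=list)


def check_codebook(themes, n_themes=2, min_sub=2, max_sub=4):
    """Return a list of problems; an empty list means the lab's size limits are met."""
    problems = []
    if len(themes) != n_themes:
        problems.append(f"expected {n_themes} themes, found {len(themes)}")
    for theme in themes:
        n = len(theme.subthemes)
        if not min_sub <= n <= max_sub:
            problems.append(f"theme '{theme.name}' has {n} subthemes")
    return problems


# Themes and subthemes come from the Phase Two examples; definitions are placeholders.
codebook = [
    Theme("overall opinion of the robot", [
        Subtheme("positive", "participant describes the robot favorably"),
        Subtheme("negative", "participant describes the robot unfavorably"),
        Subtheme("neutral", "participant expresses no clear opinion"),
    ]),
    Theme("perspective on collaborating with the other human participant", [
        Subtheme("other participant was too dominant", "placeholder definition"),
        Subtheme("equal and enjoyable collaboration", "placeholder definition"),
        Subtheme("other participant was disengaged", "placeholder definition"),
    ]),
]

print(check_codebook(codebook))
```

Writing the inclusion criteria next to each definition makes it easier to spot vague codes when the team reviews the draft.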

Phase Four: Pilot Testing the Codebook

Step 1: All team members independently code the same 2–3 new transcripts using the codebook (e.g., in separate tabs of a Google spreadsheet).
Step 2: Discuss discrepancies and revise the codebook accordingly until your team is confident in it.
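
To prepare for the discrepancy discussion, you could list the transcripts on which coders disagree. This is a minimal sketch that assumes each coder assigns one subtheme per transcript for a given theme; the coder names, transcript IDs, and codes are made up for illustration, and find_discrepancies is a hypothetical helper rather than part of the lab materials.

```python
def find_discrepancies(codes_by_coder):
    """codes_by_coder maps coder name -> {transcript_id: code}.

    Returns {transcript_id: {coder: code}} for every transcript where the
    coders did not all assign the same code. A missing code counts as a
    disagreement (it appears as None).
    """
    transcript_ids = set()
    for codes in codes_by_coder.values():
        transcript_ids.update(codes)

    discrepancies = {}
    for tid in sorted(transcript_ids):
        assigned = {coder: codes.get(tid) for coder, codes in codes_by_coder.items()}
        if len(set(assigned.values())) > 1:
            discrepancies[tid] = assigned
    return discrepancies


# Made-up pilot codes for the theme "overall opinion of the robot".
pilot = {
    "coder_a": {"T13": "positive", "T14": "negative", "T15": "neutral"},
    "coder_b": {"T13": "positive", "T14": "neutral", "T15": "neutral"},
    "coder_c": {"T13": "positive", "T14": "negative"},  # coder_c has not coded T15
}

for tid, assigned in find_discrepancies(pilot).items():
    print(tid, assigned)
```

Each flagged transcript is a prompt to ask whether the codebook definition was ambiguous or whether a coder applied it inconsistently.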

Phase Five: Final Coding and Inter-Coder Reliability

Step 1: Perform full coding on the dataset using either consensus coding (every team member codes every transcript, and the team discusses disagreements until they agree) or split coding. For HRI research, we typically use split coding: all team members first code an overlap set (typically ~10% of the data). Once sufficient inter-rater reliability is achieved on the overlap set (see Step 2), the team members split up the remaining transcripts so that each one is coded by only one team member.
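
The split-coding procedure can be sketched as a small planning script. This is only an illustration under stated assumptions: the 30 transcript IDs are placeholders (substitute the real file names), split_coding_plan is a hypothetical helper, and dealing the remaining transcripts out round-robin is one reasonable choice, not a requirement of the method.

```python
import random


def split_coding_plan(transcript_ids, coders, overlap_fraction=0.10, seed=0):
    """Choose an overlap set that everyone codes, then split the rest among coders."""
    ids = list(transcript_ids)
    random.Random(seed).shuffle(ids)  # fixed seed so the whole team gets the same plan

    # At least one transcript must be shared, otherwise reliability cannot be computed.
    n_overlap = max(1, round(overlap_fraction * len(ids)))
    overlap, rest = ids[:n_overlap], ids[n_overlap:]

    plan = {coder: list(overlap) for coder in coders}
    for i, tid in enumerate(rest):
        plan[coders[i % len(coders)]].append(tid)  # round-robin assignment
    return overlap, plan


# Placeholder IDs and coder names, for illustration only.
transcripts = [f"T{i:02d}" for i in range(1, 31)]
overlap, plan = split_coding_plan(transcripts, ["coder_a", "coder_b", "coder_c"])

print("overlap set:", sorted(overlap))
for coder, assigned in plan.items():
    print(coder, "codes", len(assigned), "transcripts")
```

Picking the overlap set at random, rather than taking the first few transcripts, helps avoid computing reliability only on one experimental condition.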
Step 2: Calculate inter-coder reliability on the transcripts coded by all team members, using the appropriate metric below:

Choosing an Inter-Coder Reliability Metric:

2 coders, mutually exclusive themes: Use Cohen’s Kappa
3+ coders, mutually exclusive themes: Use Krippendorff’s Alpha or Fleiss’ Kappa
2 coders, non-mutually exclusive themes: Use Cohen’s Kappa per theme (binary coding)
3+ coders, non-mutually exclusive themes: Use Krippendorff’s Alpha (nominal)
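
To see what Cohen's Kappa measures, here is a minimal pure-Python sketch of the calculation for two coders. It compares the observed agreement with the agreement expected by chance given each coder's label frequencies. The labels are made up, and the provided cohen_kappa.py may be implemented differently (scikit-learn offers sklearn.metrics.cohen_kappa_score for the same statistic).

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who each assign one label per item."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both coders.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the two coders' label proportions, summed over labels.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if p_expected == 1.0:
        return float("nan")  # both coders used one identical label; kappa is undefined
    return (p_observed - p_expected) / (1 - p_expected)


# Mutually exclusive subthemes (made-up codes for six transcripts).
coder_a = ["positive", "positive", "negative", "neutral", "positive", "negative"]
coder_b = ["positive", "negative", "negative", "neutral", "positive", "positive"]
print(round(cohen_kappa(coder_a, coder_b), 3))

# Non-mutually exclusive themes: compute kappa per theme on binary
# present (1) / absent (0) codes, one list per coder.
theme_present_a = [1, 0, 1, 1, 0, 0]
theme_present_b = [1, 0, 0, 1, 0, 0]
print(round(cohen_kappa(theme_present_a, theme_present_b), 3))
```

In the first example the coders agree on 4 of 6 transcripts, but because chance agreement is about 0.39, kappa is noticeably lower than the raw 67% agreement.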

You can download cohen_kappa.py, krippendorff_alpha.py, and fleiss_kappa.py from the Lab 7 GitHub repository. You may need to install dependencies via pip install scikit-learn statsmodels krippendorff.
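
For groups of three or more coders, the idea behind Fleiss' Kappa can be sketched in pure Python as follows. This is an illustration of the standard formula with made-up labels, assuming every transcript is coded by the same number of coders; the provided fleiss_kappa.py may be implemented differently.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """ratings: list of items, each a list holding one label per coder."""
    n_items = len(ratings)
    n_raters = len(ratings[0]) if ratings else 0
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs labels from the same 2 or more coders")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of coder pairs that agree on this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )

    p_observed = agreement_sum / n_items
    # Chance agreement from the overall proportion of each category.
    p_expected = sum((t / (n_items * n_raters)) ** 2 for t in category_totals.values())
    if p_expected == 1.0:
        return float("nan")  # only one category was ever used; kappa is undefined
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up codes: four transcripts, three coders each.
ratings = [
    ["positive", "positive", "positive"],
    ["negative", "negative", "neutral"],
    ["neutral", "neutral", "neutral"],
    ["positive", "negative", "positive"],
]
print(round(fleiss_kappa(ratings), 3))
```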

Phase Six: Interpreting and Writing Results

This phase involves drawing conclusions based on your thematic analysis and writing up your results. Rather than prescribing how to write these sections, we recommend that you review good examples of thematic analyses from published HRI papers and emulate the best of what you see from them:

Carsenti et al. (2025) - Section VF "Qualitative: Semi-structured interviews"
Nanavati et al. (2023) - Section 6 "Interview Results"
Hu et al. (2025) - Section VB "Place-Attachment Experience"
Shen et al. (2025) - Section IV "Results"

Tips & Resources

Use color coding or margin notes while reviewing transcripts to help identify themes.
When building the codebook, be specific about the criteria for each theme.
Consider using a spreadsheet to compare coded transcripts side-by-side.
Python scripts to calculate inter-rater reliability are available in the Lab 7 GitHub repository (see Phase Five).
Refer to the original article for examples and guidance: Richards & Hemphill (2018).

Extra Challenge

Create a visualization that compares how themes shift across the different experimental conditions.
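
One possible starting point is to tabulate the share of each subtheme within each condition and then plot it. The sketch below uses only the standard library and prints a rough text bar chart; the rows are made up for illustration (they are not study results), and the helper names are hypothetical. You could pass the same shares to a plotting library of your choice.

```python
from collections import Counter, defaultdict


def subtheme_share_by_condition(coded_rows):
    """coded_rows: iterable of (condition, subtheme) pairs, one per coded transcript.

    Returns {condition: {subtheme: fraction of that condition's transcripts}}.
    """
    counts = defaultdict(Counter)
    for condition, subtheme in coded_rows:
        counts[condition][subtheme] += 1

    shares = {}
    for condition, counter in counts.items():
        total = sum(counter.values())
        shares[condition] = {s: c / total for s, c in counter.items()}
    return shares


def text_bars(shares, width=20):
    """Render the shares as a simple text bar chart, one block per condition."""
    lines = []
    for condition in sorted(shares):
        lines.append(condition)
        for subtheme, share in sorted(shares[condition].items()):
            lines.append(f"  {subtheme:<10}{'#' * round(share * width)} {share:.0%}")
    return "\n".join(lines)


# Made-up codes for the theme "overall opinion of the robot" (PP, NN, PN conditions).
rows = (
    [("PP", "positive")] * 3 + [("PP", "neutral")]
    + [("NN", "negative")] * 2 + [("NN", "positive")] * 2
    + [("PN", "positive"), ("PN", "negative"), ("PN", "neutral"), ("PN", "neutral")]
)
shares = subtheme_share_by_condition(rows)
print(text_bars(shares))
```

Using within-condition shares rather than raw counts keeps the comparison fair when conditions have different numbers of transcripts.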