What is thematic analysis?

Thematic analysis is a method for identifying, analysing, and reporting patterns of meaning across qualitative data. Applied to interview transcripts, it is the process of reading what participants said, identifying recurring ideas, organising those ideas into themes, and using those themes to answer the research question. It is one of the most widely used approaches to qualitative analysis and the starting point for most researchers working with interview data.


Why thematic analysis is the default method

Qualitative analysis methods exist on a spectrum from highly structured (content analysis, which counts occurrences of predefined categories) to highly interpretive (discourse analysis, which examines how language constructs meaning and identity). Thematic analysis sits in the middle: structured enough to be systematic and reproducible, flexible enough to work across different research questions and theoretical frameworks.

That flexibility is why it dominates applied research. A UX researcher analysing onboarding interviews, a research agency synthesising customer journey data, and an academic studying lived experience can all use thematic analysis. The core process is the same. The depth of interpretation varies based on the research question and the analyst's theoretical stance.

For most applied product and UX research, thematic analysis is the appropriate choice. It produces findings that are grounded in the data, comprehensible to non-researcher stakeholders, and actionable enough to inform design and product decisions.


The core process

Thematic analysis is not a single step. It is a sequence of moves through the data, each building on the previous one.

Familiarisation. Read the transcripts. All of them. Not to find themes yet. Read to understand what's in the data before you start organising it. Researchers who skip this step and go straight to coding produce shallower analysis because they haven't let the data shape their thinking before they start imposing structure on it.

Initial coding. Go through the transcripts and label meaningful segments. A code is a short phrase that captures what a particular segment is about. One segment can receive multiple codes. At this stage, generate codes freely. You are not deciding which ones matter yet. You're creating a vocabulary for the data.

Searching for themes. Group related codes together. A theme is a pattern of meaning that captures something important about the data in relation to the research question. Not every code becomes part of a theme. Some codes are interesting but peripheral. Some belong together in ways that weren't obvious during initial coding.

Reviewing themes. Test the themes against the data. Does the theme hold up when you read all the coded segments it contains? Does it tell a coherent story? Are there segments coded elsewhere that actually belong here? This is the phase where themes get split, merged, renamed, or discarded.

Defining and naming themes. Write a clear definition for each theme: what it captures, how it relates to the research question, and how it is distinct from other themes. The name should describe the theme's essence, not just label a topic. "Uncertainty at verification" is a better theme name than "verification step."

Writing up. Present the themes with evidence: quotes from participants, patterns across the data, and the analytical interpretation that connects the evidence to the research question. The write-up is where analysis becomes insight.
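The coding and theme-searching steps above can be pictured as a small data structure. What follows is a minimal sketch, not a prescribed tool: the Segment class, the participant IDs, the quotes, and the code-to-theme mapping are all hypothetical and invented for illustration. The grouping itself remains an analytic judgment; the code only records it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    participant: str  # hypothetical participant ID
    text: str         # the quoted transcript segment
    codes: tuple      # one segment can carry several codes


# Hypothetical coded segments, invented for illustration only.
segments = [
    Segment("P01", "I wasn't sure what would happen if I uploaded the wrong file.",
            ("fear of mistakes", "unclear next step")),
    Segment("P02", "Who actually sees these documents?",
            ("privacy concern",)),
    Segment("P03", "I thought it would take two minutes, not twenty.",
            ("effort surprise",)),
]

# Searching for themes: the analyst assigns related codes to a candidate theme.
# This mapping is the analyst's decision, not something the code derives.
theme_for_code = {
    "fear of mistakes": "unclear consequences of error",
    "unclear next step": "unclear consequences of error",
    "privacy concern": "distrust of document handling",
    "effort surprise": "mismatch between expected and actual effort",
}


def segments_by_theme(segments, theme_for_code):
    """Collect the segments behind each theme, skipping unassigned codes."""
    grouped = defaultdict(list)
    for seg in segments:
        # A segment counts once per theme even if several of its codes map there.
        themes = sorted({theme_for_code[c] for c in seg.codes if c in theme_for_code})
        for theme in themes:
            grouped[theme].append(seg)
    return dict(grouped)


by_theme = segments_by_theme(segments, theme_for_code)
for theme, segs in by_theme.items():
    print(theme, "->", [s.participant for s in segs])
```

Keeping the code-to-theme mapping as explicit data, separate from the segments, also gives you a record of grouping decisions that can be revised during the review phase.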


What makes thematic analysis rigorous

Thematic analysis is sometimes criticised for being subjective: two analysts reading the same transcripts might produce different themes. That's true, but subjectivity is not the same as unreliability. Two sets of themes produced by different analysts from the same data should both be defensible, even if they're not identical.

Rigour in thematic analysis comes from three practices.

Grounding themes in evidence. Every theme should be supported by multiple quotes from multiple participants. A theme that rests on one striking quote from one participant is not a theme. It is an interesting data point.

Maintaining an audit trail. Keep your initial codes and the decisions you made about grouping them. This allows someone else to follow your analytical process and evaluate whether your themes are well-founded.

Staying close to the research question. Themes should answer the research question, not just describe interesting things participants said. Analysis that produces themes unrelated to the original question has drifted. The question is the anchor.
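The first practice, grounding themes in evidence, lends itself to a simple mechanical check. The sketch below flags themes whose quotes all come from too few participants. The function names, the sample quotes, and the default threshold of two participants are assumptions made for illustration; the right threshold depends on the study.

```python
def participant_support(quotes_by_theme):
    """Map each theme to the set of distinct participants quoted under it."""
    return {
        theme: {participant for participant, _quote in quotes}
        for theme, quotes in quotes_by_theme.items()
    }


def weakly_grounded(quotes_by_theme, min_participants=2):
    """Return themes supported by fewer than min_participants distinct people."""
    support = participant_support(quotes_by_theme)
    return sorted(
        theme for theme, people in support.items()
        if len(people) < min_participants
    )


# Hypothetical evidence table: (participant ID, quote) pairs per theme.
quotes_by_theme = {
    "distrust of document handling": [
        ("P02", "Who actually sees these documents?"),
        ("P07", "I don't love sending my ID to a company I just met."),
    ],
    "frustration with pricing": [
        ("P05", "The price jumped out of nowhere."),
        ("P05", "Honestly, the pricing page made me angry."),
    ],
}

flagged = weakly_grounded(quotes_by_theme)
print(flagged)  # ['frustration with pricing']
```

The second theme has two quotes but only one participant behind them, which is the situation described above: an interesting data point rather than a theme. A check like this supports the analyst's judgment; it does not replace it.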


Semantic versus latent themes

Thematic analysis can operate at two levels, and understanding the difference matters for how you approach analysis.

Semantic themes describe what participants explicitly said. If multiple participants described feeling uncertain during document upload, a semantic theme captures that: "uncertainty at document upload." The theme describes the surface content of what was said.

Latent themes describe what the data implies beneath the surface. If participants describe uncertainty at document upload, but the underlying pattern across the data is that participants don't trust that the company will handle their documents securely, the latent theme is "distrust of data handling." That's a different finding and a more actionable one.

Applied research typically works at the semantic level with moves toward the latent when the data supports it. Academic research tends to operate more consistently at the latent level. The right level depends on what the research question is actually asking.


Common mistakes

Treating themes as topic summaries. A theme is not "participants talked about onboarding." That is a topic. A theme captures a pattern of meaning: what participants said about onboarding, what it means, and how it relates to the research question.

Too many themes. A thematic analysis that produces twelve themes has probably produced a list of topics rather than a set of meaningful patterns. Most well-scoped studies produce three to six themes. If you have more, look for higher-order patterns that organise the smaller ones.

Themes that overlap significantly. If two themes contain much of the same coded material, they may be sub-themes of a single larger theme, or one may be a manifestation of the other. Distinct themes should capture meaningfully different patterns.

Forcing the data to fit predetermined categories. If you approached analysis with a hypothesis about what you'd find and organised your themes around confirming it, you have conducted confirmatory analysis, not thematic analysis. Themes should emerge from the data, not be imposed on it.
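One of these mistakes, significantly overlapping themes, can be spotted by comparing the coded segments each theme contains. The sketch below uses Jaccard similarity (shared segments divided by all segments in either theme) as one possible overlap measure. The segment IDs and the 0.5 threshold are hypothetical, and a high score is a prompt to re-read the material, not an automatic instruction to merge.

```python
from itertools import combinations


def jaccard(a, b):
    """Shared items divided by all items in either set (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def overlapping_pairs(segment_ids_by_theme, threshold=0.5):
    """List theme pairs whose coded segments overlap at or above the threshold."""
    pairs = []
    for first, second in combinations(sorted(segment_ids_by_theme), 2):
        score = jaccard(segment_ids_by_theme[first], segment_ids_by_theme[second])
        if score >= threshold:
            pairs.append((first, second, round(score, 2)))
    return pairs


# Hypothetical segment IDs assigned to each candidate theme.
segment_ids_by_theme = {
    "fear of making a mistake": {1, 2, 3, 4},
    "uncertainty about consequences": {2, 3, 4, 5},
    "distrust of document handling": {6, 7, 8},
}

candidates = overlapping_pairs(segment_ids_by_theme)
print(candidates)
# [('fear of making a mistake', 'uncertainty about consequences', 0.6)]
```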


What this looks like in practice

A research team has completed 14 interviews with users who abandoned a B2B onboarding flow at the document upload step. They have 14 transcripts and need to produce findings for a design review in four days.

The lead researcher reads all 14 transcripts in a single session, making notes on recurring ideas without organising them yet. In the initial coding pass, 47 distinct codes emerge across the transcripts. Grouping those codes produces four candidate themes.

In the review phase, two of the four themes are found to overlap significantly: "fear of making a mistake" and "uncertainty about consequences" are both expressions of the same underlying pattern. They are merged into a single theme: "unclear consequences of error." Three distinct themes remain, each supported by quotes from at least eight of the fourteen participants.

The final themes: "unclear consequences of error," "distrust of document handling," and "mismatch between expected and actual effort." Each theme is defined, named, and supported by representative quotes. The design team has three specific findings to act on, each grounded in what participants actually said.


Frequently asked questions

How is thematic analysis different from content analysis?

Content analysis counts how often predefined categories appear in the data. It starts with categories and measures their frequency. Thematic analysis develops categories from the data itself and focuses on patterns of meaning rather than frequency of occurrence. Content analysis is more appropriate when you need quantifiable results from text data. Thematic analysis is more appropriate when you need to understand what the data means.
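To make the contrast concrete, here is a minimal sketch of the counting that content analysis starts from. The categories, keyword lists, and transcript snippets are hypothetical, and real content analysis uses a more carefully defined coding scheme than keyword matching. The point is only that the categories exist before the data is read and the output is a frequency.

```python
import re
from collections import Counter

# Predefined categories and keywords (hypothetical, fixed before reading the data).
categories = {
    "security": ["secure", "privacy", "trust"],
    "effort": ["time", "slow", "steps"],
}


def count_categories(transcripts, categories):
    """Count keyword occurrences per predefined category across transcripts."""
    counts = Counter({name: 0 for name in categories})
    for text in transcripts:
        words = re.findall(r"[a-z']+", text.lower())
        for name, keywords in categories.items():
            counts[name] += sum(words.count(keyword) for keyword in keywords)
    return counts


# Hypothetical transcript snippets.
transcripts = [
    "I didn't trust the upload. Is it secure?",
    "Too many steps, and it took a long time.",
]

counts = count_categories(transcripts, categories)
print(dict(counts))  # {'security': 2, 'effort': 2}
```

The counts say how often each category was mentioned, but not what participants meant by it. That interpretive step is what thematic analysis adds.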

How many themes should a thematic analysis produce?

Most well-scoped studies produce between three and six themes. Fewer than three may indicate the analysis hasn't gone deep enough or the research question was too narrow. More than six often indicates the analysis has remained at the topic level rather than identifying higher-order patterns of meaning.

Can thematic analysis be done by someone who didn't conduct the interviews?

Yes, and in some research contexts it's preferable because the analyst approaches the data without the impressions formed during fieldwork. The risk is that nuances of tone, hesitation, and context that were visible in the live session are not always captured in a transcript. Ideally the analyst has either conducted the sessions or reviewed notes from the person who did.

How do you handle contradictory data in thematic analysis?

Contradictions are data. If some participants describe an experience positively and others describe it negatively, that divergence is meaningful and should be represented in the analysis, not resolved by majority vote. A theme can capture a pattern that is present in some participants and absent in others, as long as the analysis is transparent about that variation.

What software is best for thematic analysis?

Simple analyses can be done in a spreadsheet or a word processor. Dedicated qualitative analysis tools like NVivo, ATLAS.ti, and Dovetail offer more structure for large datasets. The right tool depends on the volume of data, team size, and how the findings need to be shared. The software doesn't determine the quality of the analysis. The analytical rigour does.

How does AI change thematic analysis?

AI tools can surface initial patterns and candidate themes from large transcript sets faster than manual reading. They work best as a starting point for human analysis rather than a replacement for it. The human analyst's job shifts from reading every transcript line by line to evaluating AI-surfaced patterns, stress-testing them against the data, and making the interpretive judgments about what the patterns mean. The rigour requirements don't change. The starting point does.


Last updated: 2026-04-21