When it comes to qualitative research, everyone always talks about coding, codes, categories, and themes. But what does this actually mean? This article will give you an introduction to coding qualitative data for categories and themes and help you understand these terms.
I will show you:
- the basic principle of coding qualitative data (Part 1)
- what a code or category is (Part 2)
- how to derive your first codes from qualitative data (Part 3)
- the different types of coding that exist (Part 4)
After reading this article, you will be well-prepared for your own qualitative research project. You can directly dive into qualitative content analysis, grounded theory, or any other qualitative method.
#1 What is Coding?
Coding refers to the process of assigning conceptual labels to data (Urquhart, 2013). It primarily applies to qualitative data, such as text, images, videos, or audio.
For the sake of simplicity, let’s assume in this article that your data was collected through interviews and now exists in the form of interview transcripts.
When you assign a specific label to a specific set of data, you begin to analyze that data. A set of data could be, for example, a response to an interview question or even a single line in your transcript. Let’s continue with the introduction to coding qualitative data for categories and themes.
Why is Coding Useful?
Coding can help you summarize larger amounts of data. For example, imagine you conducted 20 interviews, resulting in a total of 200 pages of transcripts.
How can you compress the content of this data into a single results section of a scientific paper? That’s right – by summarizing it. In this case, various coding techniques offer a tool to do this systematically and comprehensibly.
However, coding not only allows you to summarize data but also to structure it. Structure can mean that your data is assigned to specific categories. These categories give meaning to the data. You’ll learn more about this in the second part of the tutorial.
Coding for Theory Building in Qualitative Research
New Constructs
However, coding in qualitative research can go much further than summarizing and structuring. Coding is actually one of the most important tools for developing new theories.
This brings us to the realm of Grounded Theory. You can find more information on that in other tutorials of mine. In addition to watching tutorials on Grounded Theory, I recommend watching the video on “What is a Theory?”
In summary, Grounded Theory is a methodology, not just a method. This means that you can combine different coding techniques to reach your goal: new theory.
This new theory consists of constructs and their relationships to each other. The necessary intermediate step to reach these components from qualitative data is coding.
The specific coding techniques employed and how they are combined depend on the recommendations or authors chosen for your own Grounded Theory study.
New Relationships
In the further course of a Grounded Theory study, determining the relationships between constructs that came out of your initial codes becomes important.
Here, some coding techniques are not limited to transforming codes into concepts and constructs but also explain a relationship or process. You can find more information on this in other tutorials, for example on what is referred to as axial coding.
#2 What is a Code or Category?
Simply put, a code is just a label we assign to a specific part of our data. A category is like a bucket in which you collect the codes that fit together.
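If you keep track of your codes in a script rather than in dedicated analysis software, the bucket idea can be sketched in a few lines. This is only an illustration under my own assumptions: the class names `Code` and `Category` and the category name "Decision Process" are hypothetical, and the two code labels are borrowed from the interview example in Part 3.

```python
from dataclasses import dataclass, field


@dataclass
class Code:
    """A label assigned to one specific part of the data."""
    label: str
    segment: str  # the piece of transcript the label was assigned to


@dataclass
class Category:
    """A bucket that collects the codes that fit together."""
    name: str
    codes: list[Code] = field(default_factory=list)

    def add(self, code: Code) -> None:
        self.codes.append(code)


# Hypothetical category name, chosen only for illustration.
decision_process = Category(name="Decision Process")
decision_process.add(
    Code(
        label="Forming an Interest Group",
        segment="Three people who are specifically interested in it make a proposal",
    )
)
decision_process.add(
    Code(
        label="Collective Satisfaction",
        segment="And in the end, there was a draft that everyone liked, and that was it.",
    )
)

# List the labels currently sitting in the bucket.
labels = [code.label for code in decision_process.codes]
print(decision_process.name, labels)
```

Keeping the original segment next to each label makes it easy to go back to the data later and check whether a code still fits its category.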
In general, there is a distinction between descriptive and analytical codes.
Descriptive codes can, for example, adopt certain things such as signal words from the data. In this case, we use the language that appears in the data itself.
However, in most cases, it is better to move away from that quickly and choose your own words for the codes (Urquhart, 2013).
Analytical codes go beyond mere description and offer an interpretation of the data. This is ultimately what we aim for in almost every qualitative method.
The codes can have different levels of abstraction. The lower the level of abstraction, the more descriptive the codes. The higher the level of abstraction, the more analytical the codes.
The different levels of abstraction contribute to creating a kind of hierarchy between the codes. This is also referred to as a category system or data structure.
A category system could consist of a handful of main categories and about two or three times as many subcategories.
A data structure usually consists of first-order themes, second-order themes, and aggregate dimensions. Themes and dimensions are just fancy words for more developed and abstracted codes.
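To make the hierarchy more tangible, here is a rough sketch of such a data structure as a nested dictionary. Please treat it as an assumption-laden illustration: the second-order themes and two of the first-order themes are invented for this example, while "Forming an Interest Group", "Collective Satisfaction", and "Consensus Building" are taken from the interview example in Part 3. The helper `flatten` is a hypothetical name.

```python
# Aggregate dimension -> second-order themes -> first-order themes.
# Only some labels come from the article; the rest are invented for illustration.
data_structure = {
    "Consensus Building": {
        "Proposal Development": [
            "Forming an Interest Group",
            "Drafting in a Shared Document",
        ],
        "Collective Approval": [
            "Informal Voting",
            "Collective Satisfaction",
        ],
    }
}


def flatten(structure):
    """Return one (first-order, second-order, aggregate dimension) row per first-order theme."""
    rows = []
    for dimension, second_order_themes in structure.items():
        for second_order, first_order_themes in second_order_themes.items():
            for first_order in first_order_themes:
                rows.append((first_order, second_order, dimension))
    return rows


rows = flatten(data_structure)
for first_order, second_order, dimension in rows:
    print(f"{first_order} -> {second_order} -> {dimension}")
```

Reading each printed row from left to right takes you from the most descriptive level to the most abstract one.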
As you can see, the terminology for certain codes and their structures can vary depending on the method and authors. Always adopt the terminology from the methodological guide you are working with.
#3 An Example from an Interview Transcript
Let’s take a look at a simple example of the most basic form of coding, which is open coding.
For a study, I conducted interviews with companies that have organized themselves remotely since their inception. Here is a response from an expert in an interview transcript:
“The process usually looks like this: Three people who are specifically interested in it make a proposal, and then it is informally voted on in Slack or during a meeting whether the proposal is good or not.
And it was the same with the rules. Three people who already had experience with such rules from other contexts worked it out in a Google document, and then it ended up in Slack. Different people added comments to the Google document about what they thought of each sentence. This went back and forth for about a week. And in the end, there was a draft that everyone liked, and that was it.”
With open coding, I can assign a code to each line or sentence. I would assign the code “Forming an Interest Group” to the passage “The process usually looks like this: Three people who are specifically interested in it make a proposal.”
The section “And in the end, there was a draft that everyone liked, and that was it” could be given the code “Collective Satisfaction.”
These two codes are at a very low level of abstraction because they are assigned to only one sentence or line each. They are also very descriptive and describe what was said.
To arrive at an analytical code, we can now summarize or interpret the codes assigned to each line. For the given response, this could lead to the code “Consensus Building.” This code interprets the data and summarizes in just two words what the response was essentially about.
However, an analytical code does not have to be this short; it can also consist of a longer phrase.
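The two steps from this example can be written down as a small sketch. It only records decisions that I made by hand as the researcher; nothing in it codes the data automatically. The names `open_codes`, `analytical_code_for`, and `group_by_analytical_code` are hypothetical and chosen just for this illustration.

```python
from collections import defaultdict

# Step 1 (open coding): one descriptive code per segment of the transcript.
open_codes = [
    (
        "Three people who are specifically interested in it make a proposal",
        "Forming an Interest Group",
    ),
    (
        "And in the end, there was a draft that everyone liked, and that was it.",
        "Collective Satisfaction",
    ),
]

# Step 2: the researcher's interpretation, recorded by hand.
# Each descriptive code is linked to a more abstract analytical code.
analytical_code_for = {
    "Forming an Interest Group": "Consensus Building",
    "Collective Satisfaction": "Consensus Building",
}


def group_by_analytical_code(coded_segments, mapping):
    """Collect descriptive codes and their segments under each analytical code."""
    grouped = defaultdict(list)
    for segment, descriptive_code in coded_segments:
        grouped[mapping[descriptive_code]].append((descriptive_code, segment))
    return dict(grouped)


result = group_by_analytical_code(open_codes, analytical_code_for)
for analytical_code, members in result.items():
    print(analytical_code)
    for descriptive_code, segment in members:
        print(f"  {descriptive_code}: {segment}")
```

The mapping in step 2 is where the actual analysis happens. The script merely keeps the trail from the raw segment to the analytical code visible.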
#4 What are the Different Types of Coding in Qualitative Research?
The examples of coding qualitative data for categories and themes we have looked at are typical for (1) inductive coding. This type of coding is primarily used in thematic analysis and, of course, grounded theory.
The process of inductive category development is bottom-up, meaning it goes from the data to the code.
In addition, there is also (2) deductive coding. This is characteristic of structured content analysis or quantitative content analysis. Here, based on existing theory, you establish a category system into which you simply sort your data.
The process of deductive category application is top-down, meaning it goes from theory to data.
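The top-down direction can be illustrated with a deliberately crude sketch. Assume a category system that was fixed before looking at the data; here simple keyword matching stands in for the researcher's judgment, which is a strong simplification and not how a real content analysis would be done. The category names, the keywords, and the function `sort_into_categories` are all hypothetical; the segments are taken from the interview response in Part 3.

```python
# Hypothetical category system; in a real study it would be derived from existing theory.
category_system = {
    "Proposal": ["proposal", "draft"],
    "Voting": ["voted"],
    "Feedback": ["comments"],
}

segments = [
    "Three people who are specifically interested in it make a proposal",
    "it is informally voted on in Slack or during a meeting",
    "Different people added comments to the Google document",
    "This went back and forth for about a week.",
]


def sort_into_categories(segments, system):
    """Sort each segment into every predefined category whose keywords it contains."""
    sorted_data = {name: [] for name in system}
    sorted_data["Uncategorized"] = []  # segments that fit none of the categories
    for segment in segments:
        text = segment.lower()
        matches = [
            name
            for name, keywords in system.items()
            if any(keyword in text for keyword in keywords)
        ]
        if matches:
            for name in matches:
                sorted_data[name].append(segment)
        else:
            sorted_data["Uncategorized"].append(segment)
    return sorted_data


sorted_data = sort_into_categories(segments, category_system)
for name, members in sorted_data.items():
    print(name, len(members))
```

Note that the categories exist before any segment is read, which is exactly the opposite of the bottom-up logic of inductive coding. Segments that end up under "Uncategorized" are a useful signal that the category system may need revising.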
However, there is often a grey area between the two. For simplicity, I call this (3) abductive coding. This means that the process of coding happens in a continuous interplay between theory and data. Abduction can encompass more than that, but for now, you don’t need to remember more. Some grounded theory approaches embrace this logic in their recommendations. Just be careful not to mix recommendations from different authors too much.
And lastly, there is (4) thematic coding, as described by Braun and Clarke (2006). The difference from the other types is that broader codes (themes) are used to interpret larger chunks of data. It is less incremental than the other techniques.