Analysing Verbatims with Wordclouds

Effective, Pragmatic, or Mortal Sin?

Looking at the use of Word Clouds as a means of analysing verbatim comments in Market Research surveys.

Tim Brandwood
10 Jun 2020

Introduction

If you've worked in Market Research for any length of time, you will definitely have run into the infamous "Wordcloud" at some point. Wordclouds provide a quick, simple and cheap way to crunch verbatim data from surveys. They're not universally liked, though, and often come in for some heavy criticism. So, in this blog we thought we'd take a balanced look at Wordclouds to see if we can put their use into perspective and provide some advice on what role they can play in the research process.

Beware the Dogma

It's common to read strong opinions about Wordclouds - they can be written off as "over-simplistic", "fluff", or even "harmful". However, it's always healthy to beware of sweeping generalisations. As with so many things in Market Research (and life!), "it depends".

There is actually no standard "right" or "wrong" way to analyse verbatim responses from a survey. In the real world, there is always an interplay of complicated factors at work. For example: how much time have you got? How complex is the subject matter? What are the implications of making the wrong decision? And so on.

The job of a researcher is to use data to understand consumer opinion, to answer a specific client research brief. So, ultimately, any analysis technique that achieves that aim is on the table. The important thing is to use techniques appropriate to the task at hand.

The Analysis Continuum

One way to think of verbatim analysis is as a continuum, ranging from "High Level" analysis at one end to "Deep Level" analysis at the other.

A high-level analysis is surface-level, cursory and indicative, whereas a deep-level analysis is thorough, painstaking and exhaustive.

Clearly, Wordclouds sit at the high-level end of this analysis spectrum, but sometimes that's just fine. When Wordclouds are criticised, it's usually not the Wordcloud that is at fault. More likely, someone has used a Wordcloud to perform a high-level analysis of data that actually needs a more deep-level technique. For example, many blog articles have been written showing that Wordclouds do a poor job of capturing the true essence of, say, the Gettysburg Address or the Declaration of Independence. Duh!

High Level Analysis of Verbatims

So, when is it appropriate to take this kind of high-level approach when analysing verbatims from a Market Research survey?

Again, let's be wary of hard and fast rules, but here are a few pointers:

When you only want the top theme(s)

Wordclouds are often criticised because they hide some of the nuance and detail in the data. However, if you're not interested in that nuance or detail, then that's not a problem. For example, if the question you're trying to answer is "what's the main thing people like about this new ice cream?", then it's highly likely that the answer will come through in a Wordcloud.

When it's an initial data discovery stage

Perhaps you just want to get a general initial steer from your data? Maybe you want to share this with your client before embarking on a more detailed analysis. Using our software, Codeit, you can make this your start point and seamlessly move along the continuum, stopping when the results are "good enough".

When the vocabulary is limited

The fewer distinct words your verbatims contain, the better. For example, a Wordcloud built using brand mentions will perform well, because the range of words used will be quite narrow. Survey questions that elicit a wider range of words will prove more problematic.
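To make the vocabulary point concrete, here is a minimal sketch of the word counting that sits underneath a typical Wordcloud, where word size reflects how often a word appears. It is only an illustration, not how Codeit or any particular tool works: the function names, the tiny stop-word list, the brand names and the sample answers are all invented for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"i", "the", "a", "and", "it", "is", "my", "of", "to"}

def word_counts(verbatims):
    """Count how often each non-stop word appears across all responses."""
    counts = Counter()
    for text in verbatims:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

def top_share(counts, n=1):
    """Share of all counted words taken up by the n most frequent ones."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(n)) / total

# Invented answers for illustration only (not real survey data).
brand_answers = ["Acme", "Zest", "acme", "Nova", "Acme", "Zest"]
opinion_answers = [
    "It feels more relaxed and I can pause whenever",
    "Cheaper, plus nobody talks over the film",
]

brand_counts = word_counts(brand_answers)
opinion_counts = word_counts(opinion_answers)

print(len(brand_counts), top_share(brand_counts))
print(len(opinion_counts), top_share(opinion_counts))
```

In the brand list a handful of distinct words account for every mention, so the biggest word in the cloud genuinely dominates. In the open-ended answers almost every word appears once, so no word stands out and the cloud has little to say.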

When the subject is simple

If the subject matter can be expressed simply by respondents, then it's more likely it can be summarised simply in a Wordcloud. For example, people can express quite simply the reasons why they prefer the flavour of Product A over Product B, but explaining their opinions on, say, Brexit is far less straightforward.

Example

To illustrate a reasonable use of a Wordcloud, we took some of our own data from a survey in which we asked people whether they prefer watching films at home or at the cinema. (We should point out that this data was collected before the Covid lockdown, when we still had the choice!) We then asked those who prefer to watch films at home why they have that preference.

The results are summarised in this Wordcloud: