If you’re conducting survey research, chances are you’re under constant pressure to analyze the data and report the results as quickly and cost-effectively as possible.
Whilst many aspects of survey research can now be handled efficiently and at low cost, analyzing verbatim responses from open-ended questions remains a problem. It is one aspect of a research project that often causes a lot of friction, delay and cost. The answer is often to compromise on the quality of analysis by using quick but superficial methods, or simply to accept that cost, drag and delay are part of the process.
In this blog we’ll outline some of the ways you can tackle this friction, to improve the quality of your analysis and reduce the amount of effort and cost required to get there.
The process of coding open-ended responses will be fiddly, time-consuming and error-prone if you use software that is not up to the task. It’s very common to make use of what’s to hand and “make do” with Excel, but in all but the simplest cases this is a poor approach. And whilst many data collection tools have an in-built tagging or ‘coding’ tool, these are usually quite simplistic and therefore time-consuming to use.
A dedicated coding tool, on the other hand, will be packed with features to help you with the task of coding. Features such as sorting, filtering and searching will make you much more efficient when working with verbatim data.
A good coding tool will also offer a different display depending on the type of data you’re working with. For example, when coding short repetitive text, like brands, the ideal layout should group repeated items together and place similar answers together for easy coding and to enable you to create a codeframe quickly from the highest mentioned answers.
Similarly, when coding longer verbatim answers, you want your interface to allow easy viewing of the whole verbatim and all codes applied alongside the codeframe for easy reference. It should enable coding to be done one-by-one but with minimum keystrokes and maximum efficiency.
It’s very common for the coding process to begin with an export of data from a survey platform into a flat file. This file is then passed to a coder or coding provider to work on or upload into a coding tool. Often, this process is repeated as interim files are produced incrementally in parallel with fieldwork. All of which adds up to a large number of manual, error-prone steps that cause friction and delay.
A better approach is to pull data directly from an API provided by the survey platform. A good platform will also provide an API which you can use to push the coded data back into the survey.
All worthwhile survey platforms will provide an API you can connect with, and there is an increasing movement looking to standardise API access which will greatly increase the interconnectivity of survey platforms and external tools.
A “hands free” direct API connection not only saves time by eliminating repeated manual exports and imports, but also massively reduces the scope for human error in the process.
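As a rough sketch, a direct connection might look like the following in Python. The base URL, endpoint paths and JSON field names here are hypothetical placeholders, not any particular platform’s API; substitute the documented endpoints of whichever survey platform you use.

```python
import json
import urllib.request

# NOTE: hypothetical API -- replace with your survey platform's real base URL and paths.
BASE_URL = "https://api.example-survey-platform.com/v1"

def fetch_verbatims(survey_id, question_id, api_key):
    """Pull open-ended responses directly from the survey platform's API."""
    req = urllib.request.Request(
        f"{BASE_URL}/surveys/{survey_id}/questions/{question_id}/responses",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

def to_rows(payload):
    """Flatten the platform's JSON into (respondent_id, verbatim) pairs for coding."""
    return [(r["respondent_id"], r["text"]) for r in payload["responses"]]

def push_codes(survey_id, question_id, coded_rows, api_key):
    """Push coded results back so they sit alongside the rest of the survey data."""
    body = json.dumps(
        {"codes": [{"respondent_id": rid, "codes": codes} for rid, codes in coded_rows]}
    ).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/surveys/{survey_id}/questions/{question_id}/codes",
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=30)
```

The key design point is that the same script can be re-run throughout fieldwork, so interim data is always current without anyone touching a spreadsheet.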
Another source of friction is the process of developing a codeframe, i.e. a list of themes that the data can be tagged with. With generative AI (e.g. ChatGPT) becoming mainstream this year, the scope for automated assistance here has greatly increased. It is now possible for coding tools to automatically generate a suggested codeframe for a set of verbatims and at least partially code the verbatims against that codeframe. It’s important to remember that these tools are powerful, but not perfect, so the output will need to be reviewed and refined by a person. This kind of human-AI interaction requires a user interface that allows you to work efficiently with the data, which is another reason to use a dedicated coding tool that supports it well.
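To illustrate the shape of this workflow (not any specific tool’s implementation), the steps around the model call might look like this in Python. The prompt wording and the JSON reply format are assumptions for the sketch, and the call to the model itself is omitted because it depends on your provider’s SDK:

```python
import json

def build_codeframe_prompt(verbatims, max_codes=10):
    """Assemble a prompt asking an LLM to propose a draft codeframe."""
    joined = "\n".join(f"- {v}" for v in verbatims)
    return (
        f"Suggest up to {max_codes} short thematic codes for these survey responses.\n"
        'Reply as a JSON list of objects with "code" and "description" keys.\n\n'
        f"Responses:\n{joined}"
    )

def parse_codeframe(model_reply):
    """Turn the model's JSON reply into a draft codeframe for human review."""
    return [(item["code"], item["description"]) for item in json.loads(model_reply)]

# You would send build_codeframe_prompt(...) to your chosen model and pass
# its reply to parse_codeframe(...); the resulting draft codeframe then goes
# to a human coder for review and refinement, never straight into reporting.
```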
It is common for certain phrases to be repeated within a survey, either within a single ad-hoc study, or within multiple waves of an on-going survey. For example, verbatim comments like, “good service”, “cheaper prices” or “nothing” may come up repeatedly. For maximum efficiency your coding tool should be capable of spotting these repetitions and autocoding them by matching against previous examples. Similarly, a coding tool should be capable of spotting small typos and autocoding variations of the same text. For example, a coding platform should be capable of spotting that “coca-cola” and “coca-cloa” are just simple misspellings of the same thing and autocoding on that basis.
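A minimal sketch of this kind of fuzzy autocoding can be built with Python’s standard-library difflib; the 0.85 similarity threshold is an illustrative assumption, and real tools use more sophisticated matching:

```python
from difflib import SequenceMatcher

def best_match(text, coded, threshold=0.85):
    """Return the code of the most similar previously coded verbatim,
    or None if nothing is similar enough. `coded` maps verbatim -> code."""
    norm = text.strip().lower()
    best_code, best_score = None, 0.0
    for prev_text, code in coded.items():
        score = SequenceMatcher(None, norm, prev_text.strip().lower()).ratio()
        if score > best_score:
            best_code, best_score = code, score
    return best_code if best_score >= threshold else None

previous = {"coca-cola": "BRAND_COKE", "nothing": "NONE"}
best_match("coca-cloa", previous)  # -> "BRAND_COKE": the transposition still matches
```

Exact repeats score 1.0 after normalisation, so “Nothing” autocodes against “nothing” for free, while genuinely new answers fall below the threshold and are left for a human.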
Larger scale surveys or tracking studies will contain considerable amounts of coded data which can be leveraged using machine learning. A machine learning model can be created and, once trained, used to autocode verbatims. This will result in a much higher quantity of autocoding than is possible by simply looking for repeated text (see above). For example, if you code some example verbatims like “I find it relaxing” and “It is a relaxing experience”, then the machine learning should be capable of inferring that a phrase like “It was a calming place” expresses the same theme and automatically coding it in the same way. A good coding tool should be capable of making these kinds of judgements and inferences based on the examples you show it, speeding up your project by autocoding a good proportion of it.
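Real coding tools train proper models for this; purely as a toy illustration of the idea, here is a nearest-example classifier over bag-of-words vectors in plain Python. Note that this simplistic version only matches on shared vocabulary, so a genuinely semantic match like “relaxing” vs “calming” would need a trained model or word embeddings, and the 0.3 threshold is an arbitrary assumption:

```python
import math
from collections import Counter

def vectorise(text):
    """Bag-of-words term counts -- a stand-in for real feature extraction."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def autocode(verbatim, examples, threshold=0.3):
    """Assign the code of the most similar coded example, if similar enough."""
    vec = vectorise(verbatim)
    best_code, best_score = None, 0.0
    for text, code in examples:
        score = cosine(vec, vectorise(text))
        if score > best_score:
            best_code, best_score = code, score
    return best_code if best_score >= threshold else None

examples = [
    ("I find it relaxing", "RELAXING"),
    ("far too expensive", "PRICE"),
]
autocode("it was very relaxing", examples)  # -> "RELAXING"
```

The more coded examples the model sees, the larger the proportion of new verbatims it can code automatically, leaving humans to handle only the ambiguous remainder.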
Too often, open ended questions in surveys are viewed with fear and trepidation because the verbatim data they generate are perceived as too much hassle to deal with.
This can often lead researchers to resort to sub-standard methods of analysis that don’t tell the full story in the data.
However, most of the perceived hassle is driven by unnecessary friction built into the process by not using the right tools for the job. A dedicated coding tool with the right set of features can remove much of that friction.
This should lead to a greater willingness to use open-ended questions in surveys and generate richer and more useful analysis as a result.
Try codeit for free