If you carry out a survey, you’re going to collect some raw data. That data is going to be nice and simple, and easy to work with, right? Well, if your survey is simple, then possibly, yes.
However, real-world surveys often use techniques that generate raw data which is surprisingly awkward and difficult to handle.
People who come into the survey industry (developers, analysts, researchers, data processors) often underestimate these awkward features, and are then surprised by how much pain they cause, considering we’re “just” dealing with some simple survey data.
Here, we outline the main challenges we have had to overcome when handling survey data for coding.
The fundamental problem we’re talking about here boils down to something developers call “impedance mismatch”. Put simply, the tools you’re using are the wrong shape for the task at hand. In the case of market research data, the tools that often get used (CSV files, Excel files, relational databases) are awkward because we end up cajoling data into rows and columns when it is not inherently shaped as rows and columns in the first place.
To illustrate, let's look at the main culprits that we deal with day to day...
If you’ve ever filled out an online survey, chances are you’ve come across a question something like this: Please tick which of these statements apply to the following brands
The raw data generated by this kind of question is 2-dimensional and is therefore awkward to store in a single column in Excel.
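To make the shape problem concrete, here is a minimal Python sketch of one respondent's answers to a grid question. The brand names, statement names, question id and column naming scheme are all invented for illustration; this is not how any particular survey platform exports its data.

```python
# Hypothetical answers from ONE respondent to a brand/statement grid.
# Brand names, statements and the "Q3" id are invented for illustration.
grid_response = {
    "Brand A": {"Good value": True, "Trustworthy": False},
    "Brand B": {"Good value": False, "Trustworthy": True},
}


def flatten_grid(question_id, grid):
    """Flatten a 2-D grid into one flat column per (brand, statement) cell."""
    flat = {}
    for brand, statements in grid.items():
        for statement, ticked in statements.items():
            # The column name has to carry both dimensions of the grid.
            flat[f"{question_id}_{brand}_{statement}"] = int(ticked)
    return flat


flat_row = flatten_grid("Q3", grid_response)
for column, value in flat_row.items():
    print(column, value)
```

Even this tiny 2 x 2 grid becomes four spreadsheet columns for a single question, and the relationship between those columns survives only in their names.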
Although a slightly more “exotic” feature, some survey platforms will allow you to repeat a question in a survey multiple times driven by a given list of elements. For example, you might ask: Q1. Which of these financial products do you own?
Then, you might ask, for each product selected at Q1: Q2. How would you rate your current [product] provider?
Again, this kind of question structure will generate data that doesn’t fit neatly into the rows and columns of an Excel file.
As a further example, imagine that our “looped” question, Q2, was a grid question (see above) rather than a simple single select question. That will result in a 2-dimensional grid of data, which is then repeated multiple times – so effectively 3-dimensional data which obviously will be messy to represent in a simple Excel file.
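As a rough sketch of how quickly this grows, the Python below flattens a looped grid into one column per (loop item, grid row, grid column) combination. The product list, grid labels and naming scheme are assumptions made up for this example.

```python
from itertools import product

# All labels below are hypothetical, chosen only to illustrate the shape.
loop_items = ["Mortgage", "Savings", "Credit card"]  # products from Q1
grid_rows = ["Service", "Fees"]                      # aspects being rated
grid_cols = ["Poor", "Fair", "Good"]                 # rating scale

# One respondent who owns only a mortgage, so the loop ran once for them.
# Keys are (loop item, grid row); values are the chosen grid column.
answers = {("Mortgage", "Service"): "Good", ("Mortgage", "Fees"): "Fair"}


def to_flat_row(answers, question_id="Q2"):
    """Build one flat column per (loop item, grid row, grid column)."""
    asked_items = {item for item, _ in answers}
    row = {}
    for item, grid_row, grid_col in product(loop_items, grid_rows, grid_cols):
        column = f"{question_id}_{item}_{grid_row}_{grid_col}"
        if item not in asked_items:
            row[column] = None  # the question was never shown for this item
        else:
            row[column] = int(answers.get((item, grid_row)) == grid_col)
    return row


flat_row = to_flat_row(answers)
empty = sum(1 for value in flat_row.values() if value is None)
print(len(flat_row), "columns,", empty, "of them empty for this respondent")
```

Three loop items, two grid rows and three grid columns already give 18 columns, most of which are blank for a respondent who owns just one product.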
In a “closed-end question” (i.e. where the respondent picks from a fixed list of options), it is common to also include an “Other” option. This option allows the respondent to specify an “Other” response, which is not included in the pre-defined list. For example: Q1. Which is your favourite chocolate bar?
Including an “Other” option is useful when it’s impossible to define a complete pre-defined list of options that might apply.
This kind of question is especially interesting because it yields both closed-end selections (options 1 to 4) and, in some cases, an additional free-text response (the typed in specified response).
In addition, it’s also possible for a question to have multiple “Other” options which means the data generated is even more complicated and difficult to handle in a simple format such as Excel.
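One way to picture an “Other (please specify)” answer is as a pair: a closed-end code plus an optional piece of free text. The sketch below assumes a hypothetical code of 98 for “Other” and a hypothetical `_other` column suffix; real platforms use their own conventions.

```python
from dataclasses import dataclass
from typing import Optional

OTHER_CODE = 98  # hypothetical code for the "Other" option


@dataclass
class ClosedEndAnswer:
    """A closed-end selection with optional typed-in 'Other' text."""

    code: int
    other_text: Optional[str] = None

    def __post_init__(self):
        # Free text only makes sense when "Other" was actually selected.
        if self.other_text is not None and self.code != OTHER_CODE:
            raise ValueError("other_text given but 'Other' was not selected")

    def to_flat(self, question_id):
        """Spread the answer over two flat columns: the code and the text."""
        return {
            question_id: self.code,
            f"{question_id}_other": self.other_text or "",
        }


listed = ClosedEndAnswer(code=2)
other = ClosedEndAnswer(code=OTHER_CODE, other_text="Homemade fudge")
print(listed.to_flat("Q1"))
print(other.to_flat("Q1"))
```

A single question now needs two columns of different types, and the second is empty for most respondents. With several “Other” options, each one needs its own text column.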
If we ask a simple closed-end question and force the respondent to pick one option (e.g. “Which is your favourite chocolate bar?”), then this yields a single response for each person completing the survey. But what if we allow the respondent to pick multiple responses (e.g. “Which of these chocolate bars do you like?”)? Now we have a single question that yields multiple responses. How can we store that kind of data in an Excel spreadsheet? Do we store all the values in one column? Do we create a separate column for each response given? There’s no right or wrong answer here, just a set of options, all of which are complicated to handle in their own way.
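The two storage options mentioned above can be sketched in a few lines of Python. The option labels, the 1-based codes, the `;` delimiter and the column names are assumptions chosen for the example.

```python
# Hypothetical option list; codes are the 1-based positions in this list.
OPTIONS = ["Bar A", "Bar B", "Bar C", "Bar D"]


def to_delimited(selected, sep=";"):
    """Option 1: all selected codes packed into a single cell."""
    return sep.join(str(OPTIONS.index(s) + 1) for s in selected)


def from_delimited(cell, sep=";"):
    """Reverse of to_delimited; an empty cell means nothing was selected."""
    return [OPTIONS[int(code) - 1] for code in cell.split(sep) if code]


def to_binary_columns(selected, question_id="Q1"):
    """Option 2: one 0/1 column per possible option."""
    return {
        f"{question_id}_{i}": int(option in selected)
        for i, option in enumerate(OPTIONS, start=1)
    }


selected = ["Bar A", "Bar C"]
print(to_delimited(selected))       # one cell holding several codes
print(to_binary_columns(selected))  # four columns for one question
```

The single-cell form keeps the sheet narrow but has to be parsed before it can be counted or filtered. The one-column-per-option form is easy to tabulate but multiplies the number of columns.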
As a final thought experiment, imagine a multi response question, in a grid wrapped within a loop. That’s a complicated set of data to try and store in a humble Excel spreadsheet. And this isn’t a contrived example either – this is the kind of real-world data that we regularly have to handle within codeit.
At codeit, our main interest is in analyzing and coding verbatim responses as efficiently and accurately as possible. As such, we need to handle the challenges above as smoothly and effortlessly as possible. When your data structure contains these kinds of complexities, you need a dedicated tool that can handle these natively and easily. Here are some of the ways we have tackled these challenges in codeit.
Despite the awkwardness of storing survey data in a flat file format (e.g. Excel or CSV), this is often the only option if you want to export data from some survey platforms.
codeit has been designed to be aware of common export layouts produced by most survey platforms. So, when you’re importing data containing some of the more complicated structures above, codeit is often able to automatically join the dots and group fields together on import.
For example, if your file contains multiple columns because your data is looped, or multi-response, then codeit can intelligently and automatically group those columns together into one entity when importing your file.
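To give a feel for the general idea of grouping columns, here is a deliberately naive Python sketch. It is not codeit's actual import logic: it simply assumes that related columns share a base name followed by an underscore and a number (e.g. `Q2_1`, `Q2_2`), which is only one of many layouts that survey platforms produce.

```python
import re
from collections import defaultdict

# Assumed naming convention: "<base>_<number>" marks a member of a group.
SUFFIX_PATTERN = re.compile(r"^(.+?)_\d+$")


def group_columns(headers):
    """Group column headers that share a base name into one entity."""
    groups = defaultdict(list)
    for header in headers:
        match = SUFFIX_PATTERN.match(header)
        base = match.group(1) if match else header
        groups[base].append(header)
    return dict(groups)


# Hypothetical header row from an exported file.
headers = ["respondent_id", "Q1", "Q2_1", "Q2_2", "Q2_3", "Q3_1", "Q3_2"]
grouped = group_columns(headers)
for base, members in grouped.items():
    print(base, members)
```

Real exports are far less tidy than this, which is why recognising each platform's layout matters more than any single naming rule.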
The “Data Links” feature within codeit allows you to connect codeit directly to your survey platform. This approach neatly avoids most of the problems above: there is no need to shoehorn survey data into a simple row/column format such as Excel, because no intermediate file export is required. When you connect to a survey using codeit’s Data Links, you can pick the questions you want to import and codeit will figure out the details. If, for example, your question is a grid, a loop or an “other specify”, codeit will work out what to do and deal with the complexities behind the scenes.
A direct connection between survey tools and codeit clearly makes life easier for everyone. Unfortunately, not all survey systems are created equal. Subtle differences from one survey platform to another make the vision of a direct connection a bit more painful in practice. Because of this, we’re involved in an initiative (TSAPI) which seeks to standardise the way that survey platforms interconnect. The curveballs above will still exist, but if we can find standard ways to handle them, they cease to be major pain points for everyone.
In short, yes. We’ve described how handling market research data can be daunting and painful when processing and shipping data between platforms.
The good news is that decent tools, such as codeit, are able to manage and minimize the friction involved in these processes.
With 20+ years’ experience in the industry, codeit understands Market Research data in all its guises and deftly handles the challenges described above. With this wealth of knowledge, codeit makes it easy to handle your data, either using the import/export functions or via our datalinks, meaning the user has a seamless experience - dispatching the data dragons with ease.
Certainly, there are other challenges that we’re not listing here – if you have a particular challenge that we’ve not covered, get in touch – we'd like to share your pain!
Get in touch with codeit's seasoned team to let them sort your perilous data today.