Understanding AI and its different algorithms is critical for market researchers in today’s world
Damien Gouriet
While Generative AI has grabbed the spotlight recently, it is just one of many techniques within AI. Understanding AI and its different algorithms is critical for market researchers in today’s world. When used correctly, AI can greatly improve their work by helping them generate more accurate insights and make better decisions. However, using the wrong AI solutions can result in errors and missed opportunities.
Traditionally, machine learning models are trained using custom data specific to each task. Generative AI and Large Language Models (LLMs) have shifted this approach by leveraging vast amounts of online data, becoming the dominant AI tools today. Does this render traditional AI algorithms obsolete? What limitations and challenges do Generative AI and LLMs face with verbatim data, and how can Custom Machine Learning models address these issues?
AI encompasses a broad range of technologies designed to mimic human intelligence. From random forests to neural networks, AI aims to solve complex tasks by analyzing data and making decisions based on it. AI’s applications are vast and varied. For example, self-driving vehicles combine radar, camera, and GPS data to navigate autonomously. Another example is fraud detection in market research, where AI sifts through data to spot anomalies.
Generative AI refers to a subset of artificial intelligence designed to create new content, such as text, images, audio, or video, by learning patterns from existing data. Generative AI models have been trained on a vast amount of data available on the Internet. Unlike traditional AI, which focuses on recognizing patterns and making decisions based on those patterns, generative AI produces new and unique outputs that mimic human creativity.
Large Language Models (LLMs) are Generative AI models specifically focused on understanding and generating human language. LLMs are designed to produce text outputs, making them incredibly useful for applications such as content generation, chatbots, and automated customer service.
Custom Machine Learning (ML) refers to the development of models specifically designed to address unique challenges within a particular industry. Unlike general-purpose models like LLMs, custom ML models are built with a deep understanding of a business's specific data, requirements, and objectives. This approach involves tuning algorithms, datasets, and model architectures to optimize performance on targeted goals. Within machine learning, use cases are often separated into supervised problems, such as text classification, and unsupervised problems, such as clustering. At codeit, the key distinction is that we train a machine learning model on project-specific coded data, while our generative AI tool uses GPT-3.5 in its standard, out-of-the-box form.
At codeit, we leverage both Generative AI and Custom Machine Learning to enhance our AI capabilities.
Through our themeit tool, we use Generative AI to produce initial codeframes and discover new themes in scenarios where no coded data exists yet or when a fresh perspective is needed. This is particularly useful for ad-hoc studies or projects with evolving requirements where a predefined codeframe is not available.
On the other hand, our Custom Machine Learning approach focuses on supervised learning tailored to specific projects with well-defined codeframes. For instance, in tracking studies with extensive existing coded data and codeframes, we train dedicated models to ensure coding consistency and accuracy over multiple waves of analysis.
Our methodology involves training separate models for each unique codeframe, ensuring precision in predicting codes relevant to the project at hand. By following our Extract - Refine - Apply process, we integrate both AI methods to get the best out of automation.
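To make the idea of one model per codeframe concrete, here is a minimal sketch in Python. It is not codeit's actual implementation: the `NaiveBayesCoder` class, the codeframe names, and the tiny training sets are all hypothetical, and a simple Naive Bayes classifier stands in for whatever algorithm a production system would use.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (real pipelines would do more)."""
    return text.lower().split()


class NaiveBayesCoder:
    """Tiny multinomial Naive Bayes classifier for ONE codeframe (illustrative only)."""

    def fit(self, verbatims, codes):
        self.code_counts = Counter(codes)          # how often each code was assigned
        self.word_counts = defaultdict(Counter)    # word frequencies per code
        self.vocab = set()
        for text, code in zip(verbatims, codes):
            for word in tokenize(text):
                self.word_counts[code][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.code_counts.values())
        best_code, best_score = None, float("-inf")
        for code in sorted(self.code_counts):      # sorted so ties break the same way every time
            score = math.log(self.code_counts[code] / total_docs)  # log prior
            total_words = sum(self.word_counts[code].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue                       # ignore words never seen in training
                smoothed = self.word_counts[code][word] + 1  # add-one smoothing
                score += math.log(smoothed / (total_words + len(self.vocab)))
            if score > best_score:
                best_code, best_score = code, score
        return best_code


# Hypothetical coded data from earlier waves, keyed by codeframe name.
coded_history = {
    "reasons_for_purchase": (
        ["cheap price", "good price and value", "great taste", "tastes great"],
        ["Price", "Price", "Taste", "Taste"],
    ),
    "service_feedback": (
        ["staff were friendly", "friendly helpful staff", "long wait time", "waited too long"],
        ["Staff", "Staff", "Waiting", "Waiting"],
    ),
}

# One dedicated model per codeframe, each trained only on that codeframe's data.
models = {
    name: NaiveBayesCoder().fit(texts, codes)
    for name, (texts, codes) in coded_history.items()
}

print(models["reasons_for_purchase"].predict("the price was good"))
print(models["service_feedback"].predict("very friendly staff"))
```

Because each model only ever sees the codes of its own codeframe, it cannot predict a code that belongs to a different project.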
This combined approach allows us to harness the strengths of Generative AI for creative theme generation and initial coding insights, while Custom Machine Learning ensures robustness and accuracy in subsequent analyses. You can read this blog to learn more about our Extract - Refine - Apply process.
Recent studies, such as this one from Inca, have highlighted the success and potential of large language models (LLMs) for verbatim coding, a task traditionally performed by human researchers. When these LLMs were evaluated on coding tasks with real market research survey data, they demonstrated good performance. However, several challenges remain, pointing to areas for improvement in verbatim data analysis with Generative AI.
The primary issue is the accuracy rate. In our testing, Generative AI is around 70% accurate compared to manual coding. Our "Refine" step is needed to attain the precision required for verbatim coding. In contrast, custom ML models often achieve 80% to 90% accuracy, making them the more reliable choice for market researchers, especially for tracking studies.
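As an illustration of what "accurate compared to manual coding" can mean, the sketch below computes the share of verbatims where the automated code matches the human code. The function name `coding_accuracy` and the sample data are hypothetical, and it assumes exactly one code per verbatim; multi-coded responses would need a different metric.

```python
def coding_accuracy(model_codes, manual_codes):
    """Share of verbatims where the model's code equals the manual code."""
    if len(model_codes) != len(manual_codes):
        raise ValueError("Both lists must cover the same verbatims")
    if not manual_codes:
        raise ValueError("Need at least one coded verbatim")
    matches = sum(m == h for m, h in zip(model_codes, manual_codes))
    return matches / len(manual_codes)


# Hypothetical example: ten verbatims coded by a person and by a model.
manual = ["Price", "Taste", "Price", "Service", "Taste",
          "Price", "Service", "Taste", "Price", "Taste"]
model = ["Price", "Taste", "Taste", "Service", "Taste",
         "Price", "Price", "Taste", "Price", "Service"]

print(f"{coding_accuracy(model, manual):.0%}")  # prints 70%
```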
Consistency is another problem. Generative AI outputs can vary with each run, so results are not reproducible from one run to the next. A Custom ML model, on the other hand, will always produce the same results for the same input once trained.
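One simple way to check reproducibility is to code the same verbatims twice and measure how often the two runs agree. The sketch below is only an illustration: `run_agreement` and `keyword_coder` are hypothetical names, the keyword rule is a stand-in for a trained model, and the two "generative" runs are made-up outputs rather than real results.

```python
def run_agreement(run_a, run_b):
    """Share of verbatims that received the same code in both runs."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("Runs must be non-empty and cover the same verbatims")
    same = sum(a == b for a, b in zip(run_a, run_b))
    return same / len(run_a)


def keyword_coder(text):
    """Hypothetical deterministic stand-in for a trained model."""
    return "Price" if "price" in text.lower() else "Other"


verbatims = ["Good price", "Tasty", "Price too high", "Nice staff"]

# A deterministic model gives identical output on every run.
ml_run_1 = [keyword_coder(v) for v in verbatims]
ml_run_2 = [keyword_coder(v) for v in verbatims]
print(run_agreement(ml_run_1, ml_run_2))  # prints 1.0

# Made-up outputs from two runs of a generative tool on the same verbatims.
gen_run_1 = ["Price", "Taste", "Price", "Service"]
gen_run_2 = ["Price", "Taste", "Value", "Service"]
print(run_agreement(gen_run_1, gen_run_2))  # prints 0.75
```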
Finally, Generative AI struggles with very short responses. Verbatims of a couple of sentences or more are usually analyzed better because they give the Large Language Model more context to work with. We can overcome this challenge by combining different AI techniques.
While Generative AI is incredibly useful for a first read of the data and creating codeframes, human checking and Custom Machine Learning models trained on each project become necessary as a second pass to ensure consistency and accuracy over time.
Looking ahead, it's worth considering whether traditional coding methods will remain necessary. With the development of advanced techniques such as Retrieval-Augmented Generation (RAG), professionals have questioned the need for traditional coding. RAG models combine the strengths of retrieval-based methods with generative capabilities, offering a more comprehensive approach to data analysis. This could potentially eliminate the need for the coding process altogether, allowing researchers to focus on interpreting results rather than manually coding data. We’ll discuss this in more depth in an upcoming blog.
At codeit, we handle all the complexities behind the scenes to deliver a seamless AI experience for analyzing verbatim data. We believe in continual advancement to ensure our clients benefit from the most effective AI tools available.
Partner with our team so you can focus on the bigger-picture solutions without worrying about the technical details. Get in touch today.