Professor Xiaojing Dong Leverages AI-Driven Large Language Models to Bring Better Nuance and Increased Understanding to Marketing Analysis
Consumer data is now considered one of the most valuable commodities, and while it is used across many industries, it is arguably most valuable in advertising and marketing. Data helps advertisers and marketers understand consumer behavior, preferences, and trends, which in turn helps guide more strategic and impactful marketing efforts.
“In marketing, the ultimate goal is to understand people’s individual preferences, so collecting and analyzing consumer data is a practice marketers have been doing for decades,” explains Xiaojing Dong, Associate Professor of Marketing & Business Analytics. “Historically, we have only had access to what consumers purchased and returned in store, but, with the rise of online shopping, we have even more data about consumers. We’re able to see what products a consumer clicked on but didn’t purchase, what articles or reviews they read before deciding what to purchase, their search history, and of course, what they ultimately purchased and if they kept or returned the product.”
With access to an individual’s online data, marketers are able to paint a very clear picture of each consumer; however, traditional methods of data labeling and analysis are costly, time-consuming, and susceptible to bias. Professor Dong’s upcoming research, Multi-Hierarchical Labeling for Long and Unstructured Contents Using LLMs, addresses the challenges of traditional methods and offers a scalable solution leveraging large language models (LLMs), a type of artificial intelligence (AI) that uses deep learning algorithms to process and understand vast amounts of text data.
Much like the dot-com boom of the early 2000s, the introduction of AI is reshaping business and society in profound ways, including the ability to analyze data more quickly, comprehensively, and accurately through the use of LLMs.
Dong and her co-authors have introduced a novel LLM-based approach for creating data labels on long and complex social media content by first identifying common challenges typically faced in data labeling, which include social media posts of different lengths, e-commerce page information, live stream data, as well as content requiring nuanced inference and logical deductions. Then, they tested the LLM framework on a large-scale dataset of 2 million multi-source content pieces from 300 users across seven different apps.
The approach demonstrated superior performance across all challenges and scenarios, achieving over 95% accuracy and surpassing the 88% accuracy rate of traditional methods, while also reducing the cost to less than $1 per 10,000 pieces of content, which is one ten-thousandth of the cost of human labor.
“The study not only offers an adaptable and accurate method for data labeling, but also provides deep insights into user preferences across multiple platforms, paving the way for advanced product recommendation engines and future AI-driven data analysis advancements,” says Dong.
While this type of cutting-edge technology and research is paramount for helping marketers get to know their customers, target them more accurately, and ultimately make more money, it is also a reminder that with great power comes great responsibility.
Consumers are becoming more cautious and knowledgeable about their data being used against them. As digital interactions and data collection continue to expand, and with data breaches threatening the exposure or misuse of Social Security numbers, medical records, and financial information, safeguarding personal information remains a critical and ongoing responsibility for individuals and organizations alike.
As sensitivity around data privacy continues to grow, Dong’s future research aims to explore how marketers can better understand and leverage consumer preferences without tapping into a consumer’s private data.