Date
|
Speaker
|
Topic
|
Faculty Host
|
11/8/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
Behnam Mohammadi
Carnegie Mellon University
|
''Creativity Has Left the Chat: The Price of Debiasing Language Models'' and ''Wait, It’s All Token Noise? Always Has Been: Interpreting LLM Behavior Using Shapley Value''
Abstract
Large Language Models (LLMs) have revolutionized natural language processing but can exhibit biases and may generate toxic content. While alignment techniques like Reinforcement Learning from Human Feedback (RLHF) reduce these issues, their impact on creativity, defined as syntactic and semantic diversity, remains unexplored. We investigate the unintended consequences of RLHF on the creativity of LLMs through three experiments focusing on the Llama-2 series. Our findings reveal that aligned models exhibit lower entropy in token predictions, form distinct clusters in the embedding space, and gravitate towards ''attractor states'', indicating limited output diversity. These findings have significant implications for marketers who rely on LLMs for creative tasks such as copywriting, ad creation, and customer persona generation. The trade-off between consistency and creativity in aligned models should be carefully considered when selecting the appropriate model for a given application. We also discuss the importance of prompt engineering in harnessing the creative potential of base models.
-
The emergence of large language models (LLMs) has opened up exciting possibilities for simulating human behavior and cognitive processes, with potential applications in various domains, including marketing research and consumer behavior analysis. However, the validity of utilizing LLMs as stand-ins for human subjects remains uncertain due to glaring divergences that suggest fundamentally different underlying processes at play and the sensitivity of LLM responses to prompt variations. This paper presents a novel approach based on Shapley values from cooperative game theory to interpret LLM behavior and quantify the relative contribution of each prompt component to the model’s output. Through two applications—a discrete choice experiment and an investigation of cognitive biases—we demonstrate how the Shapley value method can uncover what we term ''token noise'' effects, a phenomenon where LLM decisions are disproportionately influenced by tokens providing minimal informative content. This phenomenon raises concerns about the robustness and generalizability of insights obtained from LLMs in the context of human behavior simulation. Our model-agnostic approach extends its utility to proprietary LLMs, providing a valuable tool for marketers and researchers to strategically optimize prompts and mitigate apparent cognitive biases. Our findings underscore the need for a more nuanced understanding of the factors driving LLM responses before relying on them as substitutes for human subjects in research settings. We emphasize the importance of researchers reporting results conditioned on specific prompt templates and exercising caution when drawing parallels between human behavior and LLMs.
|
Ye Hu
|
11/1/2024
Melcher Hall 365A
12:30 PM - 2:00 PM
|
Kevin Lee
University of Chicago
|
Generative Brand Choice
Abstract
Estimating consumer preferences for new products in the absence of historical data is an important but challenging problem in marketing, especially in product categories where brand is a key driver of choice. In these settings, measurable product attributes do not explain choice patterns well, which makes questions like predicting sales and identifying target markets for a new product intractable. To address this ''new product introduction problem,'' I develop a scalable framework that enriches structural demand models with large language models (LLMs) to predict how consumers would value new brands. After estimating brand preferences from choice data using a structural model, I use an LLM to generate predictions of these brand utilities from text descriptions of the brand and consumer. My main result is that LLMs attain unprecedented performance at predicting preferences for brands excluded from the training sample. Conventional models based on text embeddings return predictions that are essentially uncorrelated with the actual utilities. In comparison, my LLM-based model attains a 30% lower mean squared error and a correlation of 0.52; i.e., for the first time, informative predictions can be made for consumer preferences of new brands. I also show how to combine causal estimates of the price effect obtained via instrumental variables methods with these LLM predictions to enable pricing-related counterfactuals. Combining the powerful generalization abilities of LLMs with principled economic modeling, my framework enables counterfactual predictions that flexibly accommodate consumer heterogeneity and take into account economic effects like substitution by consumers and price adjustments by firms. Consequently, the framework is useful for downstream decisions like optimizing the positioning and pricing of a new product and identifying promising target markets. More broadly, these results illustrate how new kinds of questions can be answered by using the capabilities of modern LLMs to systematically combine the richness of qualitative data with the precision of quantitative data.
|
Ye Hu
|
10/25/2024
|
Fei Teng
Yale University
|
Honest Ratings Aren't Enough: How Rater Mix Variation Impacts Suppliers and Hurts Platforms
Abstract
Customer reviews and ratings are critical for the success of online platforms in that they help consumers make choices by reducing uncertainty and motivate supplier (worker) incentives. Existing literature has shown that rating systems face problems primarily due to fake or discriminatory reviews. However, customers also differ in their rating styles: some are generous and others are harsh. In this paper, we introduce a novel idea: even if raters are honest and unbiased, differences in the early rater mix (of generous and harsh raters) for a supplier can lead to biased ratings and unfair outcomes for suppliers. This is because platforms display past ratings to customers, whose own ratings and acceptance of suppliers are influenced by them, and because the platform uses past ratings for its prioritization and recommendations. Together, these mechanisms create path dependence. Using data from a gig-economy platform, we estimate a structural model to analyze how early ratings affect long-term worker ratings and earnings. Our findings reveal that early ratings significantly impact future ratings, leading to persistent advantages for early lucky workers and disadvantages for unlucky ones. Further, the use of these ratings in the platform's prioritization algorithms magnifies these effects. We propose a neutral, adjusted rating metric that can mitigate these effects. Counterfactuals show that using the metric improves the accuracy of the rating system for customers, the fairness of earnings for workers, and the retention of high-quality workers for the platform. Otherwise, the resulting supplier turnover can lead to a lower-quality supplier mix on platforms.
|
Bowen Luo
|
10/18/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
Nguyen (Nick) Nguyen
University of Miami
|
DeepAudio: An AI System to Complete the Pipeline of Generating, Selecting, and Targeting Audio Ads
Abstract
Audio advertising is a large industry, reporting billings of $14 billion in 2022 and reaching up to 86.8% of the U.S. population. Reflecting the importance of audio advertising, AI startups are offering marketers generative AI tools to efficiently create multiple audio ads. Also, ad targeting platforms like Spotify can deliver audio ads to targeted audiences. This raises a key question: Which of the numerous ads should marketers launch on the ad targeting platforms? Marketers may rely on conventional methods such as A/B testing or multi-armed bandits to answer this question. However, these methods are slow and require significant resources, particularly when assessing numerous ad executions. Moreover, online audio platforms such as Spotify or iHeartRadio do not support A/B testing or multi-armed bandit experiments. Given this background, the authors propose DeepAudio, an AI system that integrates insights from the behavioral literature on ad likeability with AI algorithms to automatically assess the likeability of audio ads. Benchmarking DeepAudio against different approaches, the authors find that integrating behavioral features into AI systems significantly increases system performance, robustness, and generalizability. By quickly assessing the likeability of multiple audio ads, DeepAudio enables marketers to select the most promising ad executions and fully harness the power of generative AI. Thus, DeepAudio completes the modern pipeline of generating, selecting, and targeting audio ads.
|
Bowen Luo
|
10/11/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
Yanyan Li
USC
|
Understanding Privacy Invasion and Match Value of Targeted Advertising
Abstract
Targeted advertising, advanced by behavioral tracking and data analytics, is now extensively utilized by firms to present relevant information to consumers, potentially enhancing consumer experience and marketing effectiveness. Despite these advantages, targeted advertising has raised significant privacy concerns among consumers and policymakers due to unintended consequences from the extensive collection and use of personal data. Consequently, comprehending the tradeoff between the enhanced match value and privacy concerns is crucial for effective implementation of targeted advertising. In this research, we develop a structural model to empirically analyze this tradeoff, addressing a gap in the literature. We assume consumers form correlated beliefs about privacy invasion and match value from targeted advertising in a Bayesian fashion, and use these beliefs to decide whether to click an ad and whether to opt out of ad tracking. Consumers update their privacy invasion beliefs by considering how each received ad corresponds to their clicked ads and update their match value beliefs by considering how well each ad engages them, and do so jointly due to potential correlation between these two beliefs. Leveraging the Limit Ad Tracking (LAT) policy change with iOS 10 in September 2016, which allowed consumers to opt out of ad tracking, we estimate the proposed model using panel ad impression and consumer response data from 166,144 opt-out and 166,144 opt-in consumers, spanning two months before and three months after the policy change. We find that consumers generally have a negative preference for privacy invasion and a positive preference for match value in their clicking decisions, with notable heterogeneity in these preferences. Consumers with higher uncertainty about privacy invasion are more likely to opt out of tracking. Upon opting out, highly privacy-sensitive consumers (about 20%) experience net benefits, while most consumers face a loss from reduced match value that outweighs their gain from decreased privacy invasion. Through counterfactual analyses, we propose a probabilistic targeting strategy that balances match value and privacy concerns, and demonstrate that such a privacy-preserving targeting strategy can benefit consumers, advertisers, and the ad network.
|
Sesh Tirunillai
|
10/9/2024
MH 126
11:00 AM - 12:30 PM
|
Maria Giulia Trupia
UCLA
|
''No Time to Buy'': Asking Consumers to Spend Time to Save Money is Perceived as Fairer than Asking Them to Spend Money to Save Time
Abstract
Firms often ask consumers to either spend time to save money (e.g., Lyft’s ''Wait & Save'') or spend money to save time (e.g., Uber’s ''Priority Pickup''). Across six preregistered studies (N = 3,631), plus seven additional studies reported in the Web Appendix (N = 2,930), we find that asking consumers to spend time to save money is perceived as fairer than asking them to spend money to save time (all else equal), with downstream consequences for word-of-mouth, purchase intentions, willingness-to-pay (WTP), and incentive-compatible choice. This is because spend-time-to-save-money offers reduce concerns about firms' profit-seeking motives, which consumers find aversive and unfair. The effect is thus mediated by inferences about profit-seeking and attenuates when concerns about those motives are less salient (e.g., for non-profits). At the same time, we find that spend-money-to-save-time offers (e.g., expedited shipping) are more common in the marketplace. This research reveals how normatively equivalent trade-offs can nevertheless yield contradictory fairness judgments, with meaningful implications for marketing theory and practice.
|
Melanie Rudd
|
10/4/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
Jasmine Y. Yang
Columbia University
|
What Makes For A Good Thumbnail? Video Content Summarization Into A Single Image
Abstract
Thumbnails, reduced-size preview images or clips, have emerged as a pivotal visual cue that helps consumers navigate video selection while previewing what to expect in the video. We study how thumbnails, relative to video content, affect viewers' behavior (e.g., views, watchtime, preference match, and engagement). We propose a video mining procedure that decomposes high-dimensional video data into interpretable features (image content, affective emotions, and aesthetics) by leveraging computer vision, deep learning, text mining, and advanced large language models. Motivated by behavioral theories such as expectation-disconfirmation theory and Loewenstein's theory of curiosity, we construct theory-based measures that evaluate the thumbnail relative to the video content to assess the degree to which the thumbnail is representative of the video. Using both secondary data from YouTube and a novel video streaming platform called ''CTube'' that we build to exogenously randomize thumbnails across videos, we find that aesthetically pleasing thumbnails lead to overall positive outcomes across measures (e.g., views and watchtime). On the other hand, content disconfirmation between the thumbnail and the video leads to opposing effects: more views and higher watchtime but lower post-video engagement (e.g., likes and comments). To further investigate how thumbnails affect consumers' video choice and watchtime decisions, we build a Bayesian learning model in which consumers' decisions to click on a video and continue watching it are based on their priors (the thumbnail) and updated beliefs about the video content (the video's frames, characterized as multi-dimensional and correlated video topic proportions). Our results suggest that viewers watch videos longer when there is higher disconfirmation between their initial content beliefs formed from the thumbnail and their updated beliefs based on the observed video scenes (signals), suggesting that one role of thumbnails is to generate curiosity about what may come next in the video. In addition, viewers prefer less disconfirmation before observing the thumbnail, highlighting that the role of disconfirmation may change before and after the thumbnail is seen. Based on the model's estimates, we then run a series of counterfactual analyses to propose optimal thumbnails and compare them with current thumbnail recommendation practices to guide creators and platforms in thumbnail selection.
|
Sesh Tirunillai
|
9/27/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
Hangcheng Zhao
University of Pennsylvania
|
Algorithmic Collusion of Pricing and Advertising on E-commerce Platforms
Abstract
Firms have been adopting AI learning algorithms to automatically set product prices and advertising auction bids on e-commerce platforms. When firms compete using such algorithms, one concern is tacit collusion: the algorithms learn to settle on higher-than-competitive prices that increase firm profits but hurt consumers. We empirically investigate the impact of competing reinforcement learning algorithms to determine whether they are always harmful to consumers, in a setting where firms learn to make two-dimensional decisions on pricing and advertising together. Our analysis uses a multi-agent reinforcement learning implementation of the Q-learning algorithm, which we calibrate to estimates from a large-scale dataset collected from Amazon.com. We find that learning algorithms can facilitate win-win-win outcomes that are beneficial for consumers, sellers, and even the platform when consumers have high search costs, i.e., the algorithms learn to collude on lower-than-competitive prices. The intuition is that the algorithms learn to coordinate on lower bids, which lowers advertising costs, leading to lower prices for consumers and enlarging demand on the platform. We collect and analyze a large-scale, high-frequency keyword product search dataset from Amazon.com and estimate consumer search costs. We provide policy guidance by identifying product markets with higher consumer search costs that could benefit from tacit collusion, and markets where regulation of algorithmic pricing might be most needed. Further, we show that even if the platform responds strategically by adjusting the ad auction reserve price or the sales commission rate, the beneficial outcomes for both sellers and consumers are likely to persist.
|
Sam Hui
|
9/20/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
Kyeongbin (KB) Kim
Emory University
|
Generative Multi-Task Learning for Customer Base Analysis: Evidence from 1,001 Companies
Abstract
Modeling the activity of a customer base is an inherently multi-objective problem. It requires understanding how many customers will be acquired over time, how many repeat purchases they will make, and how much they will spend per purchase (''upstream'' objectives), as well as how these behaviors come together into cohort- and company-level sales (''downstream'' objectives). There are many other empirical settings in which applied researchers face problems that similarly require modeling multiple customer behavioral processes. This paper introduces a flexible and adaptable unified generative multi-task learning approach tailored for panel data, the customer-based multi-task transformer (CBMT), designed to jointly project upstream and downstream outcomes. Our methodology balances the trade-off between generalization and specialization by leveraging commonalities across upstream customer behavioral processes through shared layers while enabling predictions tailored to each objective via task-specific layers. By employing a multi-objective loss function that explicitly incorporates downstream objectives as auxiliary tasks, we obtain the best of both worlds: accurate predictions for each upstream outcome while also ensuring strong goodness of fit for the downstream outcomes. We validate the model on a one-of-a-kind dataset of 1,001 companies over a 37-month observation period. Our model significantly outperforms existing approaches, exceeding six benchmark models in predicting company-level revenue by 34% to 65% over a temporal holdout period. Through an ablation study and an analysis of performance heterogeneity across various contextual factors, we provide insight into the drivers of this outperformance and the conditions under which the proposed model performs relatively better or worse.
|
Sam Hui
|
9/13/2024
MH 365A
11:00 AM - 12:30 PM
|
Arpit Agrawal
University of Houston
|
So Near Yet So Far: The Unexpected Role Of Near Misses In Salesperson Turnover
|
|
9/6/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
Aprajita Gautam
University of Texas at Austin
|
Product Perfectionism: Defining and Measuring Consumers' Tendency to Hold Uncompromisingly High Expectations from Possessions and Consumption Experiences
Abstract
Perfectionist tendencies have been on the rise in recent years. In this paper, we conceptualize and define a specific type of this tendency, called ''product perfectionism,'' and situate it within a broader nomological network that includes trait perfectionism, entitlement, materialism, and maximizing. We construct an eight-item Product Perfectionism Scale, which we use to predict consumption behaviors across the three stages of a typical consumer's journey: acquisition, consumption, and disposal (studies 1–7). We find that consumers higher (vs. lower) on product perfectionism are more susceptible to set-fit effects (study 1), attracted to brands with personalities associated more (vs. less) with perfection (study 2), and willing to pay more for newer (vs. older) products (study 3). We also find that they derive lower enjoyment from less-than-perfect consumption experiences (study 4), are more attracted to product upgrades (study 5), replace both perishable and non-perishable goods faster for smaller flaws (study 6), and are more likely to dispose of, and more reluctant to repair, broken possessions (study 7). We conclude the paper with a discussion of the theoretical and substantive implications of our findings.
|
Melanie Rudd
|
4/19/2024
Melcher Hall 365A
1:30 PM - 3:00 PM
|
Peggy Liu
University of Pittsburgh
|
Choosing Larger Food Portion Sizes for Others
Abstract
Consumers often choose portion sizes of food for others—including romantic partners, friends, and children. Across eight studies, the authors show that consumers choose larger portion sizes for others, compared to the portion sizes that consumers choose for themselves, the portion sizes that consumers predict others want to eat, and the portion sizes that others want to receive. Moreover, the authors show that one reason consumers choose larger portion sizes for others is that they aim to convey warmth via larger portion sizes. The tendency to choose larger portion sizes for others is mitigated when consumers focus on addressing alternative concerns besides conveying warmth, such as when addressing fairness concerns (e.g., wanting to ensure enough food for multiple others) or when addressing health concerns for others (e.g., choosing portion sizes of unhealthy food for dieting adults or for one's young children). Ironically, however, choosing larger portion sizes for others (vs. for self) does not actually convey greater warmth to recipients. Thus, the authors suggest that consumers' tendency to choose larger portion sizes for others may mainly have negative consumer and societal well-being implications, to the extent that receiving larger portion sizes might lead to overeating and/or food waste.
|
Byung
|
4/6/2024
CBB (Room 310)
8:15 AM - 4:30 PM
|
Doctoral Students
|
The 42nd Annual UH Marketing Doctoral Symposium
|
|
4/5/2024
CBB (Student Training Center)
3:45 PM - 6:00 PM
|
John G. Lynch, Jr.
University of Colorado-Boulder
|
The 42nd Annual UH Marketing Doctoral Symposium
|
|
4/5/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
Ravi Mehta
University of Illinois Urbana-Champaign
|
Fostering Innovation Through Consumer Engagement: The Role of Luxury Brand Usage Experience
Abstract
Prior work has revealed negative demand-side implications of luxury brands engaging consumers for their innovative inputs (e.g., user-generated designs). Extending this line of work, we examine the supply-side (e.g., idea generation) implications of consumer engagement through the lens of luxury brand usage experience. Across five studies, we demonstrate that consumers who engage in luxury, as compared to non-luxury, brand experience through product usage generate more innovative ideas. This effect stems from feelings of vertical differentiation that arise from using a luxury brand. Explicating the boundary conditions of the effect, we also demonstrate that this positive effect of luxury brand usage experience on consumer innovativeness does not emerge when 1) individuals are merely exposed to a luxury brand (i.e., engaging with it is a necessary condition) or 2) the luxury brand usage fails to convey the signals of prestige and exclusivity central to feelings of vertical differentiation. The current research offers theoretical and managerial implications for luxury brand and innovation management.
|
Sesh
|
3/29/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
Huanhuan Shi
Texas A&M
|
The Economic Consequences of Risk-Absorption in B2B Relationships: Evidence from Indirect Auto Lending
Abstract
Risk absorption is ubiquitous in inter-firm relationships. For example, manufacturers may cover small suppliers' production cost risks, and suppliers might take on some buyers' payment default risks. This risk absorption aims to enhance value and foster long-term partnerships. Although this intention is well recognized, there is limited research on the economic consequences for the risk-taker. The authors examine the costs and benefits of risk absorption in indirect car loan markets, where third-party lenders rely on auto dealers to reach consumers. In this scenario, risk absorption happens when third-party lenders approve loans for high-risk consumers, allowing auto dealers to complete transactions, even for consumers who typically would not qualify for loans. Using a comprehensive dataset spanning three years from a third-party lender working with 1,550 dealers, the authors explore (1) the actual costs related to risk absorption, such as loan defaults and overdue payments, (2) how dealers reciprocate and benefit, which the authors assess by examining how lender risk absorption affects dealer referrals in terms of loan amounts (gains) and loan risk (losses), and (3) heterogeneity in how dealers respond to risk absorption. While lenders incur direct costs for offering favorable loans, dealers reciprocate by generating more loan applications in the future. However, the risk associated with referred loans also increases, possibly because the lender's perceived leniency is exploited. Additionally, we find that as the relationship between a lender and a dealer extends, the reciprocal effects initially strengthen and then decline, while the direct cost and exploitation effects increase monotonically. These findings provide insights into lenders’ effectiveness in absorbing risks and into strategies for managing relationships with dealers.
|
Johannes
|
3/22/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
James Reeder
University of Kansas
|
Using Contextual Embeddings to Predict the Effectiveness of Novel Heterogeneous Treatments
Abstract
Our study demonstrates the power of using contextual embeddings to evaluate new marketing creatives. To explore the usefulness of contextual embeddings, we exploit an advantageous setting where 34 emails were sent to 1.3 million clients over a 45-day period. To create our contextual embeddings, we use OpenAI to translate email subject lines into Ada embeddings, a class of contextual embeddings. We find that Ada embeddings outperform other commonly used text classification methods. Further, in a series of leave-one-out exercises, the predictions from our contextual embeddings model agree with the doubly robust scores obtained from the marketing campaigns for over 85% of our candidate emails. Given that OpenAI generated our contextual embeddings, we link our method with ChatGPT to evaluate newly generated email creatives. ChatGPT is given a set of email features that show common support within our observed dataset. We find that while ChatGPT is able to generate novel email promotions quickly, it is unable to accurately predict the success of the new emails. We then conclude by using a data-driven algorithm to conduct a heuristic-driven targeting exercise using both the GPT-developed emails and client characteristics. Our study shows the viability of contextual embeddings in academic research and in assessing the development of novel heterogeneous treatments.
|
Martin
|
2/16/2024
Melcher Hall 365A
11:00 AM - 12:30 PM
|
Vamsi Kanuri
University of Notre Dame
|
Mitigating Churn After Financial Fraud: The Value of Blame Attribution During Service Recovery
Abstract
The financial industry is plagued with account-based fraudulent transactions in which perpetrators surreptitiously siphon away money from customer accounts. To address this recurrent problem, service operations teams at financial institutions (e.g., the fraud detection and mitigation teams) spend millions yearly identifying perpetrators, attributing blame, and averting customer churn. However, in most cases, the service operations teams cannot trace fraudulent transactions back to the perpetrators. Hence, it remains unclear whether they should continue to expend significant resources on investigating account-based fraud and attributing blame. Employing a wide range of econometric and machine learning techniques on a rich dataset from a U.S. bank, we attempt to offer the first empirical assessment of the association between customer churn and blame attribution, or the lack of it, in response to account-based fraud. The results indicate that relative to customers who did not experience fraud, those who did and received a resolution lacking blame attribution are 40.69% (95% CI = [24.41%, 56.97%]) more likely to churn permanently. However, customers who experienced fraud and received a resolution involving blame attribution are 62.45% (95% CI = [31.22%, 93.68%]) less likely to churn than those who did not experience fraud. The latter result offers the first field evidence of the service recovery paradox. Additionally, we observe significant heterogeneity in the findings based on customer tenure and the number of customer-firm interactions, and we document the long-term effects of attributing and not attributing blame. These insights can assist service operations teams in their post-recovery efforts to mend customer-firm relationships following a service failure. Overall, the results underscore the importance of blame attribution during service recovery for service operations teams and inform a topical debate on a banking reform proposed by the U.S. Department of the Treasury.
|
Martin
|