Decoding Visual Emotions: The Impact of Yelp Photo Reviews on Sales

Visual sentiment is an important yet understudied factor in customer engagement on digital platforms. This study investigates how visual sentiment in Yelp image reviews affects consumer decision-making, focusing on review frequency and customer engagement. Prior research has predominantly examined textual reviews, but visual content adds a further layer of sentiment, providing emotional cues that can influence sales and consumer trust. Using a dataset built on the Yelp Open Dataset, comprising over 700,000 reviews from 150,000 businesses, we apply the CLIP-E model to categorize the emotions conveyed by images. Each image is scored on six emotional categories ('anger,' 'fear,' 'joy,' 'love,' 'sadness,' and 'surprise') and also receives a binary sentiment score. We combine these scores with textual sentiment analysis using spaCy, TextBlob, and VADER, and examine how visual and textual sentiment correlate with business outcomes such as review frequency and the spatial footprint of customer engagement. Using OLS, ridge regression, gradient boosting, and XGBoost, we explore how these multimodal sentiments relate to consumer behavior and offer insights for businesses seeking to optimize visual content to enhance consumer engagement and sales. This research contributes to the growing field of multimodal sentiment analysis and can help local businesses curate images that foster stronger connections with customers and drive business growth.
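
As a minimal illustration of the regression step, the sketch below contrasts OLS with ridge regression for a single predictor. The data, the `fit_line` helper, and the penalty value are hypothetical and are not drawn from the study. The intercept is left unpenalized, which is a common convention rather than a detail stated in this abstract.

```python
from statistics import mean


def fit_line(x, y, ridge_penalty=0.0):
    """Fit y = intercept + slope * x by least squares.

    ridge_penalty=0 gives ordinary least squares; a positive value
    shrinks the slope toward zero (the intercept is not penalized).
    """
    x_bar, y_bar = mean(x), mean(y)
    # Centered cross-product and sum of squares
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / (sxx + ridge_penalty)
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data, NOT taken from the Yelp dataset:
# a per-business 'joy' score and the number of reviews received.
joy_score = [0.1, 0.3, 0.4, 0.6, 0.8, 0.9]
review_count = [12, 18, 17, 25, 31, 30]

ols = fit_line(joy_score, review_count)
ridge = fit_line(joy_score, review_count, ridge_penalty=0.5)

print("OLS   intercept, slope:", ols)
print("Ridge intercept, slope:", ridge)
```

With a positive penalty the ridge slope is smaller in magnitude than the OLS slope, which is the shrinkage that makes ridge regression useful when several sentiment scores are strongly correlated with one another.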