I need a 1,000-word literature review covering two articles, about 500 words each.

My topic: Application of text mining to product satisfaction and feedback

Literature Review: Describe two (2) references (must be research articles from journal/conference/academic report/thesis) that are relevant to your topic. Include the general background of references, dataset used, details of how the text mining process is applied, as well as relevant findings and conclusions. Discuss the implications of the references to the current project.
Exploring healthcare/health-product ecommerce satisfaction: A text mining
and machine learning application
Article in Journal of Business Research · October 2020
DOI: 10.1016/j.jbusres.2020.10.043
Swagato Chatterjee a,1,⁎, Divesh Goyal b,1, Atul Prakash b,1, Jiwan b,1
a Vinod Gupta School of Management, Indian Institute of Technology, Kharagpur, Kharagpur, West Bengal 721302, India
b Indian Institute of Technology, Kharagpur, Kharagpur, West Bengal 721302, India
ARTICLE INFO
Keywords: Health-product ecommerce; Text mining; Sentiment; Emotion; Customer satisfaction; Online reviews

ABSTRACT
In the digital era, online channels have become an inevitable part of healthcare services, making healthcare/health-product e-commerce an important area of study. However, the reflections of customer satisfaction and their differences across various subgroups of this industry are still unexplored. Additionally, extant literature has mostly focused on consumer surveys for customer-satisfaction research, ignoring the huge amount of data available online. The current study fills these gaps. With 186,057 reviews on 619 e-commerce firms from 29 subcategories of the healthcare/health-product industry, posted on a review website between 2008 and 2018, we used text-mining, machine-learning and econometric techniques to find which core and augmented service aspects and which emotions are more important in which service contexts in terms of reflecting and predicting customer satisfaction. Our study contributes to the healthcare/health-product marketing and services literature by suggesting an automated, machine-learning-based methodology for insight generation. It also helps healthcare/health-product e-commerce managers in better e-commerce service design and delivery.
1. Introduction
Consumer perception is vital for any organization, whether it is product-based, service-based or both. Both positive and negative perceptions, expressed through consumer feedback and reviews, are crucial for organizations to gauge their consumer base. Consumer reviews provide information that helps organizations compute metrics such as customer satisfaction (CSAT) and net promoter score (NPS) (Ho-Dac, Carson, & Moore, 2013). With deep internet penetration even in the remotest locations, consumers today spend much of their time online, where they share information and views on various online platforms via consumer reviews (Park, Gu, Leung, & Konana, 2014). Some organizations provide a place on their website for consumers to share their views through standardized quantitative or rating-based fields, others offer textual reviews, and at times both exist together (Siering, Deokar, & Janze, 2018).
‘Textual reviews’, in which consumers can pour their hearts out in either frustration or happiness, are certainly among the best sources in terms of ‘informative content’. Through this medium, organizations get a detailed understanding of consumer sentiments and emotions. Further, organizations also get key insights into ‘consumer psychology’, in terms of how a consumer initially perceived a product/service vis-à-vis how s/he evaluated it post acquisition (Ye, Zhang, & Law, 2009). This insight specifically helps multifaceted service industries, such as healthcare organizations, to dig deeper into how such sentiments, emotions and evaluations actually lead the consumer to provide ratings. Importantly, with many healthcare/health-product ecommerce organizations now in the fray, almost all of them seem to prefer an omni-channel approach, which makes that understanding even more relevant.
Extant literature has extensively discussed how consumer reviews affect the decision-making of both existing and new customers, as well as overall perceptions of the organization and its brand (Sharp, 2011). Studies that have focused primarily on healthcare services have elaborated on what makes an online review helpful, almost to the point of it being ‘invaluable’ (Sandars & Walsh, 2009). However, extant literature has not focused on how textual reviews can be used to find the reflectors and predictors of customer satisfaction in healthcare/health-product ecommerce (Sandars & Walsh, 2009; Sharp, 2011). This understanding is important, as it will help healthcare/health-product ecommerce managers achieve better service design, improved customer relationship management and efficient handling of customer reviews. The current study fills this gap. The key research questions are: (a) How do consumers’ opinions of various service attributes lead to their overall satisfaction with healthcare/health-product ecommerce? (b) Does the importance of such attributes vary depending on the ecommerce subcategory? (c) Can textual reviews be used to answer the above questions? (d) Do the emotions expressed in such reviews reflect customer satisfaction?

1 All authors have equal contribution.
⁎ Corresponding author. E-mail address: swagato1987@gmail.com (S. Chatterjee)
https://doi.org/10.1016/j.jbusres.2020.10.043
Received 10 January 2020; Received in revised form 12 October 2020; Accepted 14 October 2020
0148-2963/© 2020.

Herein, we analyze consumer reviews and ratings specifically for the healthcare/health-product industry, primarily healthcare/health-product ecommerce. First, we analyzed the text of multiple reviews to explore the diverse core and augmented (C&A) service aspects on which consumers tend to base their textual reviews. Second, we explored how these overall and attribute-wise sentiments and emotions lead to CSAT. Third, we show how healthcare/health-product ecommerce contexts change the above-mentioned relationships; for example, consumer expectations for a pharmacy and drugs ecommerce firm and for a beauty products ecommerce firm are expected to differ. We also check the power of the above-mentioned variables to predict consumer satisfaction.

The rest of this paper is structured as follows: the next section covers the theoretical model, followed by the methodology and the results. A discussion of the theoretical and practical implications follows. We conclude by highlighting the limitations and the scope for future work.
2. Background study
2.1. Ecommerce customer satisfaction and customer ratings
Customer rating (CR) has been a major variable for marketers when it comes to assessing the progress of their actions (Anderson, Fornell, & Lehmann, 1994). CR is known to enhance customer purchases, whether new or repeat, which naturally results in organizational profitability (Anderson et al., 1994; Söderlund, 1998). But what drives CR is a question that has occupied researchers for decades (Anderson & Sullivan, 1993; Martensen, Gronholdt, & Kristensen, 2000; Mouwen, 2015). This is particularly so where consumer heterogeneity and multiple business models exist (Grewal, Chandrashekaran, & Citrin, 2010). In fact, it is this ‘heterogeneity’ that leads consumers to attach differential importance to differing service attributes.

To explain CSAT in ecommerce, extant literature has explored various underlying constructs such as value, trust and service quality (Oh, 1999; Szymanski & Hise, 2000; Taylor & Baker, 1994; Zeithaml, Parasuraman, & Malhotra, 2002), using survey-based methods for data collection (Pappas, Pateli, Giannakos, & Chrissikopoulos, 2014; Wang et al., 2019). Nevertheless, it is important to understand user-generated ratings using user-generated information, as they would be free of various biases. Studies using user-generated content have focused on how pre-purchase and post-purchase attribute-wise ratings affect the CR or CSAT in ecommerce (Posselt & Gerstner, 2005). Some have also tried to check whether the impact of such variables changes over time and across product categories (Dholakia & Zhao, 2010; You, Bhatnagar, & Ghose, 2016). However, both qualitative and quantitative data need to be combined in order to reflect or predict CR, as textual reviews are often a rich source of information and the correct manifestation of consumer opinions. Extant literature has not focused on this aspect while studying consumer satisfaction with ecommerce firms using user-generated content (Dholakia & Zhao, 2010; Posselt & Gerstner, 2005). Our study is crucial in the sense that it acts as a bridge between theory and practice. We propose a methodology in which we create insights about user ratings from textual reviews, using text mining along with econometric and machine learning methods.

2.2. Online reviews
Online reviews are also extremely informative for ‘prospective consumers’, who are possibly uninformed or even ill-informed. Reviews, especially consistent ones, affect the purchase decision-making process. Organizations therefore often go out of their way to obtain positive reviews and ratings, which in turn help them leverage their brand worth and brand value.

Extant literature has covered many dimensions of the relevance and importance of online reviews for consumer decision-making, which in turn affects organizations’ bottom line (Chevalier & Mayzlin, 2006; Duan, Gu, & Whinston, 2008). Extant literature has also focused on pricing and promotional strategies for organizations, for whom it is like a multi-period game (Ajorlou, Jadbabaie, & Kakhbod, 2016). In the first half of the game, the focus remains on generating favorable online reviews, while the second half looks to leverage the positive impact of the first half, which goes on to affect price, sales and profits (Ajorlou et al., 2016). The underlying consumer psychological mechanisms leading to favorable reviews, and how organizations motivate consumers to write them, have also been explored in the past (Hennig-Thurau, Gwinner, Walsh, & Gremler, 2004; Mowen, Park, & Zablah, 2007). What motivates bandwagon behavior in terms of providing incongruous online reviews has also been explored (Cheung & Lee, 2012). Some researchers have found that different types of customers or purchase contexts can lead to differences in preferences and in the drivers of customer satisfaction (Ahani et al., 2019; Xu, 2020).

We focus here on both the metric and the textual aspects of online reviews, thereby encompassing consumer sentiments and emotions holistically (Chatterjee, 2019; Siering et al., 2018). Interestingly, extant literature in healthcare services has not combined qualitative and quantitative information while explaining consumer satisfaction (Ng & Luk, 2019). In that sense, our study contributes to the extant literature (Ng & Luk, 2019). Moreover, attribute-wise sentiment mining and emotion mining have remained very limited, especially in the healthcare marketing literature.
2.3. Text mining
It is true that ‘structured data’ is more comprehensible and useful; nevertheless, ‘unstructured data’ can yield much more information, provided it is analyzed by combining both qualitative and quantitative techniques. ‘Text mining’ is one of the ways in which quantitative insights may be generated even from unstructured textual data. ‘Text analytics’, on the other hand, transforms the data processed by text mining and creates actionable insights from it. Text mining is used in document classification, topic modelling, translation, language identification, fake news detection, semantic mining, and chatbot development. Herein, following Hotho, Nürnberger, and Paaß (2005), we delve into textual reviews by first pre-processing the text data, followed by text mining to create analyzable data. Further, text analytics is used to generate actionable insights.

However, in order to apply text-mining methods, the unstructured text data must first be cleaned; text pre-processing serves precisely this purpose. Pre-processing includes three phases: stop-word removal, stemming and important-word identification (Vijayarani, Ilamathi, & Nithya, 2015). Stop words are words such as pronouns and prepositions that do not add value to the research context. Removing such words reduces the size of the text data and speeds up its processing. Moreover, it also helps ensure that important data, which are critical for text analysis, are not lost in the mix (Feldman & Sanger, 2007). Removing ‘stop words’ involves multiple methods, such as the mutual information method, the Zipf’s law method, the classic method, and term-based random sampling. Once the clean dataset is available, the words are stemmed to develop connections within sentences in an attempt to reduce similar information content (Vijayarani et al., 2015). Stemming may be done by word truncation, statistical methods or mixed methods. The sole objective of both stemming and stop-word removal is to find the most important words. For instance, a word that is common across the whole corpus is less important, whereas a word that is repeated often within a ‘particular’ document is certainly very important. This logic is captured in term frequency-inverse document frequency (TF-IDF) scores, which are used as a proxy for the importance of words (Feldman & Sanger, 2007; Vijayarani et al., 2015).

The ‘bag-of-words’ (BoW) and parts-of-speech (POS) methods have been used for text analytics based on the recommendations of Brill (1995). We used POS tagging to identify the part of speech of each word, which is a very common method for feature selection from text (Asghar, Khan, Ahmad, & Kundi, 2014). Generally, the things about which consumers articulate their views are nouns, while the views themselves are adjectives; for example, in “The bar in this hotel is classy”, ‘bar’ is the noun and ‘classy’ is the adjective. After POS tagging, we use the BoW method on the nouns that carry the highest TF-IDF measures and are thereby the most important (Salton & Buckley, 1988); further, we club them under various service aspects. Scores of these words are then used in other data mining techniques in order to generate additional insights (Chatterjee, 2019).

Sentiment mining for ecommerce aspects, as referred to above, is the most common text analytics method; it can be done at the document level, sentence level or feature level. The two most common ways of identifying ‘sentiment’ are the lexicon-based approach and the statistical learning based approach (Feldman, 2013). The most common way of identifying the sentiment of a text under the lexicon-based approach is to sum the sentiment scores of all the words in the text. The ‘statistical learning based approach’ is an alternative, whereby pre-labelled data are used to train machine learning techniques.

Given the prominence and importance of online reviews today, sentiment mining has become crucial in providing essential information, especially through overall as well as feature-wise sentiment mining techniques (Siering et al., 2018). Unfortunately, extant research in the context of healthcare has not explored this enough (Chatterjee, 2019; Popescu & Etzioni, 2007; Siering et al., 2018). Given that we have attempted to use the same, our study gains more salience in terms of its contribution to extant literature (Popescu & Etzioni, 2007).

3. Hypotheses development
Exploring the antecedents of CSAT has been an important research domain in extant literature, covering service quality, trust, perceived value etc. (Garbarino & Johnson, 1999; Oh, 1999). There have also been attempts to understand how individual service attributes can lead to CSAT, as they are more accessible and diagnostic in nature. Moreover, according to the accessibility-diagnosticity (AD) model, such accessible and diagnostic input variables lead to consumer outcomes, whereby we can consider individual service attributes as primary drivers of CSAT ratings (Vaidyanathan, 2000). However, these variables have different levels of accessibility and diagnosticity, which depend on the consumer’s knowledge or lack of information. According to the AD model, the influence of the memory of an input A on attitude formation is directly proportional to its accessibility and inversely proportional to its diagnosticity. Moreover, the same is inversely proportional to the accessibility and directly proportional to the diagnosticity of other inputs (Lynch, 2006). Extending the above, the evaluation of various service aspects would have a varied impact on the consumer’s overall satisfaction, depending on the accessibility and diagnosticity of that aspect and of other aspects.

As per the multiple pathway anchoring and adjustment (MPAA) model, both personal characteristics of consumers (inside-out) and multiple attributes of services (outside-in) build the consumer’s overall attitude towards a service (Cohen & Reed, 2006). Therefore, a combination of multiple internal and external forces leads to final consumer outcomes, for example in the purchase decision-making process. Factors that may affect attitude formation include direct or imagined experience with the object, analytical attitude formation, analogical reasoning, and value- and social-identity-driven attitudes (Cohen & Reed, 2006). Together, these suggest that consumer attitude formation results from a complex mix of various types of factors.

Textual reviews provided by consumers give a vivid description of their experiences with a service (Chatterjee, 2019). The sentiments about C&A service aspects expressed in textual reviews are therefore a rich source of information about a consumer’s attitudes. As discussed earlier, such information is both accessible and diagnosable via text mining techniques, which makes these sentiments primary reflectors of CSAT, as suggested by the AD model (Lynch, 2006; Vaidyanathan, 2000). Extant literature has explored the differential impact of the type of service aspect on consumer outcomes. Service aspects are of two types: core aspects, which provide basic benefits, such as food in a restaurant; and augmented aspects, which provide additional benefits, such as live music in a restaurant (Chatterjee, 2019). In a health ecommerce setting, it is important to study how sentiments towards C&A service aspects lead to overall satisfaction.

To manage consumer reviews in open online channels, it is also important to predict how changes made in C&A service aspects can affect consumer ratings. The predictive power of the C&A service aspects for overall satisfaction is therefore also an important area of study. Therefore, we posit:

H1 Sentiment towards C&A service attributes has a positive relationship with CSAT in the healthcare/health-product ecommerce industry.

As per the MPAA model, along with service attributes, a consumer’s personal characteristics affect his/her overall attitude towards a service (Cohen & Reed, 2006). Consumer characteristics typically find expression through consumer emotions, which in turn lead to consumer outcomes. As per the MPAA model, consumption emotions act as a medium for both outside-in and inside-out expressions of the consumers, and thus influence overall satisfaction (Cohen & Reed, 2006). Understandably, while positive emotions lead to favorable judgements, negative emotions may lead to harsher evaluations.

Textual reviews are certainly rich in information when it comes to consumer emotions. Extant literature has focused on how consumer emotions can be extracted from textual reviews (Chatterjee, 2019). However, unlike sentiment scores, emotion scores are multidimensional (Westbrook & Oliver, 1991); for instance, positive emotions include joy, trust and surprise, while negative emotions include sadness, disgust and anger. Importantly, such emotions do not necessarily fall within the same dimension; in other words, they vary not only in valence and degree, but also in meaning and source (Westbrook & Oliver, 1991). Extant literature has dealt with this aspect in detail, establishing that such emotions reflect a consumer’s attitude and behavior (Chatterjee, 2019; Laros & Steenkamp, 2005).

However, extant literature has suggested that negative emotions lead to more diagnosticity (Filieri, 2016), which makes the input variable stronger, as per the AD model. Further, Cavanaugh, MacInnis, and Weiss (2016) have categorized emotions based on valence and arousal; for instance, sadness is a negative emotion that is low on arousal, while anger, despite also being a negative emotion, is high on arousal. Additionally, a high-arousal emotion is highly accessible, as it overcomes other cognitive processing (Filieri, 2016; Salehan & Kim, 2016), which results in stronger effects of high-arousal emotions. Together, these lead to an interesting focal point for our study, i.e. whether the arousal, degree or valence of an emotion leads to different consumer outcomes.

The relationship between consumption emotion and customer satisfaction has been explained by the pleasure-arousal (PA) model (Ladhari, 2007). It suggests that the pleasure and arousal components of consumption emotions lead to a positive cognitive state, which in turn results in satisfaction and positive word of mouth (WOM). Extending the above, we argue that the expression of consumption emotions can be found in textual reviews. Therefore, the consumption emotions expressed in textual reviews can reflect the satisfaction of the consumers. As per the PA model, as pleasure affects satisfaction, positive emotions are expected to be positively related to satisfaction. High-arousal emotions are also expected to be more strongly related to satisfaction than low-arousal emotions (Ladhari, 2007). Therefore, we posit:
H2a Overall sentiment in the textual review has a significant relationship with CSAT in the healthcare/health-product ecommerce industry.

H2b Emotions expressed in the textual review have significant relationships with CSAT in the healthcare/health-product ecommerce industry.

Extant literature has suggested that consumers attach differential importance to various service aspects depending on the context of the service (Xu, 2020). For instance, consumers attach differential importance to service features for restaurants with different business models, such as fine dining vs. fast food restaurants. In an adventure travel business context, the relative importance of C&A service aspects tends to vary based on gender, demographics, travel goals and level of adventure (Matzler, Füller, Renzl, Herting, & Späth, 2008). In the hotel industry, the attribute-level information generated from textual reviews has a different influence on customer satisfaction depending on the type of hotel (Xu, 2020). It has also been found that the factors consumers talk about and the factors that lead to their satisfaction can be different sets of variables (Xu, 2020). The health-oriented ecommerce industry also consists of various types of ecommerce. Some firms are generic, while others focus on certain product segments, such as personal care, drug and pharmacy, eye-care, skincare, home health care etc. Therefore, the relative importance of C&A services vis-à-vis consumer emotions in reflecting and predicting consumer outcomes is expected to differ across these contexts. Specific knowledge of such feature importance would help managers to create marketing plans focused on their own industry, while helping them manage customer reviews better.

Therefore, we further posit:

H3 The relationship strengths of the overall sentiment, aspect-wise sentiments and emotions expressed in the textual review vary depending on the type of healthcare/health-product ecommerce.

All of the above hypotheses are important because, following Xu (2020), what consumers state in their reviews and what actually drives their satisfaction can be very different. This is because the underlying mechanism of review writing and the underlying mechanism of customer satisfaction can be very different (Xu, 2020). Although we rely on the truthfulness of the reviews and the emotions expressed, a simple, trivial relationship between the sentiments, emotions and overall satisfaction cannot be assumed. Further probing is therefore important. Our approach differs from that of Xu (2020), as we adopted text mining and machine learning techniques along with econometric techniques to explain and predict customer satisfaction.

4. Empirical study
4.1. Data and processing
We collected data about healthcare/health-product ecommerce firms from trustpilot.com, a website that collects customer reviews about all types of ecommerce. We collected 186,057 reviews under the ‘Health and wellbeing’ category, covering 29 sub-categories and 619 healthcare/health-product ecommerce firms, posted between 2008 and 2018. The dataset contained CR, a proxy for CSAT, on a 1-to-5 point scale (1 = highly dissatisfied, 5 = highly satisfied), along with the textual review (title and main content) of the ecommerce firms. Fig. 1 summarizes the data processing and analysis framework. First, we removed numbers, stop words, blank spaces, punctuation etc. to create the initial pre-processed corpus. Next, we used the NRC Word-Emotion Association Lexicon (also called EmoLex), created by Mohammad and Turney (2013) and found to be suitable for consumer-review-based analysis (Chatterjee, 2019; Siering et al., 2018), to obtain the overall sentiments (negative, positive) and 8 basic emotions from the text, as listed in Table 1. Similar methodology is common in the information systems, data science and marketing literature (Dang, Zhang, & Chen, 2010; Mostafa, 2013; Taboada, Brooke, Tofiloski, Voll, & Stede, 2011).

For sub-category-wise analysis, we have chosen top six sub-categories based on the number of reviews available: fitness and nutrition
Fig. 1. Flowchart for data handling and model building.
Table 1
Summary statistics of the variables in the models.
Beauty and
Wellness
Drugs and
Pharmacy
Maximum
4.52
1.06
1
5
0.40
0.41
−1
1
0.36
0.31
−1
1
0.21
1.05
0.17
0.30
0.98
0.33
0.41
1.23
4.39
0.65
1.52
0.58
0.85
1.32
0.89
0.81
1.75
1.19
0
0
0
0
0
0
0
0
1
21
44
17
49
38
26
21
50
5
0.4
0.43
−1
1
0.36
0.31
−1
1
Cosmetics
Skincare
Customer
Rating
Overall
Sentiment
Title
Overall
sentiment
Review
Anger
Anticipation
Disgust
Fear
Joy
Sadness
Surprise
Trust
Customer
Rating
Overall
Sentiment
Title
Overall
sentiment
Review
Anger
Anticipation
Disgust
Fear
Joy
Sadness
Surprise
Trust
Customer
Rating
Overall
Sentiment
Title
Overall
sentiment
Review
Anger
Anticipation
Disgust
Fear
Joy
Sadness
Surprise
Trust
2.26
0.74
0.29
1.14
0.24
0.36
1.06
0.44
4.53
3.06
1.67
0.79
1.79
0.73
0.96
1.53
1.07
1.04
0
0
0
0
0
0
0
0
1
84
56
18
44
17
22
38
26
5
0.4
0.41
−1
1
0.36
0.3
−1
1
2.28
0.6
0.24
1.08
0.2
0.31
1.1
0.34
4.64
2.66
1.42
0.68
1.52
0.65
0.9
1.41
0.93
0.91
0
0
0
0
0
0
0
0
1
53
51
17
31
15
49
23
26
5
0.4
0.4
−1
1
0.37
0.31
−1
1
1.84
0.46
0.17
0.92
0.11
0.24
0.74
0.26
2.15
1.23
0.57
1.31
0.45
0.78
1.05
0.8
0
0
0
0
0
0
0
0
57
26
12
28
9
27
29
16
Standard
Deviation
Minimum
Maximum
4.42
1.17
1
5
0.39
0.43
−1
1
Fitness and
Nutrition
Customer
Rating
Overall
Sentiment
Title
Overall
sentiment
Review
Anger
Anticipation
Disgust
Fear
Joy
Sadness
Surprise
Trust
Customer
Rating
Overall
Sentiment
Title
Overall
sentiment
Review
Anger
Anticipation
Disgust
Fear
Joy
Sadness
Surprise
Trust
Customer
Rating
Overall
Sentiment
Title
Overall
sentiment
Review
Anger
Anticipation
Disgust
Fear
Joy
Sadness
Surprise
Trust
Customer
Rating
Overall
Sentiment
Title
Overall
sentiment
Review
Anger
Anticipation
Disgust
Fear
Joy
Sadness
Surprise
Trust
Minimum
All
Mean
Standard
Deviation
Mean
Eye treatment
0.37
0.32
−1
1
1.98
0.51
0.21
1.01
0.16
0.22
0.94
0.28
4.56
2.54
1.35
0.68
1.52
0.58
0.67
1.27
0.81
1.05
0
0
0
0
0
0
0
0
1
57
30
14
26
16
17
26
22
5
0.41
0.41
−1
1
0.41
0.33
−1
1
1.83
0.41
0.16
0.95
0.13
0.17
0.87
0.21
4.34
2.35
1.2
0.58
1.41
0.49
0.59
1.19
0.71
1.23
0
0
0
0
0
0
0
0
1
67
29
14
29
10
16
21
25
5
0.38
0.41
−1
1
0.37
0.33
−1
1
1.68
0.42
0.14
0.92
0.12
0.17
0.72
0.23
2.02
1.08
0.5
1.26
0.44
0.55
0.95
0.67
0
0
0
0
0
0
0
0
41
24
8
16
6
10
10
13
(40,708), beauty and wellness (35,065), drugs and pharmacy (15,443),
cosmetic (13,121), skincare (12,795) and eye treatment (10,269). We
found the attribute-specific sentiments expressed in the text for these six sub-categories only, as the attributes differ across sub-categories. Further, we followed the bag-of-words method suggested by Chatterjee (2019) for finding sentiments attribute-wise. First, we found the nouns that occurred in at least 5% of the reviews (using the package developed by Nguyen, Nguyen, Pham, and Pham (2016)). Following this, 4 experts and 9 users of healthcare/health-product ecommerce helped us to group the nouns into various service attributes. The final list of nouns under each service attribute is given in a supplementary file. The attributes found included service, product, delivery, price, facility, equipment and time. Further, to find attribute-wise sentiments, we split the texts into sentences, checked whether at least one word in a sentence related to an existing attribute, and then computed the sentiment of such sentences. For example, a review on the beauty and wellness segment says: “Love the product I ordered (BRANDNAME) – so pigmented and long-wearing. it’s hard to believe it’s not conventionally made. Love the free shipping and the eco-conscious packaging!”. Based on the bag of words, the part relevant to the attribute called “product” is “Love the product I ordered and the eco-conscious packaging!”. The sentiment of this portion is used as the sentiment of “product” for the given review. Table 1 gives the statistical summary of the data.
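The attribute-wise step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the noun lists in `ATTRIBUTE_NOUNS` and the tiny `POLARITY` lexicon are hypothetical stand-ins (the real noun lists are in the paper's supplementary file), and sentences are matched whole rather than trimmed to the relevant clause.

```python
import re

# Hypothetical attribute noun lists; the paper's actual lists are in its supplementary file.
ATTRIBUTE_NOUNS = {
    "product": {"product", "packaging"},
    "delivery": {"shipping", "delivery"},
}

# Toy polarity lexicon used only for illustration (assumption, not the paper's lexicon).
POLARITY = {"love": 1.0, "great": 1.0, "hard": -0.5, "late": -1.0, "bad": -1.0}


def split_sentences(text):
    """Split text after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    """Lower-case alphabetic tokens of a sentence."""
    return re.findall(r"[a-z]+", sentence.lower())


def sentence_score(sentence):
    """Mean polarity of the lexicon words in the sentence (0.0 if none)."""
    scores = [POLARITY[t] for t in tokens(sentence) if t in POLARITY]
    return sum(scores) / len(scores) if scores else 0.0


def attribute_sentiments(review):
    """Average the scores of the sentences that mention each attribute."""
    sentences = split_sentences(review)
    result = {}
    for attribute, nouns in ATTRIBUTE_NOUNS.items():
        matched = [s for s in sentences if nouns & set(tokens(s))]
        if matched:
            result[attribute] = sum(sentence_score(s) for s in matched) / len(matched)
    return result


review = "Love the product I ordered. It was hard to open. Shipping was late!"
print(attribute_sentiments(review))
```

Sentences that mention no attribute noun (the second sentence here) contribute to no attribute score, which mirrors the rule that a sentence is used only if at least one word relates to an attribute.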
4.4. Feature importance comparisons
For robustness check of the results obtained in the explanatory models, we further analyzed the predictive models to get the feature importance of various emotions and aspect wise sentiments for various subcategories. The supplementary file has detailed values of the feature
importance scores, expressed in percentage terms where the total of the
feature importance of all emotions and aspect wise sentiments is 100.
As per the results, joy, anger and disgust are the most important emotions, with anger playing a very important role for cosmetics and disgust for eye-treatment. Unlike the regression results, in the predictive models we find little feature importance of fear. Anticipation, sadness, trust and surprise are of less importance.
In terms of the service aspects, product, service and delivery are the most important aspects compared with the other four. Service plays a very important role in fitness and nutrition, while product plays a very important role in the beauty and wellness and cosmetics sub-categories. In general, price plays a small role in the beauty and wellness category. Time is a crucial aspect for eye treatment. Figs. 2 and 3 give the graphical representation of the above results.
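The percentage scale described in this section, where the feature importance of all emotions and aspect-wise sentiments totals 100, can be sketched as a simple normalization. The raw scores below are invented for illustration and are not values from the paper.

```python
def to_percent(raw_importance):
    """Rescale non-negative raw importance scores so that they sum to 100."""
    total = sum(raw_importance.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: 100.0 * value / total for name, value in raw_importance.items()}


# Hypothetical raw scores from some fitted model (assumption, not the paper's numbers).
raw = {"joy": 6.0, "anger": 3.0, "disgust": 1.0, "product": 6.0, "service": 4.0}
percent = to_percent(raw)

# Rank features from most to least important.
ranking = sorted(percent, key=percent.get, reverse=True)
print(percent)
print(ranking)
```

Because every score is divided by the same total, the ranking of features is unchanged; only the scale becomes comparable across sub-categories.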
4.2. Explanatory models
We have used linear regression analysis for finding the explanatory
power of the insights generated from the review text. This is done in line
with extant literature (Chatterjee, 2019; Siering et al., 2018). However, we have also included ordered logistic models, because we expect non-linear relationships and because the dependent variable is a categorical rating (Chatterjee & Mandal, 2020). We analyzed the data as a whole and sub-category wise. For the overall analysis, we only used the sentiment and emotion scores from the whole text and the title sentiment. For the sub-category wise analysis, we considered attribute-wise sentiments along with the variables described above.
The result of the overall analysis suggests that the sentiment of the title and the body best reflects consumer satisfaction. Among emotions, anger and fear have very strong negative effects on satisfaction, while joy has a strong positive effect. Uncertain emotions such as anticipation and surprise, though positive in valence, have a negative relationship with overall satisfaction. The effects of other emotions, though statistically significant, are very small. The result supports H2a and H2b.
When we compare the sub-categories, the above-mentioned impact of the overall sentiment of the title and the body along with the emotions holds true, thus further supporting H2a and H2b. Some emotions specifically associated with some sub-categories include disgust with drugs and pharmacy, eye-treatment and beauty and wellness, and sadness with skincare and eye-treatment. In terms of the ecommerce attributes, product assortment, services available and delivery are found to be most important for most of the sub-categories. Time aspects are most important for eye-treatment along with drugs and pharmacy. Price is important for cosmetics, beauty and wellness along with drugs and pharmacy; however, as per the ordered logistic regression it has no relationship with customer satisfaction for any category. Equipment and facility are important for the fitness, drugs and skincare sub-categories. The above results suggest that aspect-wise sentiment can reflect customer satisfaction, thus supporting H1. However, the relative importance of overall sentiment, aspect-wise sentiments and emotions expressed in the textual review varies depending on the type of healthcare/health-product ecommerce, which
supports H3.
The models were free from multi-collinearity and heteroscedasticity
issues. Table 2 summarizes the models.
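To make the ordered logistic model concrete, the sketch below turns a linear predictor and four cut-points (the 1|2, 2|3, 3|4 and 4|5 thresholds of a five-level rating) into rating probabilities. It assumes the common cumulative-logit convention P(rating ≤ k) = logistic(c_k − η); the paper does not state its parameterization, and the cut-points and predictor values used here are made up.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def rating_probabilities(linear_predictor, cutpoints):
    """Return P(rating = k) for k = 1..len(cutpoints)+1.

    Assumes the cumulative-logit form P(rating <= k) = logistic(c_k - eta),
    with cut-points sorted in increasing order.
    """
    cumulative = [logistic(c - linear_predictor) for c in cutpoints] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical cut-points and predictors (assumptions for illustration only).
cutpoints = [-2.0, -1.0, 0.0, 1.0]
neutral = rating_probabilities(0.0, cutpoints)
positive = rating_probabilities(1.0, cutpoints)  # e.g. a review with a more positive sentiment
print([round(p, 3) for p in neutral])
print([round(p, 3) for p in positive])
```

Under this convention a larger linear predictor shifts probability mass towards higher ratings, which is how a positive coefficient on a sentiment score would be read.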
5. Discussions
CSAT, an attitudinal aspect of consumer outcome, is of paramount importance to organizations looking for a metric to assess consumer outcomes (Chang, 2015; Söderlund, 1998). Herein, based on user-generated information, we elaborate upon the antecedents of CSAT, specifically in healthcare/health-product ecommerce. We used textual qualitative reviews and, through text mining along with natural language processing techniques, attempted to derive insights from them. The sentiments and emotions expressed in a textual review for the overall service constitute our first salient finding. Moreover, we have also found the sentiments expressed under specific service attributes, based on keywords and bag of words. By and large, these insights form part of qualitative reviews, through which we have sought to explain consumer outcomes. We also explored how the relationships explained above tend to vary across the multitude of business models.
Based on the regression results and the feature importance scores, we can conclude that both the C&A service attributes play a very important role across the ‘types of ecommerce firms’, especially in terms of reflecting and predicting CSAT. This can be explained using the MPAA model, in which a consumer uses various pathways while building an attitude (Cohen & Reed, 2006). This includes both personal characteristics of consumers (inside-out) and multiple attributes of services (outside-in). Therefore, a combination of multiple internal and external forces leads to customer satisfaction, as supported by the results. The core attributes of ecommerce, i.e., product assortment, services available and delivery, are most important. This is expected as per the AD model, as the core attributes are more accessible and diagnostic (Vaidyanathan, 2000). The relative importance of product is higher for beauty and wellness as well as cosmetics, while that of services is higher for fitness and nutrition. This is along expected lines, because the sub-category of beauty and cosmetics is heavily product-centric, whereby it is product performance and product quality that essentially lead to CSAT. On the other hand, service success in fitness and nutrition often depends on consumers’ motivation and discipline, which may be improved by services provided by ecommerce firms. Thus, service plays an important role in nutrition and fitness. Among the augmented aspects, time gets higher importance for eye-treatment along with drugs and pharmacy; for these two sub-categories, on-time delivery and on-time service are crucial. Price does have some importance for cosmetics, beauty and wellness along with drugs and pharmacy, as often such sub-categories are
dependent on multiple and regular usage of products. Equipment and fa
4.3. Predictive models
We also examined the predictive power of the overall sentiment and emotion scores and the aspect-wise sentiment scores in predicting overall satisfaction. We used a 100-fold validation method to check the out-of-sample validity of the predictive models. For the analysis we used Linear Regression, XGBoost, Random Forest and Decision Tree (CART) as the methods. All these methods are commonly used for comparative analysis of the explanatory power of machine learning models. As per Table 3, while XGBoost and Random Forest show better predictive power in terms of lower root mean square error (RMSE), the linear regression model performs almost equally well. Thus, we can use the linear regression model for predictive analysis too, the advantage being that the regression model is theoretically explainable.
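The out-of-sample check can be sketched as k-fold cross-validation scored by RMSE. The example below is a simplified stand-in, not the authors' pipeline: it fits a one-variable least-squares line in pure Python, pools the held-out squared errors across folds (one of several reasonable ways to aggregate), and uses synthetic data. Passing `k=100` would correspond to the 100-fold setting mentioned above.

```python
import math
import random


def fit_simple_ols(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def kfold_rmse(xs, ys, k, seed=0):
    """Pooled out-of-sample RMSE over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    squared_errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        intercept, slope = fit_simple_ols([xs[i] for i in train],
                                          [ys[i] for i in train])
        squared_errors.extend((ys[i] - (intercept + slope * xs[i])) ** 2
                              for i in fold)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Synthetic data (assumption): sentiment in [-1, 1], rating exactly linear in it.
sentiment = [i / 10 - 1 for i in range(21)]
rating = [3 + 1.5 * s for s in sentiment]
print(kfold_rmse(sentiment, rating, k=5))

# Adding noise to the ratings gives a non-trivial RMSE.
noise = random.Random(1)
noisy_rating = [r + noise.uniform(-0.5, 0.5) for r in rating]
print(kfold_rmse(sentiment, noisy_rating, k=5))
```

Comparing such RMSE values across model families is the kind of comparison that Table 3 reports.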
Table 2
Explanatory Models.
Model
Regression
Variables
Overall
Nutrition and
Fitness
Beauty and
Wellness
Drugs and
Pharmacy
Skincare
Eyetreatment
Cosmetics
AdjR2
AIC
(Intercept)
Overall Sentiment
Title
Overall Sentiment
Body
Anger
Anticipation
Disgust
0.2913
0.3451
0.2843
0.2234
0.3647
0.3351
0.4056
4.01***
0.63***
3.76***
0.79***
3.99***
0.62***
4.25***
0.44***
4.11***
0.5***
3.68***
0.71***
3.86***
0.67***
0.86***
0.95***
0.79***
0.58***
0.67***
1.03***
0.89***
Fear
Joy
Sadness
Surprise
Trust
service
product
delivery
price
time
facility
equipment
1|2
2|3
3|4
4|5
Ordered Logistic Regression
Overall
Nutrition and
Fitness
Beauty and
Wellness
Drugs and
Pharmacy
Skincare
Eyetreatment
Cosmetics
252,654
62,557
47,562
18,218
15,746
15,499
19,227
1.53***
1.62***
1.60***
1.32***
1.26***
1.57***
1.46***
2.83***
2.90***
2.75***
2.44***
2.33***
2.77***
2.67***
−0.23***
−0.13***
−0.20*
−0.49***
−0.14***
−0.09
(NS)
0.05 (NS)
−0.28***
−0.06***
−0.08***
−0.29***
−0.07***
−0.05***
−0.29***
−0.07***
0.03*
−0.2***
−0.06 ***
−0.19 ***
−0.25***
−0.08***
−0.1***
−0.14***
−0.05***
−0.27***
−0.33***
−0.04***
−0.11***
−0.41***
−0.16***
−0.03**
−0.43***
−0.13***
0.02 (NS)
−0.46***
−0.18***
−0.12***
−0.23**
−0.22***
−0.26**
−0.13***
−0.15***
−0.09***
−0.15 ***
0.04*
−0.04*
−0.25***
−0.25***
−0.19***
−0.28***
0.12***
−0.08***
−0.08***
−0.01
(NS)
0.12***
−0.03***
−0.06***
0.03***
0.12***
−0.03***
−0.08***
−0.01 (NS)
0.1***
−0.06 ***
−0.11 ***
−0.01 (NS)
0.17***
−0.04*
−0.19***
−0.04**
0.31***
−0.05(NS)
−0.31***
−0.01 (NS)
0.36***
−0.11*
−0.47***
−0.00 (NS)
0.11***
0.09***
0.13**
0.11***
−0.07***
−0.08***
−0.01
(NS)
0.14***
0.26***
−0.13***
−0.26***
−0.01(NS)
0.07***
0.13***
−0.18***
−0.09***
−0.01
(NS)
0.16***
−0.02
(NS)
0.14***
−0.12***
−0.12***
0.03*
0.33***
0.02 (NS)
0.51*
−0.40***
−0.24***
−0.08
(NS)
0.07
(NS)
0.32***
−0.27***
−0.34***
0.03
(NS)
0.62**
0.02 (NS)
0.14***
0.12***
0.12***
0.11*
0.12***
0.00 (NS)
0.27**
0.20 (NS)
0.32*
0.2***
0.15***
0.09*
0.31***
0.15***
0.17*
0.11 (NS)
−0.02 (NS)
0.74**
−0.02 (NS)
0.11***
0.08*
0.11
(NS)
−0.02
(NS)
−0.04
(NS)
−0.01
(NS)
0.01 (NS)
0.18***
−0.27**
0.00 (NS)
0.08 (NS)
0.04 (NS)
0.28***
0.05 (NS)
0.25***
NS = Not significant, * means p < 0.05 and *** are p < 0.0001.
−0.04 (NS) 1.03*** 1.37 (NS) −0.55*** −0.43 (NS) −0.36 (NS) −0.51* −2.32*** −1.69*** −0.98*** 0.01 (NS) −2.76*** −2.19*** −1.51*** −0.48*** −2.43*** −1.98*** −1.37*** −0.33*** 1.02*** −0.27** −2.37*** −1.72*** −1.05*** −0.08*** 0.73** −1.92*** −1.37*** −0.69*** 0.31*** −0.0 (NS) 0.33*** −0.14* −0.48*** −0.02 (NS) −0.01 (NS) −0.26 (NS) −0.11 (NS) −0.17 (NS) 3.37*** 0.23*** −0.11* −0.22** −0.00 (NS) 0.33* −1.94*** −1.26*** −0.53*** 0.43*** −2.18*** −1.55*** −0.85*** 0.09* 0.17 (NS) 0.19 (NS) 0.38 (NS)
Table 3
RMSE scores of predictive models.
Fitness & Nutrition, Beauty & wellness, Drugs & Pharmacy, Eye Treatment, Skincare, Cosmetic
Linear Regression, Xgboost, Random Forest, Decision tree
1.03 0.92 0.97 1.19 0.92 0.92 0.92 0.92 0.82 0.78 0.82 0.99 1.01 0.93 0.99 1.18 0.91 0.83 0.87 1.06 0.99 0.9 0.95 1.14
Fig 2. Average feature importance of emotions.
Fig 3. Average feature importance of service aspects.
cility, on the other hand, is important only for the nutrition and fitness sub-category, as they often work in an omni-channel mode, where brick and mortar facilities and online ecommerce work conjointly. Thus, we conclude that the feature importance of consumer sentiments towards C&A service attributes varies depending on the type of healthcare/health-product ecommerce. This finding is in line with previous researchers who focus on the relative importance of core vs. augmented service aspects (Byrd, Canziani, Hsieh, Debbage, & Sonmez, 2016; Ravald & Grönroos, 1996).
We affirm the explanatory and predictive power of consumer emotions based on the results of the regression models and predictive models. A consumer’s overall sentiment and title sentiment can reflect and predict his/her satisfaction the most. Further, it is therefore not surprising that the most important emotions are the higher-arousal ones: anger, fear, disgust and joy. This can be explained by the PA model, which suggests that the consumption emotions with higher arousal are more related to satisfaction (Ladhari, 2007). Specifically, disgust is more important in the sub-categories of drugs and pharmacy along with eye-treatment, while anger is more important for cosmetics. This is in line with consumer identity literature related to cosmetics usage (Fabricant & Gould, 1993). Consumers of cosmetics ecommerce often have an external locus of identity, i.e., they look for social acknowledgement. Therefore the purchase context is psychologically distant, and any service failure in such a context could lead to high arousal negative emotions such as anger (Davis, Gross, & Ochsner, 2011; Tatavarthy, Chatterjee, & Sharma, 2019). On the other hand, low arousal emotions such as anticipation, sadness and trust are of less importance, as supported by the PA model (Ladhari, 2007). Sadness does have some importance in the context of the skincare category, possibly because skincare is often a psychologically close context, related to a consumer’s identity (Lazar, 2011). Therefore, any service failure in skincare ecommerce may lead to low arousal negative emotions such as sadness (Davis et al., 2011).
Based on the results, we conclude that consumer sentiments and emotions can and do reflect as well as predict CSAT in the healthcare/health-product ecommerce industry. However, the feature importance of consumer emotions does tend to vary, depending on the type of healthcare/health-product ecommerce (del Bosque & San Martín, 2008).
5.1. Theoretical and methodological contribution
The paper has a number of theoretical and methodological contributions. Our first theoretical contribution is that the paper is a pioneering effort at exploring how the qualitative evaluations of the service aspects relate to CSAT, especially in the healthcare context (Siering et al., 2018). The importance of textual reviews has been extensively highlighted in extant literature thus far (Brill, 1995; Hotho et al., 2005); however, what has remained unexplored is the method of combining both qualitative and quantitative data (Siering et al., 2018). Though the contribution in the healthcare context is more applied in nature than a core theoretical contribution, the nuances of the healthcare and health-product ecommerce context are very important. Healthcare being a multi-faceted service context, the theoretical underpinnings of customer satisfaction can result in very different reflectors and predictors, as found in our study. Thus the current contribution is important and unique in the healthcare context.
On the other hand, the above is also a methodological contribution towards the literature that focuses on the study of ecommerce satisfaction. Studies of ecommerce satisfaction were mostly survey-based, with latent constructs like value, trust and service quality as the antecedents (Oh, 1999; Pappas et al., 2014; Szymanski & Hise, 2000; Taylor and Baker, 1994; Wang et al., 2019; Zeithaml et al., 2002). While some studies also included user generated content to study e-commerce satisfaction and tried to find the influence of pre- and post-purchase attributes, they mostly relied on quantitative data obtained from review websites (Dholakia & Zhao, 2010; Posselt & Gerstner, 2005; You et al., 2016). This is a pioneering study that focuses on ecommerce satisfaction using both quantitative and qualitative information from user generated content, thus contributing to extant literature (Dholakia & Zhao, 2010; Posselt & Gerstner, 2005; You et al., 2016).
Our second important contribution lies in the fact that, in extant literature dealing with textual reviews, major importance has been given to overall or aspect-wise sentiments (Salehan & Kim, 2016; Siering et al., 2018; Ye et al., 2009); the usage of textual emotion scores has been limited thus far (Ahani et al., 2019; Wang et al., 2019). Through this study, we have found the emotions from textual reviews in order to explore how they relate to consumer outcomes (Salehan & Kim, 2016; Ye et al., 2009). While sentiments of various core and augmented attributes are expressed in the text, the emotions are often related to the results and experience of the usage of the ecommerce platform and the healthcare product/service. Thus by including the emotion elements in our study, we also explore how experiential outcomes are related to customer satisfaction. The relationship of experiential emotions and customer satisfaction is supported by the PA model (Ladhari, 2007). Thus our study strengthens the above model.
Our third theoretical contribution is that we explored how the comparative importance of various service attribute-wise qualitative evaluations differs based on consumer outcomes in the healthcare service context (i.e. the subcategories of healthcare and wellbeing). Additionally, we also found that the importance of overall textual sentiment and textual emotions actually changes while trying to reflect CSAT under the healthcare service context. It is true that past studies have dealt with how the service context impacts consumer evaluations (Ekinci & Riley, 2003; Matzler et al., 2008; Xu, 2020; Xu, Benbasat, & Cenfetelli, 2013), but when it comes to the healthcare/health-product ecommerce context, such studies are almost non-existent. The health oriented ecommerce industry also consists of various types of ecommerce, including generic ecommerce and more niche ecommerce. This varying context will lead to differential relative importance of C&A services vis-à-vis the consumer emotions in reflecting and predicting consumer satisfaction. Specific knowledge of such feature importance would help managers to create marketing plans focused on their own industry, while helping them manage customer reviews better. Therefore, this is a pioneering study also in the context of healthcare CSAT literature.
We found that both core and augmented service attributes play crucial roles in reflecting and predicting CSAT in the healthcare/health-product ecommerce context. The above finding strengthens the MPAA model with additional evidence of the validity of the model (Cohen & Reed, 2006). As per the MPAA model, attitude formation happens based on inside-out and outside-in pathways, and multiple internal variables (sentiment and emotions) and external variables (evaluations of service aspects) contribute towards satisfaction building. The current study confirms the same and contributes towards the literature on the usage of the MPAA model in consumer behavior (Hasford & Farmer, 2016; Lynch, 2006).
5.2. Managerial implications
As regards managerial implications, the primary implication lies in service design, which is often a nontrivial decision for healthcare/health-product ecommerce firms, given that healthcare is often an amalgamation of multiple service aspects whose importance may still not be known completely. We chose to focus on both C&A service aspects. The study shows that the former has higher importance than the latter. However, the relative importance of such aspects varies depending on the healthcare/health-product ecommerce service contexts, defined as the subcategories of the healthcare/health-product ecommerce industry. Therefore, the prioritization and resource allocation decisions during the service designing process should consider the above. When there is a resource allocation problem, ecommerce firms can use the regression models as objective functions and take investment decisions for each service aspect that will improve CSAT.
Our study also gives a comparative analysis of the predictive models based on econometrics and machine learning, and suggests that the econometric models work equally well in comparison to the most common machine learning models. Moreover, the information generated from the qualitative review can also be used to predict CSAT. Thus the study gives an automated system which can easily find the reflectors of CSAT in various service contexts, giving suggestions on where a healthcare/health-product ecommerce firm should focus. Ecommerce firms crunch huge sets of data, and automated predictive models suggesting potential service designs and handling consumer reviews via automated review management systems are important. This methodology of predictive machine learning models clubbed with text mining can extract relevant information from the text automatically and can find the relative importance of such information in predicting customer satisfaction, thus giving important marketing information to managers in a dynamic, ever-changing world.
Finally, the study suggested that both sentiment and emotions have explanatory power while reflecting CSAT. We further suggest that such relative explanatory power varies depending on what type of healthcare/health-product ecommerce we are studying. During service failures, ecommerce contexts that are closer to self-identity (such as skincare) will result in low arousal emotions such as sadness, while ecommerce contexts that are closer to social-identity (such as cosmetics) will result in high arousal emotions such as anger.
This understanding is important for healthcare/health-product ecommerce service managers in service design, more specifically service recovery and customer relationship management strategy. For instance, based on the above understanding skincare ecommerce firms will focus on ensuring low arousal positive emotions (trust) via their recovery strategy while cosmetics ecommerce will try to induce high arousal positive emotions (joy). Therefore, the communication content and the recovery measures should also be designed accordingly. PR OO F Asghar, M.Z., Khan, A., Ahmad, S., & Kundi, F.M. (2014). A review of feature extraction in sentiment analysis. Journal of Basic and Applied Scientific Research, 4(3), 181–186. Brill, E. (1995). Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging. Computational linguistics, 21(4), 543–565. Byrd, E.T., Canziani, B., Hsieh, Y.C.J., Debbage, K., & Sonmez, S. (2016). Wine tourism: Motivating visitors through core and supplementary services. Tourism Management, 52, 19–29. doi:10.1016/j.tourman.2015.06.009. Cavanaugh, L.A., MacInnis, D.J., & Weiss, A.M. (2016). Perceptual dimensions differentiate emotions. Cognition and Emotion, 30(8), 1430–1445. doi:10.1080/ 02699931.2015.1070119. Chang, K.C. (2015). How travel agency reputation creates recommendation behavior. Industrial Management & Data Systems, 115(2), 332–352. doi:10.1080/ 15256480802557283. Chatterjee, S. (2019). Explaining customer ratings and recommendations by combining qualitative and quantitative user generated contents. Decision Support Systems, 119, 14–22. doi:10.1016/j.dss.2019.02.008. Chatterjee, S., & Mandal, P. (2020). Traveler preferences from online reviews: Role of travel goals, class and culture. Tourism Management, 80, 104108. doi:10.1016/ j.tourman.2020.104108. Cheung, C.M., & Lee, M.K. (2012). What drives consumers to spread electronic word of mouth in online consumer-opinion platforms. 
Decision Support Systems, 53(1), 218–225. doi:10.1016/j.dss.2012.01.015. Chevalier, J.A., & Mayzlin, D. (2006). The effect of word of mouth on sales: Online book reviews. Journal of Marketing Research, 43(3), 345–354. doi:10.1509/jmkr.43.3.345. Cohen, J.B., & Reed, A. (2006). A multiple pathway anchoring and adjustment (MPAA) model of attitude generation and recruitment. Journal of Consumer Research, 33(1), 1–15. doi:10.1086/504121. Dang, Y., Zhang, Y., & Chen, H. (2010). A lexicon-enhanced method for sentiment classification: An experiment on online product reviews. IEEE Intelligent Systems, 25(4), 46–53. doi:10.1109/MIS.2009.105. Davis, J.I., Gross, J.J., & Ochsner, K.N. (2011). Psychological distance and emotional experience: What you see is what you get. Emotion, 11(2), 438. doi:10.1037/ a0021783. del Bosque, I.R., & San Martín, H. (2008). Tourist satisfaction a cognitive-affective model. Annals of Tourism Research, 35(2), 551–573. doi:10.1016/j.annals.2008.02.006. Dholakia, R.R., & Zhao, M. (2010). Effects of online store attributes on customer satisfaction and repurchase intentions. International Journal of Retail & Distribution Management, 38(7), 482–496. doi:10.1108/09590551011052098. Duan, W., Gu, B., & Whinston, A.B. (2008). The dynamics of online word-of-mouth and product sales—An empirical investigation of the movie industry. Journal of Retailing, 84(2), 233–242. doi:10.1016/j.jretai.2008.04.005. Ekinci, Y., & Riley, M. (2003). An investigation of self-concept: Actual and ideal self-congruence compared in the context of service evaluation. Journal of Retailing and Consumer Services, 10(4), 201–214. doi:10.1016/S0969-6989(02)00008-5. Fabricant, S.M., & Gould, S.J. (1993). Women’s makeup careers: An interpretive study of color cosmetic use and “face value”. Psychology & Marketing, 10(6), 531–548. doi:10.1002/mar.4220100606. Feldman, R., & Sanger, J. (2007). The text mining handbook: Advanced approaches in analyzing unstructured data. 
Cambridge University Press. Feldman, R. (2013). Techniques and applications for sentiment analysis. Communications of the ACM, 56(4), 82–89. doi:10.1145/2436256.2436274. Filieri, R. (2016). What makes an online consumer review trustworthy? Annals of Tourism Research, 58, 46–64. doi:10.1016/j.annals.2015.12.019. Garbarino, E., & Johnson, M.S. (1999). The different roles of satisfaction, trust, and commitment in customer relationships. Journal of Marketing, 63(2), 70–87. doi:10.1177/002224299906300205. Grewal, R., Chandrashekaran, M., & Citrin, A.V. (2010). Customer satisfaction heterogeneity and shareholder value. Journal of Marketing Research, 47(4), 612–626. doi:10.1509/jmkr.47.4.612. Hasford, J., & Farmer, A. (2016). Responsible you, despicable me: Contrasting competitor inferences from socially responsible behavior. Journal of Business Research, 69(3), 1234–1241. doi:10.1016/j.jbusres.2015.09.009. Hennig-Thurau, T., Gwinner, K.P., Walsh, G., & Gremler, D.D. (2004). Electronic word-of-mouth via consumer-opinion platforms: What motivates consumers to articulate themselves on the internet? Journal of Interactive Marketing, 18(1), 38–52. doi:10.1002/dir.10073. Ho-Dac, N.N., Carson, S.J., & Moore, W.L. (2013). The effects of positive and negative online customer reviews: Do brand strength and category maturity matter? Journal of Marketing, 77(6), 37–53. doi:10.1509/jm.11.0011. Hotho, A., Nürnberger, A., & Paaß, G. (2005). A brief survey of text mining. Ldv Forum, 20(1), 19–62. Ladhari, R. (2007). The effect of consumption emotions on satisfaction and word-of-mouth communications. Psychology & Marketing, 24(12), 1085–1108. doi:10.1002/ mar.20195. Laros, F.J., & Steenkamp, J.B.E. (2005). Emotions in consumer behavior: A hierarchical approach. Journal of Business Research, 58(10), 1437–1445. doi:10.1016/ j.jbusres.2003.09.013. Lynch, J.G., Jr. (2006). Accessibility-diagnosticity and the multiple pathway anchoring and adjustment model. 
Journal of Consumer Research, 33(1), 25–27. doi:10.1086/ 504129. Lazar, M.M. (2011). The right to be beautiful: Postfeminist identity and consumer beauty advertising. New femininities (pp. 37–51). London: Palgrave Macmillan. 5.3. Limitations and future scope UN CO RR EC TE D We have not studied the psychological mechanism that creates consumer attitude based on the sentiments and emotions felt by the consumer. Future research can bring in textual reviews in psychological experiments to give better clarity on this aspect. How such mechanism can lead to differential importance of different C&A attributes along with different sentiment and emotions in various healthcare/health-product ecommerce context should also be explored. Other variables, such as cultural and socio economic background of the consumers may also have an impact on the above mechanism. We could not study the same due to lack of data which can be studied by future researchers. The results can also be expanded in other healthcare/health-product ecommerce contexts, more so in Omni-channel contexts which is not done in the current study and can be explored in future. Possible bandwagon behavior in terms of providing incongruous online reviews can also be explored in the context of healthcare (Cheung & Lee, 2012). The bandwagon effect in the context of a healthcare product such as cosmetics may be high but such effect may not be present in eye-care, depending on how sensitive eye-care is to a customer in comparison to cosmetics. Future researchers can focus on the same. While using online reviews makes the data collection and information generation easier, one must keep in mind often the online reviews may not be a true representative of the customer sample. The demographic and psychographics of the consumers do drive their willingness to put reviews on online review channels (Manner, 2017). 
Therefore, while large dataset reduce the impact of bias, as is the case in our study, future researchers may try to overcome this limitation by including multiple review channels or combining both survey based and online review based findings. Appendix A. Supplementary material Supplementary data to this article can be found online at https://doi. org/10.1016/j.jbusres.2020.10.043. References Ahani, A., Nilashi, M., Yadegaridehkordi, E., Sanzogni, L., Tarik, A.R., Knox, K., … Ibrahim, O. (2019). Revealing customers’ satisfaction and preferences through online review analysis: The case of Canary Islands hotels. Journal of Retailing & Consumer Services, 51, 331–343. doi:10.1016/j.jretconser.2019.06.014. Ajorlou, A., Jadbabaie, A., & Kakhbod, A. (2016). Dynamic pricing in social networks: The word-of-mouth effect. Management Science, 64(2), 971–979. doi:10.1287/ mnsc.2016.2657. Anderson, E.W., Fornell, C., & Lehmann, D.R. (1994). Customer satisfaction, market share, and profitability: Findings from Sweden. The Journal of marketing, 53–66. doi:10.2307/1252310. Anderson, E.W., & Sullivan, M.W. (1993). The antecedents and consequences of customer satisfaction for firms. Marketing Science, 12(2), 125–143. doi:10.1287/ mksc.12.2.125. 10 S. Chatterjee et al. Journal of Business Research xxx (xxxx) xxx-xxx Taboada, M., Brooke, J., Tofiloski, M., Voll, K., & Stede, M. (2011). Lexicon-based methods for sentiment analysis. Computational Linguistics, 37(2), 267–307. doi:10.1162/ COLI_a_00049. Vaidyanathan, R. (2000). The role of brand familiarity in internal reference price formation: An accessibility-diagnosticity perspective. Journal of Business and Psychology, 14(4), 605–624. doi:10.1023/A:1022942330911. Vijayarani, S., Ilamathi, M.J., & Nithya, M. (2015). Preprocessing techniques for text mining-an overview. International Journal of Computer Science & Communication Networks, 5(1), 7–16. Wang, W.M., Tian, Z.G., Li, Z., Wang, J.W., Vatankhah Barenji, A., & Cheng, M.N. (2019). 
Supporting the construction of affective product taxonomies from online customer reviews: An affective-semantic approach. Journal of Engineering Design, 30(10–12), 445–476. doi:10.1080/09544828.2019.1642460. Westbrook, R.A., & Oliver, R.L. (1991). The dimensionality of consumption emotion patterns and consumer satisfaction. Journal of consumer research, 18(1), 84–91. doi:10.1086/209243. Xu, J.D., Benbasat, I., & Cenfetelli, R.T. (2013). Integrating service quality with system and information quality: An empirical test in the e-service context. MIS Quarterly, 37(3), 777–794. doi:10.25300/MISQ/2013/37.3.05. Xu, X. (2020). Examining an asymmetric effect between online customer reviews emphasis and overall satisfaction determinants. Journal of Business Research, 106, 196–210. doi:10.1016/j.jbusres.2018.07.022. Ye, Q., Zhang, Z., & Law, R. (2009). Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Expert systems with applications, 36(3), 6527–6535. doi:10.1016/j.eswa.2008.07.035. You, Y., Bhatnagar, A., & Ghose, S. (2016). Customer satisfaction with E-Retailers: The role of product type in the relative importance of attributes. Journal of Internet Commerce, 15(3), 274–291. doi:10.1080/15332861.2016.1212314. Zeithaml, V.A., Parasuraman, A., & Malhotra, A. (2002). Service quality delivery through web sites: A critical review of extant knowledge. Journal of the Academy of Marketing Science, 30(Fall), 362–375. UN CO RR EC TE D PR OO F Manner, C.K. (2017). Who posts online customer reviews? The role of sociodemographics and personality traits. Journal of Consumer Satisfaction, Dissatisfaction and Complaining Behavior, 30, 23. Martensen, A., Gronholdt, L., & Kristensen, K. (2000). The drivers of customer satisfaction and loyalty: Cross-industry findings from Denmark. Total Quality Management, 11(4–6), 544–553. doi:10.1080/09544120050007878. Mohammad, S.M., & Turney, P.D. (2013). 
Crowdsourcing a word–emotion association lexicon. Computational Intelligence, 29(3), 436–465. doi:10.1111/j.1467-8640.2012.00460.x.
Mostafa, M.M. (2013). More than words: Social networks’ text mining for consumer brand sentiments. Expert Systems with Applications, 40(10), 4241–4251. doi:10.1016/j.eswa.2013.01.019.
Mouwen, A. (2015). Drivers of customer satisfaction with public transport services. Transportation Research Part A: Policy and Practice, 78, 1–20. doi:10.1016/j.tra.2015.05.005.
Mowen, J.C., Park, S., & Zablah, A. (2007). Toward a theory of motivation and personality with application to word-of-mouth communications. Journal of Business Research, 60(6), 590–596. doi:10.1016/j.jbusres.2006.06.007.
Matzler, K., Füller, J., Renzl, B., Herting, S., & Späth, S. (2008). Customer satisfaction with Alpine ski areas: The moderating effects of personal, situational, and product factors. Journal of Travel Research, 46(4), 403–413. doi:10.1177/0047287507312401.
Ng, J.H., & Luk, B.H. (2019). Patient satisfaction: Concept analysis in the healthcare context. Patient Education and Counseling, 102(4), 790–796. doi:10.1016/j.pec.2018.11.013.
Nguyen, D.Q., Nguyen, D.Q., Pham, D.D., & Pham, S.B. (2016). A robust transformation-based learning approach using ripple down rules for part-of-speech tagging. AI Communications, 29(3), 409–422. doi:10.3233/AIC-150698.
Oh, H. (1999). Service quality, customer satisfaction, and customer value: A holistic perspective. International Journal of Hospitality Management, 18(1), 67–82. doi:10.1016/S0278-4319(98)00047-4.
Pappas, I.O., Pateli, A.G., Giannakos, M.N., & Chrissikopoulos, V. (2014). Moderating effects of online shopping experience on customer satisfaction and repurchase intentions. International Journal of Retail & Distribution Management, 42(3), 187–204. doi:10.1108/IJRDM-03-2012-0034.
Park, J.H., Gu, B., Leung, A.C.M., & Konana, P. (2014).
An investigation of information sharing and seeking behaviors in online investment communities. Computers in Human Behavior, 31, 1–12. doi:10.1016/j.chb.2013.10.002.
Popescu, A.M., & Etzioni, O. (2007). Extracting product features and opinions from reviews. Natural language processing and text mining (pp. 9–28). London: Springer.
Posselt, T., & Gerstner, E. (2005). Pre-sale vs. post-sale e-satisfaction: Impact on repurchase intention and overall satisfaction. Journal of Interactive Marketing, 19(4), 35–47. doi:10.1002/dir.20048.
Ravald, A., & Grönroos, C. (1996). The value concept and relationship marketing. European Journal of Marketing, 30(2), 19–30. doi:10.1108/03090569610106626.
Salehan, M., & Kim, D.J. (2016). Predicting the performance of online consumer reviews: A sentiment mining approach to big data analytics. Decision Support Systems, 81, 30–40. doi:10.1016/j.dss.2015.10.006.
Salton, G., & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing & Management, 24(5), 513–523. doi:10.1016/0306-4573(88)90021-0.
Sandars, J., & Walsh, K. (2009). The use of online word of mouth opinion in online learning: A questionnaire survey. Medical Teacher, 31(4), 325–327. doi:10.1080/01421590802204403.
Sharp, J. (2011). Brand awareness and engagement: A case study in healthcare social media. Frontiers of Health Services Management, 28(2), 29–33.
Siering, M., Deokar, A.V., & Janze, C. (2018). Disentangling consumer recommendations: Explaining and predicting airline recommendations based on online reviews. Decision Support Systems, 107, 52–63. doi:10.1016/j.dss.2018.01.002.
Söderlund, M. (1998). Customer satisfaction and its consequences on customer behaviour revisited: The impact of different levels of satisfaction on word-of-mouth, feedback to the supplier and loyalty. International Journal of Service Industry Management, 9(2), 169–188. doi:10.1108/09564239810210532.
Szymanski, D.M., & Hise, R.T. (2000).
E-satisfaction: An initial examination. Journal of Retailing, 76(3), 309–322. doi:10.1016/S0022-4359(00)00035-X.
Tatavarthy, A.D., Chatterjee, S., & Sharma, P. (2019). Exploring the moderating role of construal levels on the impact of process versus outcome attributes on service evaluations. Journal of Service Theory and Practice, 30, 1–40. doi:10.1108/JSTP-10-2018-0229.

Biography

Dr. Swagato Chatterjee is a researcher, consultant, teacher and academician. He has over 7 years of experience in marketing, operations and analytics. He has worked with companies like Coca Cola, Times of India, Technosoft, Mitsubishi, Nomura, Yes Bank, CSC, Ernst and Young, and Genpact on various consultancy and training assignments related to analytics. He has published in reputed international journals such as Decision Support Systems, Tourism Management, International Journal of Hospitality Management, Journal of Business and Industrial Marketing, Journal of Consumer Marketing, Journal of Strategic Marketing, Journal of Indian Business Research, and Global Business Review, among others, and has presented at various national and international conferences. He holds a BTech from IIT Kharagpur and a PhD in marketing from IIM Bangalore. Currently he is an Assistant Professor in Vinod Gupta School of Management, IIT Kharagpur, in the area of marketing and analytics.

Divesh Goyal is currently a student at IIT Kharagpur pursuing his B.Tech. in Metallurgical and Materials Engineering and M.Tech. in Entrepreneurship Engineering. His interests lie in the fields of finance, marketing, and analytics. He has experience working in a startup as well as at a globally recognized B-School, IIM Udaipur. He is looking forward to working in the field of Big Data and Artificial Intelligence and wants to be an entrepreneur.

Atul Prakash is pursuing his M.Sc. in economics in the Department of Humanities and Social Sciences at IIT Kharagpur.
His research interests lie in the domain of microeconomics, behavioural finance, trade, and analytics. He has first-hand experience in the field of data analytics, which includes projects developing prediction and forecasting models. He looks forward to exploring the applications of modern analytic tools and methodologies in trade and finance.

Jiwan is an undergraduate at IIT Kharagpur pursuing his B.Tech. in Agricultural & Food Engineering and M.Tech. in Financial Engineering. His interests lie in portfolio optimization, quantitative finance, and analytics. He has worked on projects involving risk modelling, time series forecasting, and big data analytics. He is looking forward to exploring the applications of machine learning in marketing and finance.

Journal of Hospitality Marketing & Management
ISSN: 1936-8623 (Print) 1936-8631 (Online) Journal homepage: http://www.tandfonline.com/loi/whmm20

Understanding Satisfied and Dissatisfied Hotel Customers: Text Mining of Online Hotel Reviews
Katerina Berezina, Anil Bilgihan, Cihan Cobanoglu & Fevzi Okumus

To cite this article: Katerina Berezina, Anil Bilgihan, Cihan Cobanoglu & Fevzi Okumus (2015): Understanding Satisfied and Dissatisfied Hotel Customers: Text Mining of Online Hotel Reviews, Journal of Hospitality Marketing & Management, DOI: 10.1080/19368623.2015.983631
To link to this article: http://dx.doi.org/10.1080/19368623.2015.983631
Accepted online: 27 Feb 2015. Published online: 27 Feb 2015.
Article views: 190
Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=whmm20
Download by: [Universite Laval] Date: 24 September 2015, At: 21:57

Journal of Hospitality Marketing & Management, 00:1–24, 2015
Copyright © Taylor & Francis Group, LLC
ISSN: 1936-8623 print/1936-8631 online
DOI: 10.1080/19368623.2015.983631

Understanding Satisfied and Dissatisfied Hotel Customers: Text Mining of Online Hotel Reviews

KATERINA BEREZINA
College of Hospitality and Technology Leadership, University of South Florida Sarasota–Manatee, Sarasota, Florida, USA
ANIL BILGIHAN
College of Business, Florida Atlantic University, Boca Raton, Florida, USA
CIHAN COBANOGLU
College of Hospitality and Technology Leadership, University of South Florida Sarasota–Manatee, Sarasota, Florida, USA
FEVZI OKUMUS
Rosen College of Hospitality Management, University of Central Florida, Orlando, Florida, USA

This article aims to examine the underpinnings of satisfied and unsatisfied hotel customers. A text-mining approach was followed and online reviews by satisfied and dissatisfied customers were compared. Online reviews of 2,510 hotel guests were collected from TripAdvisor.com for Sarasota, Florida. The research findings revealed some common categories that are used in both positive and negative reviews, including place of business (e.g., hotel, restaurant, and club), room, furnishing, members, and sports. Study results further indicate that satisfied customers who are willing to recommend a hotel to others refer to intangible aspects of their hotel stay, such as staff members, more often than unsatisfied customers. On the other hand, dissatisfied customers mention more frequently the tangible aspects of the hotel stay, such as furnishing and finances.
The study offers clear theoretical and managerial implications pertaining to understanding of satisfied and dissatisfied customers through the use of text mining and hotel ratings via review websites, social media, blogs, and other online platforms.

KEYWORDS hotel reviews, text mining, user generated content, customer satisfaction, dissatisfaction

Address correspondence to Katerina Berezina, College of Hospitality and Technology Leadership, University of South Florida Sarasota–Manatee, 8350 N. Tamiami Trail, Sarasota, FL 34243, USA. E-mail: katerina@katerinaberezina.com
Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/whmm.

INTRODUCTION

Hotels operate in a competitive and dynamic environment (Verma, Victorino, Karniouchina, & Feickert, 2007; Wilkins, 2010). The challenges of running a hotel business are identified by the fragmentation and complexity of the lodging industry (Okumus, Altinay, & Chathoth, 2010). Aside from this, increasing commoditization of hotel products makes it more difficult for hotel companies to compete for their customers. Starkov and Price (2007) suggested that customers select hotels based on the following criteria: familiarity, brand image, implementation of customer retention programs, and core offering or value of the hotel. Given this, it is important to understand what makes customers return or not return to a hotel, what makes them recommend a hotel to their friends and relatives or not recommend it, what image a property/brand has, and what features create value for customers. Hotels employ different tools to assess and address customer satisfaction and behavioral intentions.
These tools may include placing comment cards in the guest rooms, employing service recovery techniques to address in-house service failures, distributing postdeparture guest satisfaction surveys, and introducing follow-up measures for those problems that could not be resolved in-house. Even though hotels dedicate efforts to assess and recover (if necessary) customer satisfaction, the problem presents itself in guests’ unwillingness to share their experiences and provide feedback to hotels. Previous research suggests that the majority of customers do not act on the dissatisfactory service that they receive and are reluctant to complain to the service provider (Ekiz & Au, 2011; Ekiz, Khoo-Lattimore, & Memarzadeh, 2012). Such reluctance to complain and provide feedback to hotels may take away an opportunity to perform service recovery and improve the service level in hotels. At the same time, it is important to note that the Internet makes it easier for customers to share their experience via review websites, social media, blogs, and other online platforms. The abundance of customer reviews posted on the Internet is available not only to hotel managers, but also to other consumers who may base their purchasing decisions on the information provided online (Dickinger & Mazanec, 2008). An emerging dependence on the Internet as the source of information for decision-making regarding tourism products strengthens the need for more research in the electronic reviews area (Sparks & Browning, 2011). It is important for hotel managers to utilize customer review information that is available for them online in order to better understand their customers and improve hotel performance. However, the online medium generates such a large volume of information that it may be difficult for the managers to review and evaluate all of it.
For this reason, this article undertakes the text-mining approach that allows for the extraction of meaningful patterns from large volumes of textual information (Lau, Lee, & Ho, 2005; Turban, Sharda, & Delen, 2010). Most of the previous studies rely on the overall ratings of hotels (e.g., Ramanathan & Ramanathan, 2011). This current research deploys customer recommendation, which is a stronger measure of customer experience. Detailed ratings carry more information about user preferences than single overall ratings alone (Jannach, Zanker, & Fuchs, 2014). Opinion mining captures the subjectivity in terms of the semantic orientation associated with the constituents of a text (Gräbner, Zanker, Fliedl, & Fuchs, 2012; Taboada, Brooke, Tofiloski, Voll, & Stede, 2011). In summary, this article aims to examine the underpinnings of satisfied and dissatisfied customers by applying the text-mining approach to the online hotel reviews. This will be achieved by comparing the online hotel reviews of satisfied customers who are willing to recommend the property to others and those of dissatisfied ones who do not recommend others to come to the property where they stayed. Study results should allow us to understand what aspects of amenities and services offered by hotels generate positive comments and what aspects generate negative ones.

REVIEW OF LITERATURE

Hotel Guest Satisfaction and Behavioral Intentions

Identifying satisfied and dissatisfied customers has been an important research theme among scholars from various disciplines including engineering, management, marketing, and hospitality (Chow & Zhang, 2008; Pizam & Ellis, 1999). The concept of guest satisfaction and dissatisfaction has been comprehensively examined by marketing and consumer behavior researchers. These postpurchase behaviors are acknowledged as of great importance to the firms due to their influence on repeat purchases and word-of-mouth (WOM) recommendations.
In a nutshell, satisfaction reinforces positive attitudes toward the brand and leads to a greater likelihood that the same brand will be purchased again. On the other hand, dissatisfaction may lead to negative brand attitudes and weaken the likelihood of buying the same brand again. One of the key approaches to answer the questions of customer satisfaction and potential future behavior is measuring service quality (Bharwani & Jauhari, 2013; Buttle, 1996; Crick & Spencer, 2011; Cronin & Taylor, 1992; Dortyol, Varinli, & Kitapci, 2014; Gummesson, 2014; Ladhari, 2012; Parasuraman, Zeithaml, & Berry, 1985; Prentice, 2013; Qu & Sit, 2007; Torres & Kline, 2013; Yee, Yeung, & Cheng, 2010). Service quality is a level of service delivery based on the customer perception (Zeithaml, Bitner, & Gremler, 2006). Perceived service quality is a part of a broader concept of customer satisfaction and behavioral intentions incorporating customer loyalty and WOM communications (Prasad, Wirtz, & Yu, 2014; Prentice, 2013). Hotel guests use a variety of elements to evaluate the quality of service that they receive during their stay (Pizam & Ellis, 1999; Wilkins, Merrilees, & Herington, 2007). Research indicates that customer satisfaction is affected by both tangible and intangible aspects of service quality (Ekinci, Dawes, & Massey, 2008; Prentice, 2013; Torres & Kline, 2013). The intangible elements are service related, such as assurance, customer service, and empathy, whereas tangible elements are related to the physical facilities of the hotel, such as appearance of hotel personnel and cleanliness of the room (Ramanathan & Ramanathan, 2011). It is claimed that service failure may have an impact on the perception of service quality, satisfaction and future behavioral intentions (Berezina, Cobanoglu, Miller, & Kwansa, 2012; Han & Back, 2007; Prentice, 2013; Tarn, 1999).
Therefore, the recognition of attributes that enhance customer satisfaction and ensure customer loyalty is important for hotels. Hoteliers aim to make customers satisfied and keep them coming back to their properties. It is cheaper to keep an existing hotel guest than to invest in finding new customers (Tyrrell & Woods, 2005). Furthermore, research indicates that increasing customer retention rates by 5% may result in profit increase by 25% to 95% (Reichheld & Schefter, 2000). Gefen (2002) points out that acquiring new customers is more expensive than keeping loyal ones, while serving loyal customers is cheaper than serving new customers. Besides, loyal customers spend more and frequently refer new customers to a supplier, providing another rich source of profits (Bowen & Shoemaker, 1998; Shoemaker & Lewis, 1999). The growth and penetration of the Internet expand the effect of referrals from loyal customers. However, dissatisfied customers may also be valuable for hotels. First, they may assist hotels by pointing out the problematic areas of hotel operations that may require careful attention and improvement (Harrison-Walker, 2001). Another reason for appreciating dissatisfied customers is the effects of the service recovery paradox. The service recovery paradox states that the customer satisfaction rate is even higher for those customers who have experienced service failure followed by service recovery than for those customers who received their service properly on the first time (Harrison-Walker, 2001; Hoffman & Bateson, 2010; Zeithaml et al., 2006). Literature supports the fact that service recovery strategies increase customer loyalty (Cranage & Mattila, 2005). However, if complaints are not addressed, it may result in dissatisfaction, low repeat-purchase levels, and negative WOM (Mattila & Mount, 2003). 
In order to avoid such Downloaded by [Universite Laval] at 21:57 24 September 2015 Text Mining of Hotel Reviews 5 negative consequences, Harrison-Walker (2001) suggested that companies should embrace customer complaints for their own benefit. HarrisonWalker recommended that companies develop necessary outlets for customers to complain, including website resources, call centers, and chatting options. At the same time, negative WOM could be harmful to companies (Bambauer-Sachse & Mangold, 2011). Customers are inclined to specifically seek negative reviews because negative information is considered as being more diagnostic and informative than positive or neutral information. Negativity is weighted more heavily in the decision making process than positive information (Herr, Kardes, & Kim, 1991). Negative WOM could deter potential customers from considering a particular product or brand, therefore, damaging the company’s reputation and financial strength (Sundaram, Mitra, & Webster, 1998). It could also go viral very quickly in today’s connected world and possibly diminish brand equity and image, reduce sales, and, in extreme cases, close businesses completely. Evaluating Customer Satisfaction on Web 2.0 Traditionally hoteliers and academics assess service quality quantitatively by using guest comment cards and questionnaires. However, the development of the Internet and consumer-generated content provides a strong opportunity for a qualitative approach to service quality. The development of the Internet has led to the shaping of the second generation of the Internet, which is referred to as Web 2.0. It is an expression that was used for the first time by O’Reilly in 2004. O’Reilly (2005) defined Web 2.0 as “a set of economic, social, and technology trends that collectively form the basis for the next generation of the Internet—a more mature, distinctive medium characterized by user participation, openness, and network effects” (p.1). 
The technology that is referred to as the second-generation Internet (Web 2.0) is one that usually includes tools that allow people to collaborate and share information online. Examples of these include, but are not limited to, social networking, instant messaging, social bookmarking, mash-ups, blogs, virtual worlds, podcasts, web videos, and wikis (Kasavana, Nusair, & Teodosic, 2010). The most developed area of Web 2.0 within travel is consumer reviews (O’Connor, 2008; Litvin, Goldsmith, & Pan, 2008; Nusair, Bilgihan, & Okumus, 2013). Examples of travel review websites include Expedia and TripAdvisor. A Pew Internet & American Life Project study (2006) reports that searching for travel-related information is one of the most popular online activities. Research indicates that people utilize online travel referrals for travel planning (Cox, Burgess, Sellitto, & Buultjens, 2009; Mackay, McVetty, & Vogt, 2005; Litvin et al., 2008; Nusair, Bilgihan, Okumus, & Cobanoglu, 2013; Stringam & Gerdes, 2010). Furthermore, over 5 million travelers a month visit VirtualTourist.com in order to seek travel reviews and tips (Lee & Gretzel, 2006); approximately 20 million people visit TripAdvisor to read other travelers’ reviews every month (Yoo, Lee, & Gretzel, 2007). Recommendations provided by other consumers based on their tourism experiences are suggested to be not only the most preferred sources of travel information, but also the most influential sources for travel decision-making (Pan, MacLaurin, & Crotts, 2007). Online consumer reviews empower guests by allowing them to access “more accurate, up-to-date information about products” (Kucuk & Krishnamurthy, 2007).
Aside from customers, management could also potentially benefit from online comments to report service strengths and weaknesses, making them of considerable utility when studying customer relationship management (Cho, Im, & Hiltz, 2003). User-generated content creates opportunities for hotels to gain a better understanding of their guests (Barreda & Bilgihan, 2013). Literature suggests that hotel guest reviews are characterized by a growing importance and impact on the consumer decision-making process and hotel selection (Bulchand-Gidumal, Melián-González, & López-Valcárcel, 2011; O’Connor, 2008; Gretzel & Yoo, 2008; Xie, Miao, Kuo, & Lee, 2011). Results of previous studies suggest that approximately 90% of travelers find hotel reviews to be helpful (Gretzel & Yoo, 2008; Stringam, Gerdes, & Vanleeuwen, 2010). According to the 2010 Portrait of American Travelers (YPartnership/Harrison Group, 2010), the top preferred choices for finding travel information and prices include online travel agencies and review web sites such as Expedia (56%), Travelocity (52%), and Orbitz (46%). Product and service reviews are an increasingly important type of user-generated content as they provide a valuable source of information to help customers make good purchasing decisions. Previous research reveals that the influence of user-generated online reviews on online sales is significant, with a 10% increase in traveler review ratings boosting online bookings by more than five percent (Ye, Law, Gu, & Chen, 2011). Predictions suggest that online reviews influence more than US$10 billion in online travel purchases annually (Compete, 2007). The sphere of consumer-generated content was studied by surveying Internet users and investigating their opinions about hotel reviews (Gretzel & Yoo, 2008). Stringam et al.
(2010) conducted a study on hotel ratings that demonstrated the dominance of positive reviews in the online media (about 74% of the reviewers would recommend the property where they stayed to others). This study revealed a high positive correlation between service subcategory ratings, overall satisfaction, and intentions to recommend the hotel to others. Ekiz et al. (2012) investigated the online complaints in the luxury hotel context. They identified two main categories in online consumer complaints: room for improvement (physical attributes of the hotel room and the quality of the amenities provided in the room) and hotel staff attitudes (misbehaviors, bad attitude, lack of knowledge, skill, and passion of the staff). In the area of travel reviews, text mining has been utilized in order to classify pleasant reviews by satisfied customers and unpleasant reviews by dissatisfied customers (García-Barriocanal, Sicilia, & Korfiatis, 2010). These researchers utilize shallow natural language processing (NLP) in order to identify emotion-based review categories for reviews in Spanish. They suggest that hotel guest reviews can serve as a complementary source for hotel quality evaluation. Qualitative analysis of London hotels’ online reviews by O’Connor (2010) revealed the top 10 most common topics mentioned in the reviews to be the following: hotel location, room size, staff (good service), cleanliness, breakfast, in-room facilities, comfortable, temperature, dirty, and maintenance. Pekar and Ou (2008) deployed opinion mining and investigated the relationship between subjective expressions and references to hotel room features. However, they did not offer managerial implications from a services marketing aspect. Barreda and Bilgihan (2013) investigated the main themes that motivate guests to evaluate hotels on Web 2.0.
Their findings indicate hotel cleanliness as a common concern in guests’ expectations. Guests were found to be more likely to write positive reviews for hotels that are conveniently located to attractions, shopping, airports, and restaurants. Guests were also positively influenced by the quality of service received.

RESEARCH METHOD

The purpose of this research is to identify the patterns in hotel reviews regarding the aspects that make hotel guests satisfied with the hotel and inspire them to recommend the property to others, and, on the other hand, to find out about the negative patterns that cause guest dissatisfaction. Text mining was chosen as a research method for the purpose of this research based on the premise that this approach is capable of finding out meaningful patterns in the vast amount of information generated by hotel guests’ reviews (Lau et al., 2005; Turban et al., 2010). Text mining “explores data in text files to establish valuable patterns and rules that indicate trends and significant features about specific topics” (Lau et al., 2005, p. 345).

Sample and Data

Sarasota, Florida, in the United States was selected as a location of primary focus for this article. Sarasota is a popular, vibrant, and fast-growing destination. It has been recognized with awards such as Orbitz.com’s “Top 10 Fastest Growing Domestic Beach Destinations” in 2008 and TripAdvisor Traveler’s Choice Award of 2011. Its popularity keeps growing with the number of visitors each year. Visit Sarasota County has recorded 759,800 visitors staying in paid accommodations in 2010; 827,000 visitors in 2011; followed
by 894,100 and 941,400 in 2012 and 2013, respectively, the majority of whom come for leisure purposes. This location was selected as a destination that offers a variety of travel experiences, including beach vacations, business and meetings, art and heritage tourism, leisure and sport activities, medical tourism, and ecotourism (Sarasota Convention and Visitors Bureau, 2009). Due to the variety of tourism types developed in Sarasota, the city also offers a wide selection of different hotel properties, such as leisure/beach hotels, resorts, business/conference hotels, limited service, select service, and full-service properties. The types of accommodations offered include 47% condos, 31% hotels/motels, 7% apartments, 7% houses, 4% mobile homes, 2% campsites. At the same time, Sarasota is a relatively small destination compared to other popular travel destinations in the United States. This allowed researchers to collect all available reviews for Sarasota hotels while conducting the study. All available online reviews for Sarasota hotels were collected from TripAdvisor.com. The TripAdvisor website enables travelers to access information about hotels, flights, restaurants, vacation rental, cruises, and other travel products. Users can post comments, share trip ideas/pictures, and express their reviews on hotels, restaurants, and destinations. TripAdvisor contains more than 100 million travel-related reviews from travelers from all over the world. These reviews cover more than 2.5 million businesses, 116,000 destinations, and 1.1 million accommodations (TripAdvisor, 2013). TripAdvisor was selected for this study, as it is one of the largest repositories of travel-related reviews. The data for this study was collected using an online robot developed for the purposes of this research. A total of 2,510 reviews were recorded in an Excel file. The data file contains the following categories that represent the usual attributes of consumer reviews on TripAdvisor (see Table 1).
A list of all hotels that were included in this study with corresponding star ratings, type of the property, and the number of reviews is presented in Table 2 below. The reviews that were included in the study were mainly (84.87%) provided by hotel guests traveling for leisure or leisure-related purposes (e.g., quality time with family, romantic getaway, personal event, etc.). Business travelers accounted for 11.70% of all reviewers. Table 3 provides information about travelers’ purpose of the trip. These statistics are in line with Sarasota’s leisure-dominated market composition (Sarasota Convention and Visitors Bureau, 2009).

TABLE 1 Hotel review categories
Field: Explanation
Quote: It contains a title of the guest review and in most cases the overall feeling about the hotel
Hotel name: Name of the observed hotel
User name: Username of the reviewer
Contributions: Contributions contains the number of reviews posted by a particular user on TripAdvisor.com
Location: Location refers to the reviewer’s residence
Trip type: Trip type includes different categories: business; couples; family; couples, family getaway; friends getaway; solo travel
Comment: Contains the review body in it
Value, rooms, location, cleanliness, service, and sleep quality: These fields contain numerical values that guests gave as rating scores to each of the categories named above. The values range from 1 (terrible) to 5 (excellent).
Date of stay: The date that reviewer stayed in the hotel
Visit type: Visit type contains the following categories: business; hobbies/interest/culture; honeymoon; leisure; personal event; quality time with family; romantic getaway; and other.
Travelers
Age group
Member since: Refers to membership on TripAdvisor.com
Recommendation: Recommendation contains categories “Yes” or “No” and represents likelihood of recommending this hotel to others.

Internal Validity

In relation to internal validity, it is crucial for this research to divide the reviews correctly into positive and negative categories. The researchers assumed consistency in customer opinion about the hotel (expressed via hotel ratings and comments) and their recommendations for other travelers (Yes/No). For the purpose of checking the internal validity of the reviews, correlations of rating scores and recommendation scores were obtained. The results are presented in Table 4. The analysis revealed strong significant positive correlations for all variables except location, where the correlation was medium. After this, 20 reviews were randomly picked for the content analysis. The first two authors of this study read these reviews in order to double-check whether the content of those reviews really reflects the intention of the reviewer to recommend or not to recommend this hotel. The results of the internal validity check came out positive and the reviews were divided into two categories based on customer recommendations.

Modeling and Word Categorization

The text-mining approach using PASW Modeler was applied to the comment section of the document in order to identify patterns in guest comments about the hotel. The Text Analytics Module of PASW Modeler allows for conversion of unstructured data into a more structured one by means of extracting concepts and relationships found in textual information. Current research did not rely on the stance-shift analysis that considers syntax and
TABLE 2 Hotel reviews included in the study Downloaded by [Universite Laval] at 21:57 24 September 2015 Hotel name Lido Beach Resort The Ritz-Carlton, Sarasota Helmsley Sandcastle Hotel Hyatt Regency Sarasota Holiday Inn Sarasota–Lido Beach Southland Inn Hotel Ranola Hotel Indigo Sarasota Holiday Inn Express Sarasota–Siesta Key Area Best Western Midtown La Quinta Inn & Suites Sarasota Country Inn & Suites I-75 Hibiscus Suites Inn Tropical Breeze Resort & Spa Hyatt Place Sarasota/Bradenton Airport AmericInn Hotel & Suites of Sarasota Homewood Suites by Hilton Sarasota Coquina On The Beach Hilton Garden Inn Sarasota–Bradenton Airport SpringHill Suites Sarasota Bradenton Comfort Inn Sarasota Sleep Inn Golden Host Resort Holiday Inn Sarasota–Airport Residence Inn Sarasota Bradenton Comfort Inn Hampton Inn Sarasota–I-75 Bee Ridge Suntide Island Beach Club Holiday Inn Sarasota–Lakewood Ranch Quality Inn & Suites Airport Days Inn Sa...
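The internal validity check described in the text, correlating each rating score with the Yes/No recommendation, can be illustrated with a minimal sketch. The excerpt does not say which correlation coefficient the authors used or how they computed it, so this sketch assumes a Pearson correlation with the recommendation coded as 1 (Yes) or 0 (No). The `pearson` helper, the field names, and the six sample reviews are hypothetical and invented purely for illustration; they are not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical reviews (assumption: NOT taken from the study's dataset).
reviews = [
    {"service": 5, "location": 4, "recommend": "Yes"},
    {"service": 4, "location": 5, "recommend": "Yes"},
    {"service": 2, "location": 4, "recommend": "No"},
    {"service": 1, "location": 3, "recommend": "No"},
    {"service": 5, "location": 3, "recommend": "Yes"},
    {"service": 2, "location": 5, "recommend": "No"},
]

# Code the Yes/No recommendation as 1/0 so it can be correlated with ratings.
recommend = [1 if r["recommend"] == "Yes" else 0 for r in reviews]

# One correlation per rating field, mirroring the per-category check in the text.
correlations = {
    field: pearson([r[field] for r in reviews], recommend)
    for field in ("service", "location")
}

for field, r in correlations.items():
    print(f"{field}: r = {r:.2f}")
```

With a binary variable coded 0/1, the Pearson coefficient coincides with the point-biserial correlation, which is one common choice for this kind of rating-versus-recommendation check. In the invented sample, service ratings track the recommendation closely while location ratings do not, loosely echoing the pattern the authors report for location.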
