International Journal of Innovation, Management and Technology, Vol. 10, No. 5, October 2019

Marketing Intelligence from Data Mining Perspective — A

Literature Review
Nguyen Anh Khoa Dam, Thang Le Dinh, and William Menvielle

Abstract—The digital transformation enables enterprises to mine big data for marketing intelligence on markets, customers, products, and competitors. However, there is a lack of a comprehensive literature review on this issue. With an aim to support enterprises to accelerate the digital transformation and gain competitive advantages through exploiting marketing intelligence from big data, this paper examines the literature in the period from 2001 to 2018. Consequently, the 76 most relevant articles are analyzed based on four marketing intelligence components (Markets, Customers, Products, and Competitors) and six data mining models (Association, Classification, Clustering, Regression, Prediction, and Sequence Discovery). The findings of this study indicate that the research areas of product and customer intelligence receive the most research attention. This paper also provides a roadmap to guide future research on bridging marketing and information systems through the application of data mining to exploit marketing intelligence from big data.

Index Terms—Big data analytics, data mining, literature review, marketing intelligence.

Manuscript received April 20, 2019; revised September 10, 2019.
The authors are with the Marketing and Information Systems Department, Université du Québec à Trois-Rivières, Canada.

I. INTRODUCTION

The intelligence-based era has opened up incredible opportunities for enterprises to promote the digital transformation via new smart systems and services related to marketing intelligence from big data [1]. Described as extremely large data sets in terms of volume, velocity, variety, and veracity, big data is considered a great source of marketing intelligence [2], [3]. Marketing intelligence can be perceived as the process of applying data mining techniques to gather information on customers, competitors, markets, and industry, which is then applied to strategic marketing plans [4], [5]. Thus, it enables enterprises to accelerate the digital transformation through product innovation, customer identification, and market demand forecasting [6], [7].

Nowadays, marketing intelligence relies on big data and data mining techniques to gather information [7], [8]. However, challenges related to exploiting marketing intelligence from big data have also arisen. Firstly, enterprises find it challenging to identify relevant sources of data [2], [9], [10]. Secondly, enterprises face the challenge of classifying different components of marketing intelligence as well as their specific functions [8], [11]. Finally, even though the application of big data in marketing has emerged as a trendy topic in many studies, there seems to be a lack of focus on the technical side; for instance, data mining models and techniques to exploit big data [12]-[14]. This stimulates the research motivation and objective of this paper in identifying different data mining models and techniques to demystify marketing intelligence and its specific applications.

This study aims at presenting a literature review related to the application of data mining models to exploit marketing intelligence from big data. The first objective of the study is, therefore, to classify different types of marketing intelligence. Accordingly, the second objective is to identify pertinent sources of big data for each type of marketing intelligence. The final objective is to propose state-of-the-art data mining models and techniques to uncover marketing intelligence.

The remainder of the paper continues with the methodology, theoretical background, and proposed framework. The paper then presents a literature review of marketing intelligence, followed by the classification of marketing intelligence with data mining models and techniques. The last section provides an in-depth discussion of contributions as well as significant future research directions.

II. METHODOLOGY, THEORETICAL BACKGROUND, AND PROPOSED FRAMEWORK

A. Methodology
As research on marketing intelligence and data mining is spread across various databases, this paper builds its own literature review from different academic sources such as ScienceDirect, Emerald, Business Source Premier, EBSCOhost, ProQuest, Google Scholar, and IEEE Transactions [15]. Different keywords such as “marketing intelligence”, “data-driven marketing”, “data mining techniques”, “big data analytics” and “literature review” are used to search for articles from these reliable databases. As a result, the collected articles are from top marketing and management journals such as Journal of Marketing, European Journal of Marketing, Harvard Business Review, Journal of the Academy of Marketing, Management Science, Marketing Management, etc. [12]. Furthermore, articles from information systems journals are also synthesized; for example, Expert Systems with Applications, MIS Quarterly, Decision Support Systems, IEEE, Information and Management, Communications of the ACM, etc. [7]. All the selected journals in marketing and information systems are listed in the Scimago Journal & Country Rank in 2017 [12]. The distribution of articles by journal titles is shown in Table I as follows:


TABLE I: THE DISTRIBUTION OF ARTICLES BY JOURNAL TITLES
Journal title                             No.   %
Expert Systems with Applications          10    13.16%
Decision Support Systems                  7     9.21%
IEEE                                      6     7.89%
European Journal of Marketing             5     6.58%
Communications of the ACM                 4     5.26%
Information and Management                4     5.26%
Journal of Business Research              4     5.26%
MIS Quarterly                             3     3.95%
Journal of Marketing                      2     2.63%
Management Science                        2     2.63%
ACM Transactions on Information Systems   1     1.32%
Marketing Management                      1     1.32%
Others                                    27    35.53%
Total                                     76    100.00%

According to Table I, the largest proportion of relevant articles comes from Expert Systems with Applications (13.16%, or 10 of the 76 articles), followed by Decision Support Systems (9.21%, or 7 articles), and the IEEE (7.89%, or 6 articles). Other journals, which are also well referenced, are European Journal of Marketing, Communications of the ACM, Information and Management, Management Science, etc. To ensure the validity and reliability of the literature search process, forward and backward search techniques are conducted to make sure that the 76 chosen articles can represent the literature of this domain [16]. The 76 articles are also classified by the year of publication. Fig. 1 shows the distribution of articles by the year of publication from 2001 to 2018. The 18-year duration seems long enough to see the changes and trends in marketing intelligence. In Fig. 1, it can be seen that the number of articles has grown between 2001 and 2018, with peaks in 2013 and 2016. The period from 2009 to 2013 shows a gradual increase in the number of articles in this domain. The number of articles fluctuates from 2013 to 2018. This can be explained by the research gap between practitioners and researchers. As mentioned in the introduction, studies on marketing intelligence and data mining lack papers discussing the technical side regarding algorithms, machine learning techniques, and data mining techniques for a specific application in marketing [12], [14]. As a matter of fact, practitioners such as Adobe, Salesforce, or IBM conduct in-depth studies on customer intelligence, product intelligence, and content intelligence. However, these publications are not included in academic databases.

Fig. 1. Distribution of articles by year of publication.

B. Theoretical Background
This study relies on the resource-based theory [17], as this theory is widely adopted by researchers in explaining the key to success for enterprises. Resources such as “all assets, capabilities, organizational processes, firm attributes, information, knowledge, etc.” [17] play the most important role in strategic planning for success [18]. Another interesting point is that the resource-based theory not only considers the importance of resources but also highlights the capabilities to exploit them and turn them into sustainable competitive advantages [19]. In that regard, marketing intelligence is considered a significant resource for service/product enhancement and innovation [4], [6]. Correspondingly, data mining models and techniques play the role of capabilities that turn big data into marketing intelligence [3], [18]. Consequently, it is important to examine the role of marketing intelligence coupled with relevant data mining models and techniques in the success of enterprises. In the next part, this study will discuss components of marketing intelligence as significant resources for the success of enterprises.

C. Proposed Marketing Intelligence Framework
From the perspective of the resource-based theory, enterprises should identify marketing intelligence coupled with data mining models and techniques as significant resources and capabilities for competition [3], [7], [17]. The proposed framework in Fig. 2 is based on the literature review and classification of marketing intelligence from different studies [7], [20], [21]. According to these authors, marketing intelligence, which comprises market intelligence, product intelligence, customer intelligence, and competitor intelligence, can cover every aspect of the marketing mix. Each component of marketing intelligence will be discussed in detail in the following section. These four components of marketing intelligence belong to the inner layer of the framework. The outer layer of the framework consists of the six most typical data mining models used to extract different components of marketing intelligence [14], [21].

Fig. 2. A marketing intelligence framework.

A short description of the six typical data mining models with relevant mining techniques is provided as follows [1], [7], [22]:

Classification is used to make a prediction on customer behaviors (Fan et al., 2015) or determine attributes of clusters [23]. Classification mining techniques are neural networks, decision trees, the Naïve Bayes technique, support vector machines, market basket analysis, genetic algorithms, and if-then-else rules [15], [24].
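To make the classification model more concrete, the following is a minimal sketch of the Naïve Bayes technique applied to predicting a customer behavior. It is an illustration only and is not taken from any of the reviewed articles: the feature names (age group, visit frequency), the labels ("buy", "no"), the toy records, and the function names `train_naive_bayes` and `predict` are all assumptions introduced here. The sketch uses categorical features with add-one (Laplace) smoothing.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels):
    """Count class and per-feature value frequencies for categorical Naive Bayes."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] -> number of training rows
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values_per_feature = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[label][i][value] += 1
            values_per_feature[i].add(value)
    return class_counts, feature_counts, values_per_feature


def predict(model, row):
    """Return the label with the highest log-posterior for one customer record."""
    class_counts, feature_counts, values_per_feature = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior of the class
        for i, value in enumerate(row):
            seen = feature_counts[label][i][value]
            # Add-one smoothing avoids zero probabilities for unseen values.
            score += math.log((seen + 1) / (count + len(values_per_feature[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records: (age group, visit frequency) -> purchase behavior.
rows = [
    ("young", "high"), ("young", "low"), ("senior", "high"),
    ("senior", "low"), ("young", "high"), ("senior", "low"),
]
labels = ["buy", "no", "buy", "no", "buy", "no"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("young", "high")))
print(predict(model, ("senior", "low")))
```

In practice, an enterprise would replace the toy records with attributes from its own customer profiles and would likely rely on an established data mining library rather than a hand-written classifier.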


Association is used to find out the relationship among products that customers purchase; thus, enterprises can determine products that tend to be sold together [22], [23] (Bose & Mahapatra, 2001; Seng & Chen, 2010). Association techniques are association rules, statistics, and Apriori algorithms [7], [15].

Clustering is used for customer segmentation and user profiling [11], [25]. The most common clustering mining techniques are K-means, the Naïve Bayes technique, RFM analysis (recency, frequency, monetary), market basket analysis, neural networks, and self-organizing map techniques [1], [22], [25].

Regression is used to make a prediction or to find causal relationships among variables [15]. Common regression techniques are linear regression and logistic regression [1], [22].

Prediction is used to predict future values based on historical records [23], [26]. The most common forecasting mining techniques are neural networks, survival analysis, linear regression models, market basket analysis, and logistic model prediction [22], [26].

Sequence discovery is used to identify associations or describe orders of behaviors over time [27]. Sequence discovery techniques are statistics and set theory [15].

III. MARKETING INTELLIGENCE

In this section, the inner layer of the proposed framework, consisting of the different components of marketing intelligence, is clarified. The literature does not offer an official definition of marketing intelligence. The key difference between the traditional and contemporary definitions of marketing intelligence lies in the method used to collect information [5], [28]. In the past, marketing intelligence depended on market surveys and internal sources within enterprises to gather information on customers, competitors, markets, and industry [5], [29], [30]. Nowadays, the definition of marketing intelligence implies the application of data mining models and techniques to discover marketing insights for strategic decisions [3], [7]. In this sense, marketing intelligence carries the new name of “marketing data intelligence”, in which raw data from internal and external databases is transformed to uncover marketing intelligence [27]. Under this approach, marketing intelligence is defined as the process of gathering information on customers, competitors, markets, and industry through data mining techniques, which is then applied to strategic marketing plans [4], [5].

This paper adopts the data mining perspective from many studies to define marketing intelligence as the application of data mining models and techniques to exploit intelligence on markets, products, customers, and competitors [3], [7]. This definition is based on the marketing mix so that it covers all the important aspects that support marketing decisions [3], [11]. As the traditional marketing mix, including product, price, promotion, and place, is criticized as product-oriented and lacking customer orientation, this study adopts the fifth P – people, with the focus on the customer [3], [11]. Accordingly, marketing intelligence consists of market intelligence (place), product intelligence (product), competitor intelligence (price and promotion), and customer intelligence (people).

A. Market Intelligence
As a broad concept, market intelligence consists of exogenous factors of a potential market such as technology, competition, regulations, and other environmental forces that may influence current and future customer needs and preferences [29]. Market intelligence comprises a wide range of intelligence, from politics and economics to cultures and sociology [6], [28], [31]. Information on the political-economic and social-cultural aspects is crucial for enterprises to make strategic decisions when penetrating new markets [6], [28]. Traditional sources of market intelligence are surveys, sales reports, discussions with customers, market research, and so on [29]. Nowadays, open sources, which refer to unclassified, non-secret, non-indexed sources (e.g., weblogs, whitepapers, etc.), are also useful for collecting market intelligence because they are inexpensive and easy to access [32].

B. Product Intelligence
Product intelligence is usually defined from the perspective of intelligent products [33], [34]. Accordingly, product intelligence includes two dimensions: information-handling and decision-making [33]. This study reshapes the definition of product intelligence from the perspective of data mining. In this regard, product intelligence is the application of data mining techniques to exploit insights on products to increase customer satisfaction and identify business opportunities [11], [20]. The best way to satisfy customers’ needs is to listen to their opinions on products through customer reviews, discussions, and attitudes on forums, social media, blogs, and websites [20], [35]. These are considered great sources for approaching customer feedback and needs [11], [20]. Mining user-generated content and web content allows enterprises not only to develop suitable products for customer needs but also to recommend the right products to the right customers [36], [37].

C. Competitor Intelligence
Competitor intelligence is information on competitors’ products, prices, advertisements, and distribution channels [6]. It is also defined as the ability of an enterprise to understand the strengths and weaknesses of its competitors; hence, an enterprise can foresee its competitors’ moves and strategies and improve its performance [38]. With the intent to obtain information on competitors, enterprises can collect log data from e-commerce websites [11]. Data on sales rank, list price, customer rating, number of reviewers, and days released from e-commerce sites could be used to forecast market demand, estimate cost and price elasticity, and even evaluate the optimality of pricing strategies [11]. Nowadays, not only texts but also images can be mined for competitors’ product reputations [39]. Properties of images such as display formats, image quality, and the number of views can affect buyers’ intention, stimulate trust, and improve the transaction rate [39], [40].

D. Customer Intelligence
Customer intelligence consists of data and information on needs, preferences, cultures, lifestyle, purchasing power, shopping behaviors, customs, and habits of potential customers [6]. In the digital age, customer intelligence is first exploited in the form of web intelligence acquired from


Internet Protocol searches through cookies and server logs [7]. Web intelligence uncovering customers’ needs and detecting business opportunities can also be collected from web pages, e-commerce sites, and social media [7], [41]. Marketers can analyze customer clickstream data logs on visit frequency, viewed items, and visit time on a website to understand customers’ browsing habits and purchasing behaviors [11], [37]. Enterprises can exploit customer intelligence from internal sources such as billing records, the company's weblogs, the CRM system, customer surveys, etc. [27]. External sources of customer intelligence are lookups for telephone number and address, social media, competitors' websites, household hierarchies, Fair-Isaacs credit scores, customer reviews, and clickstream data [8], [11], [27].

A. Data Mining for Market Intelligence
Market intelligence can be divided into two levels: country level and people level [6], [31]. The country level includes political-economic intelligence on legal regulations, economics, technological development, market competitiveness, and public policies [6], [28], [31]. The people level consists of social-cultural intelligence on culture, customs and habits, purchasing power, customer preference, income, literacy rate, education level, lifestyles, climatic conditions, etc. [6], [28]. Different data mining models such as clustering, prediction, association, classification, and regression are used to extract political-economic and social-cultural intelligence [42], [43]. Grounded in these data mining models, a variety of mining techniques are adopted for different purposes. Examples include sentiment and effect analysis to scan for market intelligence; decision trees, support vector machines, and logistic regression to predict and discover hidden correlations; genetic algorithms to optimize web searching; generic NLP (natural language processing) rules to identify networks of distributors, suppliers, and collaborators; and the Bayes technique to classify sentiment in stock markets [42]-[44].

B. Data Mining for Customer Intelligence
Customer intelligence relates to customer relationship management (CRM) in terms of understanding customers and maximizing their value for enterprises. With the help of customer intelligence, CRM is able to support enterprises in identifying and retaining the most profitable customers [27]. Based on CRM, customer intelligence consists of four levels: i) Customer identification – how to identify the most profitable customers; ii) Customer attraction – how to attract customers through marketing strategies; iii) Customer retention – how to retain a long-term relationship with customers; iv) Customer development – how to increase customer value [15], [25].

1) Customer identification
Customer intelligence starts with identifying customer segments with similar interests and profitability [15], [21]. Various demographic, psychographic, behavioral, or geographic criteria are used for customer segmentation [21], [27]. Accordingly, customer segmentation methods such as clustering and classification are useful in dividing customers into homogeneous segments and building customer profiles [11], [12]. Customer profiles should contain information on demography (age, gender), buying behaviors (needs, purchasing power, preferences, lifestyle), purchasing attributes (recency, frequency, size), product category, product mix, and estimated customer lifetime values [21], [45]. In this stage, the decision tree technique in classification and the K-means technique in clustering are used to group customer segments with similar characteristics [25]. Then target customer analysis is applied to choose the most profitable segment [46].

2) Customer attraction
With an aim to attract target segments, the classification method takes the lead along with regression and clustering [15]. Within the classification method, the Bayesian network classifier, decision tree, genetic algorithm, and neural network are the most popular techniques [7], [9], [15]. In addition, RFM analysis, in terms of the recency, frequency, and monetary value of purchases, can be applied to comprehend customer behavior and improve direct marketing strategy to attract customers [25].

3) Customer retention
In order to retain customers, it is necessary to customize marketing strategies to suit customer preferences and behaviors [12], [47]. In fact, enterprises normally develop customer profiling, campaign management analysis, credit scoring, recommender systems, or loyalty programs to increase customer satisfaction and maintain a long-term relationship [27], [47]. Thus, various data mining methods such as classification, association, clustering, sequence discovery, and regression are applied to support those activities [15]. Accordingly, association rules, decision trees, neural networks, logistic regression, and genetic algorithms are the techniques most discussed by researchers [1], [14], [15].

4) Customer development
With an aim to maximize value creation for enterprises, customer development covers three main perspectives: up/cross-selling, customer lifetime value, and market basket analysis [15], [47]. Sequence discovery and association are the data mining methods supporting up/cross-selling along with market basket analysis [45], [48]. To be specific, association rules and neural networks are the most common data mining techniques [15]. In estimating customer lifetime value, data scientists apply various mining models, including classification, clustering, forecasting, and regression [15], [22]. The corresponding data mining techniques are neural networks, the Bayesian network classifier, association rules, linear regression, survival analysis, and the Markov chain model [15], [42].

C. Data Mining for Product Intelligence
Based on the definition of product intelligence proposed in the previous part, product intelligence consists of two levels: product development [20], [21] and product recommendation [37], [49].

1) Product development
Product ontologies are built through text mining web content and user-generated content [20], [21]. Product


characteristics can be extracted through the text mining method with classification, association, and clustering models [11], [21]. Product characteristics are data on size, weight, color, packaging, and types [4], [49]. Different techniques such as opinion mining, topic modeling, question-answering, and information extraction are implemented for different research objectives [7]. Topic modeling is ideal for finding the main theme, whereas question-answering is the application of natural language processing to build an ontology supporting human-computer interactions [20], [21]. On the other hand, opinion mining and sentiment analysis techniques are critical for identifying attitudes, emotions, and feelings [50], [51].

2) Product recommendation
This dimension of product intelligence aims to personalize recommendations for each customer through data from clickstreams, customer profiles, mobile call records, and transactions for better customer satisfaction [37]. Among various data mining models, association, classification, clustering, and regression are commonly used for building recommender systems in many studies [52]. In the same vein, the associated mining techniques are K-Nearest Neighbor, Bayesian classifiers, association rules, decision trees, link analysis, neural networks, and linear regression [37], [52]. As one of the most important building blocks of recommender systems, k-Nearest Neighbor (k-NN) identifies users with similar behaviors through their preference ratings and makes recommendations on top products that are likely to be purchased [49].

D. Data Mining for Competitor Intelligence
As mentioned above, competitor intelligence covers competitors’ 4 Ps of the marketing mix, including product, price, promotion, and place.

1) Competitors’ products intelligence
In order to monitor competitors’ product information, data scientists mine web content with clustering, association models, and product ontology mining [20], [21]. Through the application of association rules, not only product features (design, labeling, brand, guarantee, etc.) but also threats of substitute products can be extracted from customer reviews, product ratings, and product descriptions [6], [20], [21]. Other data mining techniques are latent topic modeling and sentiment analysis, which can be used for constructing the product ontology through text mining on social media [20], [21].

2) Competitors’ pricing intelligence
Competitors’ pricing intelligence includes information on price strategy, discount policy, margins, credit, and secure collection [6]. The most preferred data mining model to study pricing strategy is the regression model [3], [11]. In particular, the multi-linear regression technique is applied to identify determinants of competitors’ prices [11]. Along with regression models, association models are also useful for identifying potential competitors and their pricing strategy [11].

3) Competitors’ promotion intelligence
To acquire intelligence on competitors’ promotional strategy, enterprises need to obtain data on promotion time, types (for example, coupons, bundling, discount, gifting, sampling, etc.), and sales [37]. Regression models, especially linear regression techniques, are mostly used to find out the relationships among factors in promotional strategies [3], [11]. In addition, heuristic models take the top position, along with clustering and association models, as the most significant tools to develop recommender systems for promotional strategies [37]. Recommender systems are able to promote suitable products for target customers, especially in the movie and shopping industries [27], [47]. The most common data mining techniques to build recommender systems are K-Nearest Neighbor, association rules, and link analysis [37].

4) Competitors’ place intelligence
Nowadays, data about locations used in the marketing mix can be obtained through location-based services [53]. Location-based data can be traced through mobile devices with GPS, WiFi, GSM, or Bluetooth, vehicles with GPS, smart cards (bank cards or transportation cards), floating sensors (devices with radio frequency identification), and check-ins from social networks [13], [53]. In terms of location-based marketing, classification, prediction, and regression are the most significant data mining models [11], [13]. Correspondingly, various data mining techniques are used, such as linear/non-linear regression, the Naïve Bayes technique, neural networks, and support vector machines [13], [53].

With an aim to promote the digital transformation of marketing via smart services, this paper presents a literature review for exploiting marketing intelligence through the application of data mining to specific functions of marketing. Under this approach, marketing intelligence is classified based on the marketing mix. Relevant data sources, mining models, and techniques are also proposed for each component of marketing intelligence. This paper builds an in-depth literature review of 76 relevant articles from various databases. The research result provides a classification of each component of marketing intelligence with relevant mining models and techniques. The findings of this paper have several important implications.

Through this study, enterprises will be able to make good use of data mining techniques in exploiting marketing intelligence for competition. This makes a practical contribution for enterprises, as marketing intelligence can offer certain competitive advantages in approaching potential markets, competitors, products, and customers [3], [4]. The proposed framework of marketing intelligence makes a theoretical contribution by bridging the gap between information systems and marketing. The framework promises to be a source of reference for both practitioners and researchers. Researchers can rely on this framework to further their studies.

This paper aims at exploiting marketing intelligence through mining big data to accelerate the digital transformation. Relevant data mining techniques have been identified for specific applications in marketing. Marketing intelligence in this paper is classified based on the marketing


