ECSI, the European Customer Satisfaction Index, is a statistical methodology for identifying which factors contribute most to the creation of customer satisfaction and loyalty.
ECSI was inspired by the work of the Swedish professor Claes Fornell, one of the most influential scholars in marketing science today. His name appears on some of the most highly cited academic papers in the leading journals of the field, and he is regarded as an outstanding authority on the measurement of customer satisfaction and customer assets. In 1989 he authored the model for the Swedish economy, the Swedish Customer Satisfaction Barometer (SCSB). Five years later, in October 1994, Fornell and his colleagues at the University of Michigan, in conjunction with the American Society for Quality and the CFI Group, developed the American Customer Satisfaction Index (ACSI), an economic indicator that measures the satisfaction of consumers across the U.S. economy.
A few years later, ECSI was initiated by the European Commission in collaboration with the European Foundation for Quality Management (EFQM) and the European Organization for Quality (EOQ), along with a network of universities and business schools. The ECSI Technical Committee developed the method of analysis, the econometric model and the causality analysis, derived from the ACSI.
The data are collected from surveys: respondents are asked to rate their experience of the individual organisations they have dealt with in the previous six months, on a scale of 1 to 10, across a series of metrics covering perceptions of image, quality and price, complaint handling, and attitudes towards loyalty and trust. The metrics reflect the priorities that customers have identified, according to the research, as the most important attributes of the customer experience. Overall scores for each sector are the means of all responses for that sector.
The ECSI score for each organisation, sector or country is the average of all of its customers’ satisfaction scores. These scores are then multiplied by ten so that the index scores are expressed as a number out of 100.
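To make that arithmetic concrete, here is a minimal sketch, assuming a small hypothetical `scores` Series of per-customer ratings on the 1-10 scale (illustrative only, not data from the study):

In [ ]:
# Hypothetical per-customer satisfaction ratings on the 1-10 scale
import pandas as pd
scores = pd.Series([8, 7, 9, 6, 10, 7])
ecsi_score = scores.mean() * 10   # average the ratings, then rescale to an index out of 100
print(round(ecsi_score, 1))       # 78.3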
This methodical and reliable approach offers a unique benchmarking capability across companies, industries, countries and time periods, with the confidence that the results are consistent and have been validated over time in the scientific research literature.
Structural equation modeling (SEM) was applied to assess latent variables at the observation level and test relationships between latent variables on the theoretical level.
We have considered two methods:
- the covariance-based technique (CB-SEM; Jöreskog 1978, 1993);
- variance-based partial least squares (PLS-SEM; Lohmöller 1989; Wold 1982, 1985).
Both methods share the same roots (Jöreskog and Wold 1982), but previous marketing research has focused primarily on CB-SEM (e.g., Bagozzi 1994; Baumgartner and Homburg 1996; Steenkamp and Baumgartner 2000).
Covariance-based SEM (CB-SEM), originally developed by Jöreskog (1978), is primarily used to confirm theories and to validate relationships between variables that can be measured directly or indirectly.
PLS-SEM, or PLS path modeling, was developed by Wold (1975, 1982) as a second-generation technique to overcome the weaknesses of CB-SEM, and was later extended by Lohmöller (1989). It is a very useful method for developing theories in exploratory research, focusing on explaining the variance in the dependent variables when examining the model.
In [1]:
# Plotting libraries and style
import seaborn as sns
import matplotlib.pyplot as plt
plt.style.use("seaborn")
# %pylab pulls numpy and matplotlib.pyplot names (figure, subplot, scatter, ...) into the namespace
%pylab inline
#%matplotlib inline
# Scientific libraries
import pandas as pd
import numpy as np
from scipy import stats
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from warnings import filterwarnings
filterwarnings("ignore")  # silence library warnings
In [2]:
# Load the survey responses; the codes 9999 and 99 mark missing answers (NS/NR)
mqm = pd.read_csv('../BD_MQM.csv', index_col = "Respondent", sep=";",
                  encoding="utf8", skipinitialspace=True, na_values=['9999','99'])
The variate: each construct is a linear combination of several variables, chosen on the basis of the questionnaire items. The responses from 1758 respondents were arranged in a data matrix. All factor loadings were significant and greater than 0.50, as suggested by Hair, Black, Babin and Anderson (2010).
In [3]:
image = ['Q4A', 'Q4B', 'Q4C', 'Q4D', 'Q4E']
expectations = ['Q5A', 'Q5B','Q5C']
perceivedQuality = ['Q6', 'Q7A', 'Q7B', 'Q7C', 'Q7D', 'Q7E', 'Q7F', 'Q7G', 'Q7H']
perceivedValue = ['Q10', 'Q11']
satisfaction = ['Q3', 'Q9', 'Q18']
complaints = ['Q1516']
loyalty = ['Q12', 'Q17']
socio = ['Bank', 'B1', 'B5', 'B6', 'B8']
The measurement: the phenomenon being measured is abstract, complex, and not directly observable. Satisfaction, loyalty, and perceived image, value and quality are latent variables (constructs) and are unobservable. However, the indicators, or manifest items, each represent a single, separate aspect (a proxy) of those larger abstract concepts.
The combination of several items in a scale, a multi-item scale, was used to indirectly measure each of those concepts forming a single accurate composite score - the variate score.
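As an informal sketch of that idea (assuming equal item weights, whereas the estimated model uses weighted variates), each composite below is the plain mean of its items; the correlation of each image item with its composite is only a rough, unofficial stand-in for the factor loadings mentioned above:

In [ ]:
# Unweighted composite (variate) scores: the mean of each construct's items,
# ignoring missing answers (assumption: equal weights, unlike the estimated ECSI variates)
constructs = {'Image': image, 'Expectations': expectations,
              'PerceivedQuality': perceivedQuality, 'PerceivedValue': perceivedValue,
              'Satisfaction': satisfaction, 'Complaints': complaints, 'Loyalty': loyalty}
composites = pd.DataFrame({name: mqm[items].mean(axis=1) for name, items in constructs.items()})
# Rough check: correlation of each image item with its composite
print(mqm[image].corrwith(composites['Image']))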
In [4]:
# Image and Expectations constructs: response patterns
fig, ax = plt.subplots(1,2, figsize=(14, 6))
mqm[image].sort_values(by=('Q4A'), ascending=False).T.plot(legend=False, alpha=0.01, ax=ax[0])
mqm[expectations].sort_values(by=('Q5A'), ascending=False).T.plot(legend=False, alpha=0.01, ax=ax[1])
ax[0].set_title("Image")
ax[1].set_title("Expectations")
Out[4]:
In [5]:
# Perceived Quality and Value constructs response patterns
fig, ax = plt.subplots(1,2, figsize=(14, 6))
mqm[perceivedQuality].sort_values(by=('Q6'), ascending=False).T.plot(legend=False, alpha=0.01, ax=ax[0])
mqm[perceivedValue].sort_values(by=('Q10'), ascending=False).T.plot(legend=False, alpha=0.01, ax=ax[1])
ax[0].set_title("Perceived Quality")
ax[1].set_title("Perceived Value")
Out[5]:
In [6]:
# Satisfaction and Loyalty constructs response patterns
fig, ax = plt.subplots(1,2, figsize=(14, 6))
mqm[satisfaction].sort_values(by=('Q3'), ascending=False).T.plot(legend=False, alpha=0.01, ax=ax[0])
mqm[loyalty].sort_values(by=('Q12'), ascending=False).T.plot(legend=False, alpha=0.01, ax=ax[1])
ax[0].set_title("Satisfaction")
ax[1].set_title("Loyalty")
Out[6]:
The measurement scales: for each question, an interval scale from 1 to 10 was used.
The coding: numbers were assigned to each response point on the scale in a manner that facilitates measurement, and the equidistant property of interval-level measurement was preserved.
The data distributions: the answers to the questions were measured, and the summary statistics of each corresponding variable are presented in the following table.
In [7]:
mqm.describe()
Out[7]:
When working with CB-SEM, normally distributed variables are highly desirable, and the Shapiro-Wilk test statistic is used in this experiment to assess normality.
Nevertheless, PLS-SEM is a nonparametric, distribution-free technique, so methodologically the assumption of normality is not required when working with PLS-SEM. However, knowing the distribution of the variables under analysis gives the researcher better insight into the respondents' behavior.
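As a sketch of that normality check with scipy.stats.shapiro (the per-item loop and the 0.05 significance level are choices made here, not taken from the original analysis):

In [ ]:
# Shapiro-Wilk normality test for each image item, dropping missing answers first
for item in image:
    x = mqm[item].dropna()
    W, p = stats.shapiro(x)
    verdict = "reject normality" if p < 0.05 else "cannot reject normality"
    print("%s: W = %.3f, p = %.4f -> %s" % (item, W, p, verdict))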
In [8]:
# Plot the per-item column sums of the image construct (image = ['Q4A', 'Q4B', 'Q4C', 'Q4D', 'Q4E'])
mqm[image].sum().plot()
Out[8]:
In [9]:
for item in image:
    mqm[item].hist(histtype='step', stacked=True, fill=False)
In [10]:
def PCAGM(data):
    # PCA: project the (imputed) items onto their first three principal components
    pcaN = PCA(n_components=3, svd_solver='full').fit_transform(data.fillna(0))
    # Gaussian mixture: cluster the responses into three components, used to colour the points
    gmN = GaussianMixture(3).fit(data.fillna(0))
    labelsN = gmN.predict(data.fillna(0))
    # Scatter plots of PC1 vs PC2 and PC1 vs PC3, coloured by mixture component
    fig, ax = plt.subplots(1, 2, figsize=(15, 5))
    ax[0].set_title("PC1 - PC2")
    ax[0].scatter(pcaN[:, 0], pcaN[:, 1], c=labelsN, cmap='rainbow', alpha=0.5)
    ax[1].set_title("PC1 - PC3")
    ax[1].scatter(pcaN[:, 0], pcaN[:, 2], c=labelsN, cmap='rainbow', alpha=0.5)
    return
In [11]:
# Generate a random normal sample with the same shape as the image items, as a baseline
mu, sigma, dimensao = (7-0.7)/2, 1, len(mqm)  # mean, standard deviation and sample size
mqmNormal = pd.DataFrame(np.random.normal(mu, sigma, dimensao * 5).reshape((dimensao, 5)),
                         index=np.arange(len(mqm)), columns=image)
PCAGM(mqmNormal)
In [12]:
PCAGM(mqm[image]) # image = ['Q4A', 'Q4B', 'Q4C', 'Q4D', 'Q4E']
Since the Image construct is a linear combination of five variables, we will use t-Distributed Stochastic Neighbor Embedding (t-SNE) to reduce the dimensionality of the data to 2 or 3 dimensions so that we can plot our 1758 data points. In its parametric form, t-SNE is an unsupervised dimensionality reduction technique that learns a parametric mapping between the high-dimensional data space and the low-dimensional latent space, in such a way that the local structure of the data is preserved as well as possible in the latent space.
The similarities between data points are converted to joint probabilities, and the Kullback-Leibler divergence between the joint probabilities of the low-dimensional embedding and of the high-dimensional data is minimized. However, the t-SNE cost function is not convex, i.e. different initializations can give different results.
First, we compute the conditional probabilities

$$p_{j|i} = \frac{\exp\left(-d(\boldsymbol{x}_i, \boldsymbol{x}_j) / (2 \sigma_i^2)\right)}{\sum_{k \neq i} \exp\left(-d(\boldsymbol{x}_i, \boldsymbol{x}_k) / (2 \sigma_i^2)\right)}, \quad p_{i|i} = 0,$$

and from them the joint probabilities

$$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}.$$

A heavy-tailed distribution is used to measure the similarities in the embedded space:

$$q_{ij} = \frac{(1 + \lVert \boldsymbol{y}_i - \boldsymbol{y}_j \rVert^2)^{-1}}{\sum_{k \neq l} (1 + \lVert \boldsymbol{y}_k - \boldsymbol{y}_l \rVert^2)^{-1}}.$$

More details can be found in Laurens van der Maaten, 2009, Learning a Parametric Embedding by Preserving Local Structure.
Many parametric dimensionality reduction techniques, such as PCA and NCA (Goldberger et al., 2005), are hampered by their linear nature, which makes it difficult to successfully embed highly non-linear real-world data in the latent space. This new unsupervised parametric dimensionality reduction technique attempts to retain the local data structure in the latent space, parametrizing the non-linear mapping between the data space and the latent space by means of a feed-forward neural network.
In [13]:
from sklearn.manifold import TSNE
def ecsiTSNE(data):
    # Two-dimensional embeddings of the (imputed) items: non-linear (t-SNE) and linear (PCA)
    X_tsne = TSNE(learning_rate=100).fit_transform(data.fillna(0))
    X_pca = PCA().fit_transform(data.fillna(0))
    # Gaussian mixture clustering, used only to colour the points
    gm = GaussianMixture(3).fit(data.fillna(0))
    labelsGM = gm.predict(data.fillna(0))
    # figure/subplot/scatter are available via %pylab inline
    figure(figsize=(10, 5))
    subplot(121)
    scatter(X_tsne[:, 0], X_tsne[:, 1], c=labelsGM, cmap='rainbow', alpha=0.5)
    subplot(122)
    scatter(X_pca[:, 0], X_pca[:, 1], c=labelsGM, cmap='rainbow', alpha=0.5)
    return
In [14]:
ecsiTSNE(mqmNormal)
In [15]:
ecsiTSNE(mqm[image]) # image = ['Q4A', 'Q4B', 'Q4C', 'Q4D', 'Q4E']
In [16]:
ecsiTSNE(mqm[expectations]) # expectations = ['Q5A', 'Q5B','Q5C']
In [17]:
ecsiTSNE(mqm[perceivedQuality]) # perceivedQuality = ['Q6', 'Q7A', 'Q7B', 'Q7C', 'Q7D', 'Q7E', 'Q7F', 'Q7G', 'Q7H']
In [18]:
ecsiTSNE(mqm[socio]) # socio = ['Bank', 'B1', 'B5', 'B6', 'B8']
The general form of the ECSI structural model is:
$$\eta = \beta \eta + \gamma \xi + \nu, \qquad E(\nu \mid \xi) = 0,$$

where $\eta = (\eta_1, \eta_2, \dots, \eta_6)$ is the vector of endogenous latent variables, $\xi$ is the exogenous latent variable (image), $\beta$ is the matrix of coefficients of $\eta$, $\gamma$ is the vector of coefficients of $\xi$, and $\nu$ is the vector of errors:

$$\begin{bmatrix}\eta_1\\ \eta_2\\ \eta_3\\ \eta_4\\ \eta_5\\ \eta_6 \end{bmatrix} = \begin{bmatrix} 0&0&0&0&0&0\\ \beta_{21}&0&0&0&0&0 \\ \beta_{31}&\beta_{32}&0&0&0&0 \\ \beta_{41}&\beta_{42}&\beta_{43}&0&0&0 \\ 0&0&0&\beta_{54}&0&0 \\ 0&0&0&\beta_{64}&\beta_{65}&0 \end{bmatrix} \begin{bmatrix}\eta_1\\ \eta_2\\ \eta_3\\ \eta_4\\ \eta_5\\ \eta_6 \end{bmatrix} + \begin{bmatrix} \gamma_1\\ 0 \\ 0 \\ \gamma_4\\ 0 \\ \gamma_6 \end{bmatrix} \xi + \begin{bmatrix} \nu_1\\ \nu_2 \\ \nu_3 \\ \nu_4 \\ \nu_5 \\ \nu_6 \end{bmatrix}$$
In [19]:
import networkx as nx
import pylab
G = nx.DiGraph()
G.add_node(1,pos="100,100")
G.add_node(2,pos="0,0")
G.add_node(3,pos="200,0")
G.add_edge(1,2)
G.add_edge(1,3)
G.add_edge(1,1)
print(G.edges(data=True))
# [(1, 1, {}), (1, 2, {}), (1, 3, {})]
nx.drawing.nx_pydot.write_dot(G,'graph.dot')
# use -n to suppress node positioning (routes edges)
# run dot -n -Tpng graph.dot >graph.png
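Building on the same idea, the sketch below draws the ECSI structural paths as a directed graph. The edges follow the nonzero entries of the $\beta$ and $\gamma$ matrices above; reading $\eta_1, \dots, \eta_6$ as Expectations, Perceived Quality, Perceived Value, Satisfaction, Complaints and Loyalty is an assumption about the labelling, and the graph is an illustration rather than output of the estimated model:

In [ ]:
# ECSI structural paths as a directed graph (illustrative sketch)
ecsi = nx.DiGraph()
paths = [("Image", "Expectations"),              # gamma_1
         ("Image", "Satisfaction"),              # gamma_4
         ("Image", "Loyalty"),                   # gamma_6
         ("Expectations", "PerceivedQuality"),   # beta_21
         ("Expectations", "PerceivedValue"),     # beta_31
         ("PerceivedQuality", "PerceivedValue"), # beta_32
         ("Expectations", "Satisfaction"),       # beta_41
         ("PerceivedQuality", "Satisfaction"),   # beta_42
         ("PerceivedValue", "Satisfaction"),     # beta_43
         ("Satisfaction", "Complaints"),         # beta_54
         ("Satisfaction", "Loyalty"),            # beta_64
         ("Complaints", "Loyalty")]              # beta_65
ecsi.add_edges_from(paths)
nx.drawing.nx_pydot.write_dot(ecsi, 'ecsi_graph.dot')
# render with: dot -Tpng ecsi_graph.dot > ecsi_graph.png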
The responses are recorded on a 10-point scale, so the distribution of the answers across the possible response categories (1, 2, 3, ..., 10) can be calculated and displayed in a table or chart.
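A minimal sketch of that tabulation with pandas value_counts, shown here for the overall-satisfaction item Q3 (the choice of item is only an example):

In [ ]:
# Frequency of each response category (1-10) for the overall-satisfaction item Q3
freqQ3 = mqm['Q3'].value_counts().sort_index()
print(freqQ3)
freqQ3.plot(kind='bar', title='Q3 response frequencies')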
The questionnaire items and their coding are as follows:

| Question | Description | Values |
|---|---|---|
| Q2 | Year since which the respondent has been a customer of the bank | 9999: NS/NR |
| Q3 | Overall satisfaction with the bank | 1-10; 99: NS/NR |
| Q4A | A bank that can be trusted in what it says and does | 1-10; 99: NS/NR |
| Q4B | A bank that is firmly established in the market | 1-10; 99: NS/NR |
| Q4C | Contributes positively to society | 1-10; 99: NS/NR |
| Q4D | Cares about its customers | 1-10; 99: NS/NR |
| Q4E | An innovative, forward-looking bank | 1-10; 99: NS/NR |
| Q5A | Expectations held six months ago, or when becoming a customer, regarding the overall quality of the bank | 1-10; 99: NS/NR |
| Q5B | Expectations held six months ago, or when becoming a customer, regarding the bank's ability to offer products and services that meet the respondent's personal needs | 1-10; 99: NS/NR |
| Q5C | Expectations held six months ago, or when becoming a customer, regarding the bank's ability to avoid failures or errors | 1-10; 99: NS/NR |
| Q6 | Perceived overall quality of the bank | 1-10; 99: NS/NR |
| Q7A | Quality of the banking products and services offered | 1-10; 99: NS/NR |
| Q7B | Customer service and advisory capability | 1-10; 99: NS/NR |
| Q7C | Accessibility of products and services through new technologies | 1-10; 99: NS/NR |
| Q7D | Reliability of the products and services offered | 1-10; 99: NS/NR |
| Q7E | Diversity of products and services | 1-10; 99: NS/NR |
| Q7F | Clarity and transparency of the information provided | 1-10; 99: NS/NR |
| Q7G | Availability of branches | 1-10; 99: NS/NR |
| Q7H | Quality of branches | 1-10; 99: NS/NR |
| Q9 | Fulfilment of perceived expectations | 1-10; 99: NS/NR |
| Q10 | Rating of the prices and fees of the bank's products and services, given their quality | 1-10; 99: NS/NR |
| Q11 | Rating of the quality of the bank's products and services, given their prices and fees | 1-10; 99: NS/NR |
| Q12 | Likelihood of choosing the bank again when acquiring a banking product or service | 1-10; 99: NS/NR |
| Q13 | Reduction in commissions, interest and other fees at other banks above which the respondent would switch banks | 0-100 (%); 222: would always stay with the bank; 999: NS/NR |
| Q14 | Whether a complaint has been filed | 1: Yes; 2: No |
| Q15 | (If Q14=1) Assessment of how the complaint was handled | 1-10; 99: NS/NR |
| Q16 | (If Q14=2) Expectation of how a possible complaint would be resolved | 1-10; 99: NS/NR |
| Q17 | Likelihood of recommending the bank to other people | 1-10; 99: NS/NR |
| Q18 | How close the customer's bank is to a bank the customer would consider ideal | 1-10; 99: NS/NR |

Note: for responses on the 1-10 scale, values 1 to 5 correspond to negative evaluations and values 6 to 10 correspond to positive evaluations. NS/NR denotes "does not know / no response".
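Following that note, a small sketch (illustrative, not part of the original pipeline) that recodes a 1-10 item into negative/positive evaluations:

In [ ]:
# Recode a 1-10 item: values 1-5 count as negative evaluations, 6-10 as positive
q3_band = pd.cut(mqm['Q3'], bins=[0, 5, 10], labels=['negative', 'positive'])
print(q3_band.value_counts())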
The socio-demographic variables and their coding are:

| Question | Description | Values |
|---|---|---|
| B1 | Respondent's gender | 1: Female; 2: Male |
| B2 | Respondent's year of birth | (last two digits); 99: refused to disclose |
| B5 | Employment status | 1: Employed; 2: Unemployed; 3: Student; 4: Homemaker; 5: Retired; 6: Other; 9: NS/NR |
| B6 | (If B5=1 or B5=6) Position in professional activity | 1: Employer; 2: Self-employed; 3: Employee; 4: Other; 9: NS/NR |
| B8 | Level of education | 1: Cannot read or write; 2: Can read and write, no formal qualification; 3: Elementary basic education; 4: Preparatory basic education; 5: Unified secondary education; 6: Complementary secondary education; 7: Intermediate courses; 8: Incomplete higher education; 9: Completed higher education or more; 99: NS/NR |
In [20]:
# One-hot encode the socio-demographic variables as prediction targets
labels = pd.get_dummies(mqm[socio], columns=['Bank', 'B1', 'B5', 'B6', 'B8'])
# Keep the questionnaire items as features
features = mqm[['Q2', 'Q3', 'Q4A', 'Q4B', 'Q4C', 'Q4D', 'Q4E', 'Q5A', 'Q5B', 'Q5C', 'Q6', 'Q7A', 'Q7B',
'Q7C', 'Q7D', 'Q7E', 'Q7F', 'Q7G', 'Q7H', 'Q9', 'Q10', 'Q11', 'Q12', 'Q13', 'Q14',
'Q1516', 'Q17', 'Q18', 'B2']]
In [21]:
labels.columns = ['Bank_1', 'Bank_2', 'Bank_3', 'Bank_4', 'Bank_5', 'Bank_6', 'Bank_7',
'B1_1Feminino', 'B1_2Masculino ', 'B1_9NR',
'B5_1Empregado', 'B5_2Desempregado', 'B5_3Estudante',
'B5_4Domestico', 'B5_5Reformado', 'B5_6Outra', 'B5_9NS/NR',
'B6_1Patrão', 'B6_2TrabalhadorContaPropria', 'B6_3TrabalhadorContaOutrem',
'B6_4Outra', 'B6_9NS_NR ',
'B8_2SabeLerEscrever', 'B8_3EnsinoBasicoElementar', 'B8_4EnsinoBasicoPreparatorio',
'B8_5EnsinoSecundarioUnificado', 'B8_6EnsinoSecundarioComplementar',
'B8_7CursoMedio', 'B8_8EnsinoSuperiorIncompleto', 'B8_9EnsinoSuperiorCompleto']
In [22]:
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import Imputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
logitScores = {}
logitCoefs = pd.DataFrame(index=features.columns, columns = labels.columns)
for label in labels:
    # Hold out a test split for each socio-demographic dummy variable
    train_data, test_data, train_labels, test_labels = train_test_split(features, labels[[label]],
                                                                        random_state=123)
    # Impute missing answers with column means learned on the training split
    imp = Imputer()
    imp.fit(train_data)
    train_data = imp.transform(train_data)
    test_data = imp.transform(test_data)
    # Fit a logistic regression and store its test accuracy and coefficients
    modelLogit = LogisticRegression().fit(train_data, train_labels.values.ravel())
    score = modelLogit.score(test_data, test_labels.values.ravel())
    #print('#', list(labels[[label]]), "Logit Score: %f" % score)
    #print(confusion_matrix(test_labels, modelLogit.predict(test_data)))
    logitScores[label] = score
    logitCoefs[label] = modelLogit.coef_.T
    #print(" ")
logitScores = pd.DataFrame.from_dict([logitScores]).reindex(columns=logitCoefs.columns)
In [23]:
logitScores.T.plot(legend=False)
Out[23]:
In [24]:
logitCoefs.T.plot(legend=False, alpha=0.25)
Out[24]:
In [25]:
logitCoefs.T.head()
Out[25]:
In [26]:
from sklearn.decomposition import PCA
pca = PCA().fit_transform(logitCoefs)
plt.plot(pca[:,0],pca[:,1], 'ro')
plt.axis([-1, 1, -1, 1])
plt.show()
In [27]:
#sns.set(style="white")
sns.set_context("paper")
# Compute the correlation matrix of the logit coefficients and
# saturate strong correlations (|r| > 0.25) to sharpen the heatmap
corr = logitCoefs.corr()
corr[corr > 0.25] = 1
corr[corr < -0.25] = -1
# Generate a mask for the upper triangle
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True
# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(11, 9))
# Generate a custom diverging colormap
cmap = sns.diverging_palette(200, 10, as_cmap=True)
# Draw the heatmap with the mask and correct aspect ratio
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3,
square=True, xticklabels=True, yticklabels=True,
linewidths=.5, cbar_kws={"shrink": .5}, ax=ax)
Out[27]: