In [62]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Objectives: with this notebook you will learn how to state a sensitivity-analysis problem, draw a sampling, run your model on it, compute RBD-FAST first-order indices with SALib, and check the robustness of those indices (convergence and bootstrap)...
NOTE: for more detail on sensitivity analysis in general, refer to the café focus by Jeanne Goffart (April 2017); the presentation is on the server.
CAUTION: sensitivity analysis tools should be used and analyzed with care. Without a proper design, you won't be able to draw any conclusions at all, or worse, you will draw wrong ones...
... and a surpriiiise!
It is a regular Python dictionary with mandatory keys:
TRICKY ALERT: defining the bounds is one of the trickiest parts of a sensitivity analysis.
Do read the literature on the subject: what bounds have fellow researchers set for similar studies, etc.
Here we have chosen to focus on a specific area of all the possible values taken by the parameters: ±10% around a nominal value.
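As a minimal sketch of that choice (the nominal values below are made up purely for illustration), ±10% bounds can be built like this:
In [ ]:
# hypothetical nominal values for three parameters (illustration only)
nominal = np.array([10.0, 250.0, 0.5])
# lower/upper bound = 90% / 110% of each nominal value
bounds_10pct = [[0.9 * v, 1.1 * v] for v in nominal]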
In [5]:
# STATE THE PROBLEM DICTIONARY
# what will vary (= the inputs)? within what bounds?
problem = {
    'num_vars': 3,
    'names': ['x1', 'x2', 'x3'],
    'bounds': [[-np.pi, np.pi],
               [-np.pi, np.pi],
               [-np.pi, np.pi]]
}
In [6]:
# say we want to draw 150 samples
num_samples = 150
In [7]:
# we draw a Latin Hypercube Sampling (LHS), which is suited for an RBD-FAST analysis
# (other sampling methods are available in the library, though)
from SALib.sample.latin import sample
all_samples = sample(problem, num_samples)
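As a quick sanity check, the sampler returns one row per sample and one column per input variable:
In [ ]:
# expected shape: (150, 3) -> (num_samples, num_vars)
print(all_samples.shape)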
Here you plug in your own code: it can be Python code like here, or a call to any other program (Dymola, EnergyPlus, etc.)
We use a lumped thermal model, a.k.a. an RC model.
The scientific question behind it (it's not just a toy example ^^): can the model be entirely calibrated?
In [12]:
# this is where you define your own model, procedure, experiment...
import numpy as np

def run_model(x1, x2, x3):
    """
    COPY HERE YOUR OWN CODE
    the function takes one sample as input
    and returns one or more outputs
    """
    # Delete from HERE =======================================
    # As an example, we'll look at the famous Ishigami function
    # (A and B are specific parameters of the Ishigami function)
    A = 7
    B = 0.1
    y = (np.sin(x1) +
         A * np.sin(x2)**2 +
         B * x3**4 * np.sin(x1))
    # ========= TO HERE and replace with your own piece of code
    return y
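For reference, SALib ships its own implementation of the Ishigami function (with the same defaults A=7, B=0.1), so you don't have to rewrite it by hand:
In [ ]:
from SALib.test_functions import Ishigami
# evaluate() takes the full (num_samples, 3) sample array, one output per row
y_builtin = Ishigami.evaluate(all_samples)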
NOTE: you could also use outputs from a different program; you would then have to load your data into a NumPy array.
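A minimal sketch of that loading step, assuming a hypothetical plain-text file results.txt with one output value per line:
In [ ]:
# hypothetical file name; adapt to your own export format
external_results = np.loadtxt('results.txt')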
In [19]:
# run your model, procedure, experiment for each sample of your sampling
# unpack all_samples into 3 vectors x1, x2, x3
x1, x2, x3 = all_samples.T
# run the model, all samples at the same time
ishigami_results = run_model(x1, x2, x3)
SALib is a Python library for performing sensitivity analysis.
So far it contains the following analysis tools: Sobol, Morris, FAST, RBD-FAST, Delta Moment-Independent Measure, Derivative-based Global Sensitivity Measure (DGSM), and Fractional Factorial.
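Each tool lives in its own module under SALib.analyze; a sketch of the matching imports (module names as in the SALib versions we know of):
In [ ]:
# each analysis method has a dedicated module in SALib.analyze
from SALib.analyze import sobol, morris, fast, rbd_fast, delta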
Reminder: why do we want to use RBD-FAST? Because it estimates first-order sensitivity indices from a single, ordinary sampling (such as our LHS above), at a much lower computational cost than Sobol-type designs.
In [17]:
from SALib.analyze import rbd_fast
In [20]:
# Let us look at the analyze method
rbd_fast.analyze(problem=problem, Y=ishigami_results, X=all_samples)
Out[20]:
It is a dictionary with a single key 'S1', corresponding to a list of 3 items:
--> the first-order indices of the 3 input variables
In [21]:
# storing the first order indices of the analyze method
si1 = rbd_fast.analyze(problem=problem, Y=ishigami_results, X=all_samples)['S1']
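To pair each first-order index with its parameter name, a quick sketch:
In [ ]:
# print each parameter name next to its first-order index
for name, s in zip(problem['names'], si1):
    print(name, ':', round(s, 3))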
In [23]:
# make nice plots with the indices (looks good on your presentations)
# do not use the plotting tools of SALib, they are made for the method of Morris ...
fig, ax = plt.subplots()
fig.set_size_inches(18, 5)
ax.tick_params(labelsize=18)
# ===== X-AXIS =====
ax.set_xticks(np.arange(problem['num_vars']))
ax.set_xticklabels(problem['names'])
ax.set_xlim(xmin=-0.5, xmax=problem['num_vars'] - 0.5)
# ===== Y-AXIS =====
ax.set_ylabel('RBD-FAST\nSensitivity indices\n', fontsize=25)
# ===== BARS REPRESENTING THE SENSITIVITY INDICES =====
ax.bar(np.arange(problem['num_vars']), si1,
color='DarkSeaGreen');
# in striped grey: the zone of non-significant indices
ax.fill_between(x=[-0.5, problem['num_vars'] - 0.5], y1=-0.1, y2=0.1,
                color='grey', alpha=0.2, hatch='//', edgecolor='white');
In [29]:
# take a closer look to understand interactions (looks even better on your presentations)
# this part can be done without analyzing with SALib, just with the output from the samples
fig, ax = plt.subplots()
fig.set_size_inches(8, 7)
ax.tick_params(labelsize=15)
ax.set_title('Output of the model studied (Ishigami)\n'
             'according to the value of the 2 most influential parameters',
             fontsize=20)
# ===== SCATTER =====
size = np.ones(num_samples) * 75
sc = ax.scatter(x1, x2,
c=ishigami_results,
s=size,
vmin=ishigami_results.min(),
vmax=ishigami_results.max(),
cmap='seismic',
edgecolor=None)
# ===== X-AXIS =====
ax.set_xlim(xmin=problem['bounds'][0][0], xmax=problem['bounds'][0][1])
ax.set_xlabel('Parameter 1', fontsize=20)
# ===== Y-AXIS =====
ax.set_ylim(ymin=problem['bounds'][1][0], ymax=problem['bounds'][1][1])
ax.set_ylabel('Parameter 2', fontsize=20)
# ===== COLORBAR =====
ticks = np.arange(ishigami_results.min(), ishigami_results.max(), 5)
cb = plt.colorbar(sc, ticks=ticks);
cb.ax.set_yticklabels([str(round(i,1)) for i in ticks])
fig.tight_layout();
Even without running the sensitivity analysis with the analyze module, this graph shows some strong non-linearities (blue points very close to red ones). That alone tells us we should study this model with more samples: 150 is not enough.
In [53]:
def conv_study(n, Y, X):
    # take n samples among the num_samples, without replacement
    subset = np.random.choice(num_samples, size=n, replace=False)
    return rbd_fast.analyze(problem=problem,
                            Y=Y[subset],
                            X=X[subset])['S1']

all_indices = np.array([conv_study(n=n, Y=ishigami_results, X=all_samples)
                        for n in np.arange(50, num_samples + 1, 5)])
In [54]:
# convergence check
fig, ax = plt.subplots()
fig.set_size_inches(15,8)
ax.set_title('Convergence of the sensitivity indices', fontsize=20)
ax.tick_params(labelsize=16)
# one convergence point every 5 samples, from n=50 to n=num_samples
ax.set_xlim(xmin=0, xmax=(num_samples - 50) // 5)
ax.set_xticks(np.arange(0, (num_samples - 50) // 5 + 1, 10))
ax.set_xticklabels([str(i) for i in range(50, num_samples + 1, 50)])
ax.set_ylim(ymin=-0.15, ymax=1)
for p,param in enumerate(problem['names']):
ax.plot(all_indices[:,p], linewidth=3, label=param)
ax.fill_between(x=[0, (num_samples - 50) // 5], y1=-0.15, y2=0.1, color='grey', alpha=0.2, hatch='//', edgecolor='white',
                label='REMINDER: not significant')
ax.legend(fontsize=20, ncol=4);
Last but not least, the BOOTSTRAP principle: draw num_samples samples with replacement from your runs (or a subset of n_sub < num_samples, say 200 out of 300) and perform the sensitivity analysis on that resample. Repeat that operation 1000 times. The indices will vary: the more they vary, the larger the influence of the particular samples drawn (i.e., some samples strongly influence the outcome of the SA).
The values taken by the indices are trustworthy if they show robustness in the bootstrap process (i.e., the "confidence intervals" are not too wide). In any case, it is valuable information about the uncertainty around the indices.
In [59]:
def bootstrap(problem, Y, X):
    """
    Calculate confidence intervals of RBD-FAST indices
    1000 draws
    returns 95% confidence intervals of the 1000 indices
    problem : dictionary as SALib uses it
    X : SA input(s)
    Y : SA output(s)
    """
    all_indices = []
    for i in range(1000):
        # draw num_samples indices with replacement ('tirage' = draw)
        tirage_indices = np.random.randint(0, high=Y.shape[0], size=Y.shape[0])
        X_new = X[tirage_indices, :]
        Y_new = Y.flatten()[tirage_indices]
        all_indices.append(rbd_fast.analyze(problem=problem, Y=Y_new, X=X_new)['S1'])
    all_indices = np.array(all_indices)
    # mean +/- 2 standard deviations ~ 95% interval for each index
    means = all_indices.mean(axis=0)
    stds = all_indices.std(axis=0)
    return np.array([means - 2 * stds, means + 2 * stds])
In [60]:
# Get bootstrap confidence intervals for each index
bootstrap_conf = bootstrap(problem=problem, Y=ishigami_results, X=all_samples)
In [61]:
# make nice plots with the indices (looks good on your presentations)
# do not use the plotting tools of SALib, they are made for the method of Morris ...
fig, ax = plt.subplots()
fig.set_size_inches(8,4)
ax.tick_params(labelsize=18)
# ===== X-AXIS =====
ax.set_xticks(np.arange(problem['num_vars']))
ax.set_xticklabels(problem['names'])
ax.set_xlim(xmin=-0.5, xmax=problem['num_vars'] - 0.5)
# ===== Y-AXIS =====
ax.set_ylabel('RBD-FAST\nSensitivity indices\n', fontsize=25)
# ===== BARS REPRESENTING THE SENSITIVITY INDICES =====
ax.bar(np.arange(problem['num_vars']), si1,
color='DarkSeaGreen');
# ===== LINES REPRESENTING THE BOOTSTRAP "CONFIDENCE INTERVALS" =====
for j in range(problem['num_vars']):
ax.plot([j, j], [bootstrap_conf[0, j], bootstrap_conf[1, j]],
'k')
# ax.fill_between(x=[-0.5, problem['num_vars'] - 0.5], y1=-0.1, y2=0.1, color='grey', alpha=0.2, hatch='//', edgecolor='white');
Comment: indices below 0.1 are not significant, also because the bootstrap shows a standard deviation of about ±0.1.
S. Tarantola, D. Gatelli, T. Mara (2006) “Random Balance Designs for the Estimation of First Order Global Sensitivity Indices”, Reliability Engineering & System Safety, 91:6, 717-727.
Elmar Plischke (2010) “An effective algorithm for computing global sensitivity indices (EASI)”, Reliability Engineering & System Safety, 95:4, 354-360. doi:10.1016/j.ress.2009.11.005
Jean-Yves Tissot, Clémentine Prieur (2012) “Bias correction for the estimation of sensitivity indices based on random balance designs”, Reliability Engineering & System Safety, 107, 205-213. doi:10.1016/j.ress.2012.06.010
Jeanne Goffart, Mickael Rabouille, Nathan Mendes (2015) “Uncertainty and sensitivity analysis applied to hygrothermal simulation of a brick building in a hot and humid climate”, Journal of Building Performance Simulation. doi:10.1080/19401493.2015.1112430
Jeanne Goffart, Monika Woloszyn (2018) “RBD-FAST : une méthode d’analyse de sensibilité rapide et rigoureuse pour la thermique du bâtiment” (RBD-FAST: a fast and rigorous sensitivity analysis method for building thermal simulation), IBPSA France, Bordeaux.
... Download this notebook and try the analysis for yourself with 250, 500, or even 1000 samples (a sketch of that loop is given below).
Then answer these questions:
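A minimal sketch of the exercise, re-drawing and re-analyzing for each suggested sample size (using the sample, run_model and rbd_fast names defined above):
In [ ]:
# re-run the whole chain for several sample sizes and compare the indices
for n in (250, 500, 1000):
    samples_n = sample(problem, n)
    y_n = run_model(*samples_n.T)
    print(n, rbd_fast.analyze(problem=problem, Y=y_n, X=samples_n)['S1'])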