This example uses the `MKSHomogenizationModel`

to create a homogenization linkage for the effective stiffness. This example starts with a brief background of the homogenization theory on the components of the effective elastic stiffness tensor for a composite material. Then the example generates random microstructures and their average stress values that will be used to show how to calibrate and use our model. We will also show how to use tools from sklearn to optimize fit parameters for the `MKSHomogenizationModel`

. Lastly, the data is used to evaluate the `MKSHomogenizationModel`

for effective stiffness values for a new set of microstructures.

For this example we are looking to create a homogenization linkage that predicts the effective isotropic stiffness components for two-phase microstructures. The specific stiffness component we are looking to predict in this example is $C_{xxxx}$ which is easily accessed by applying an uniaxial macroscal strain tensor (the only non-zero component is $\varepsilon_{xx}$).

$$ u(L, y) = u(0, y) + L\bar{\varepsilon}_{xx}$$$$ u(0, L) = u(0, 0) = 0 $$$$ u(x, 0) = u(x, L) $$More details about these boundary conditions can be found in [1]. Using these boundary conditions, $C_{xxxx}$ can be estimated calculating the ratio of the averaged stress over the applied averaged strain.

$$ C_{xxxx}^* \cong \bar{\sigma}_{xx} / \bar{\varepsilon}_{xx}$$In this example, $C_{xxxx}$ for 6 different types of microstructures will be estimated, using the `MKSHomogenizationModel`

from `pymks`

, and provides a method to compute $\bar{\sigma}_{xx}$ for a new microstructure with an applied strain of $\bar{\varepsilon}_{xx}$.

```
In [23]:
```import pymks
%matplotlib inline
%load_ext autoreload
%autoreload 2
import numpy as np
import matplotlib.pyplot as plt

```
```

A set of periodic microstructures and their volume averaged elastic stress values $\bar{\sigma}_{xx}$ can be generated by importing the `make_elastic_stress_random`

function from `pymks.datasets`

. This function has several arguments. `n_samples`

is the number of samples that will be generated, `size`

specifies the dimensions of the microstructures, `grain_size`

controls the effective microstructure feature size, `elastic_modulus`

and `poissons_ratio`

are used to indicate the material property for each of the
phases, `macro_strain`

is the value of the applied uniaxial strain, and the `seed`

can be used to change the the random number generator seed.

Let's go ahead and create 6 different types of microstructures each with 200 samples with dimensions 21 x 21. Each of the 6 samples will have a different microstructure feature size. The function will return and the microstructures and their associated volume averaged stress values.

```
In [24]:
```from pymks.datasets import make_elastic_stress_random
sample_size = 200
grain_size = [(15, 2), (2, 15), (7, 7), (8, 3), (3, 9), (2, 2)]
n_samples = [sample_size] * 6
elastic_modulus = (310, 200)
poissons_ratio = (0.28, 0.3)
macro_strain = 0.001
size = (21, 21)
X, y = make_elastic_stress_random(n_samples=n_samples, size=size, grain_size=grain_size,
elastic_modulus=elastic_modulus, poissons_ratio=poissons_ratio,
macro_strain=macro_strain, seed=0)

`X`

contains the microstructure information and has the dimensions
of `(n_samples, Nx, Ny)`

. The array `y`

contains the average stress value for
each of the microstructures and has dimensions of `(n_samples,)`

.

```
In [25]:
```print(X.shape)
print(y.shape)

```
```

`draw_microstructures`

.

```
In [26]:
```from pymks.tools import draw_microstructures
X_examples = X[::sample_size]
draw_microstructures(X_examples[:3])

```
```

In this dataset 4 of the 6 microstructure types have grains that are elongated in either the x or y directions. The remaining 2 types of samples have equiaxed grains with different average sizes.

Let's look at the stress values for each of the microstructures shown above.

```
In [27]:
```print('Stress Values'), (y[::200])

```
```

`MKSHomogenizationModel`

to predict stress values for new microstructures.

The default instance of the `MKSHomogenizationModel`

takes in a dataset and

- calculates the 2-point statistics
- performs dimensionality reduction using Prinicple Component Analysis (PCA)
- and fits a polynomial regression model model to the low-dimensional representation.

This work flow has been shown to accurately predict effective properties in several examples [2][3], and requires that we specify the number of components used in dimensionality reduction and the order of the polynomial we will be using for the polynomial regression. In this example we will show how we can use tools from sklearn to try and optimize our selection for these two parameters.

In order to make an instance of the `MKSHomogenizationModel`

, we need to pass an instance of a basis (used to compute the 2-point statistics). For this particular example, there are only 2 discrete phases, so we will use the `PrimitiveBasis`

from `pymks`

. We only have two phases denoted by 0 and 1, therefore we have two local states and our domain is 0 to 1.

Let's make an instance of the `MKSHomgenizationModel`

.

```
In [28]:
```from pymks import MKSHomogenizationModel
from pymks import PrimitiveBasis
prim_basis = PrimitiveBasis(n_states=2, domain=[0, 1])
model = MKSHomogenizationModel(basis=prim_basis, periodic_axes=[0, 1],
correlations=[(0, 0), (1, 1)])

```
In [29]:
```print('Default Number of Components'), (model.n_components)
print('Default Polynomail Order'), (model.degree)

```
```

To start with, we can look at how the variance changes as a function of the number of components. In general for SVD as well as PCA, the amount of variance captured in each component decreases as the component number increases. This means that as the number of components used in the dimensionality reduction increases, the percentage of the variance will asymptotically approach 100%. Let's see if this is true for our dataset.

In order to do this we will change the number of components to 40 and then
fit the data we have using the `fit`

function. This function performs the dimensionality reduction and
also fits the regression model. Because our microstructures are periodic, we need to
use the `periodic_axes`

argument when we `fit`

the data.

```
In [30]:
```model.n_components = 40
model.fit(X, y)

`draw_component_variance`

from `pymks.tools`

.

```
In [31]:
```from pymks.tools import draw_component_variance
draw_component_variance(model.dimension_reducer.explained_variance_ratio_)

```
```

Roughly 93 percent of the variance is captured with the first 5 components. This means our model may only need a few components to predict the average stress.

Next we need to optimize the number of components and the polynomial order. To do this we are going to split the data into test and training sets. This can be done using the train_test_spilt function from `sklearn`

.

```
In [32]:
```from sklearn.cross_validation import train_test_split
flat_shape = (X.shape[0],) + (X[0].size,)
X_train, X_test, y_train, y_test = train_test_split(X.reshape(flat_shape), y,
test_size=0.2, random_state=3)
print(X_train.shape)
print(X_test.shape)

```
```

We will use cross validation with the testing data to fit a number of models, each with a different number of components and a different polynomial order. Then we will use the testing data to verify the best model. This can be done using GridSeachCV from sklearn.

We will pass a dictionary `params_to_tune`

with the range of
polynomial order `degree`

and components `n_components`

we want to try.
A dictionary `fit_params`

can be used to pass the `periodic_axes`

variable to
calculate periodic 2-point statistics. The argument `cv`

can be used to specify
the number of folds used in cross validation and `n_jobs`

can be used to specify
the number of jobs that are ran in parallel.

Let's vary `n_components`

from 1 to 11 and `degree`

from 1 to 3.

```
In [33]:
```from sklearn.grid_search import GridSearchCV
params_to_tune = {'degree': np.arange(1, 4), 'n_components': np.arange(2, 12)}
fit_params = {'size': X[0].shape}
gs = GridSearchCV(model, params_to_tune, fit_params=fit_params).fit(X_train, y_train)

`score`

method for the `MKSHomogenizationModel`

is the R-squared value. Let's look at the how the mean R-squared values and their
standard deviations change, as we varied the number of `n_components`

and `degree`

, using
`draw_gridscores_matrix`

from `pymks.tools`

.

```
In [34]:
```from pymks.tools import draw_gridscores_matrix
draw_gridscores_matrix(gs, ['n_components', 'degree'], score_label='R-Squared',
param_labels=['Number of Components', 'Order of Polynomial'])

```
```

It looks like we get a poor fit, when only the first and second component are used, and when we increase the polynomial order and the components together. The models have a high standard deviation and poor R-squared values for both of these cases.

There seems to be several potential models that use 4 to 11 components, but it's difficult to see which model
is the best. Let's use our test data `X_test`

to see which model performs the best.

```
In [35]:
```print('Order of Polynomial'), (gs.best_estimator_.degree)
print('Number of Components'), (gs.best_estimator_.n_components)
print('R-squared Value'), (gs.score(X_test, y_test))

```
```

`draw_grid_scores`

.

```
In [36]:
```from pymks.tools import draw_gridscores
gs_deg_1 = [x for x in gs.grid_scores_ \
if x.parameters['degree'] == 1][1:]
gs_deg_2 = [x for x in gs.grid_scores_ \
if x.parameters['degree'] == 2][1:]
gs_deg_3 = [x for x in gs.grid_scores_ \
if x.parameters['degree'] == 3][1:]
draw_gridscores([gs_deg_1, gs_deg_2, gs_deg_3], 'n_components',
data_labels=['1st Order', '2nd Order', '3rd Order'],
param_label='Number of Components', score_label='R-Squared')

```
```

```
In [37]:
```model = gs.best_estimator_

```
In [38]:
```model.fit(X, y)

`make_elastic_stress_random`

function.

```
In [39]:
```test_sample_size = 20
n_samples = [test_sample_size] * 6
X_new, y_new = make_elastic_stress_random(n_samples=n_samples, size=size, grain_size=grain_size,
elastic_modulus=elastic_modulus, poissons_ratio=poissons_ratio,
macro_strain=macro_strain, seed=1)

Now let's predict the stress values for the new microstructures.

```
In [40]:
```y_predict = model.predict(X_new)

`draw_components_scatter`

from `pymks.tools`

.

```
In [41]:
```from pymks.tools import draw_components_scatter
draw_components_scatter([model.reduced_fit_data[:, :2],
model.reduced_predict_data[:, :2]],
['Training Data', 'Test Data'],
legend_outside=True)

```
```

```
In [42]:
```from sklearn.metrics import r2_score
print('R-squared'), (model.score(X_new, y_new))

```
```

```
In [43]:
```print('Actual Stress '), (y_new[::20])
print('Predicted Stress'), (y_predict[::20])

```
```

`draw_goodness_of_fit`

from `pymks.tools`

.

```
In [44]:
```from pymks.tools import draw_goodness_of_fit
fit_data = np.array([y, model.predict(X)])
pred_data = np.array([y_new, y_predict])
draw_goodness_of_fit(fit_data, pred_data, ['Training Data', 'Test Data'])

```
```

`MKSHomogenizationModel`

has created a homogenization linkage for the effective stiffness for the 6 different microstructures and has predicted the average stress values for our new microstructures reasonably well.

[1] Landi, G., S.R. Niezgoda, S.R. Kalidindi, Multi-scale modeling of elastic response of three-dimensional voxel-based microstructure datasets using novel DFT-based knowledge systems. Acta Materialia, 2009. 58 (7): p. 2716-2725 doi:10.1016/j.actamat.2010.01.007.

[2] Çeçen, A., et al. "A data-driven approach to establishing microstructure–property relationships in porous transport layers of polymer electrolyte fuel cells." Journal of Power Sources 245 (2014): 144-153. doi:10.1016/j.jpowsour.2013.06.100

[3] Deshpande, P. D., et al. "Application of Statistical and Machine Learning Techniques for Correlating Properties to Composition and Manufacturing Processes of Steels." 2 World Congress on Integrated Computational Materials Engineering. John Wiley & Sons, Inc. doi:10.1002/9781118767061.ch25

```
In [ ]:
```