Partial Least Squares Regression (PLSR) on Near Infrared Spectroscopy (NIR) data and octane data

This notebook illustrates how to use the hoggorm package to carry out partial least squares regression (PLSR) on multivariate data. Furthermore, we will learn how to visualise the results of the PLSR using the hoggormPlot package.


Import packages and prepare data

First import hoggorm for analysis of the data and hoggormPlot for plotting of the analysis results. We'll also import pandas so that we can read the data into a data frame. numpy is needed for checking the dimensions of the data.


In [1]:
import hoggorm as ho
import hoggormplot as hop
import pandas as pd
import numpy as np

Next, load the data that we are going to analyse using hoggorm. After the data has been loaded into the pandas data frame, we'll display it in the notebook.


In [2]:
# Load NIR spectra (one row per gasoline sample, one column per spectral variable)
X_df = pd.read_csv('gasoline_NIR.txt', header=None, sep=r'\s+')
X_df


Out[2]:
0 1 2 3 4 5 6 7 8 9 ... 391 392 393 394 395 396 397 398 399 400
0 -0.050193 -0.045903 -0.042187 -0.037177 -0.033348 -0.031207 -0.030036 -0.031298 -0.034217 -0.036012 ... 1.198461 1.224243 1.242645 1.250789 1.246626 1.250985 1.264189 1.244678 1.245913 1.221135
1 -0.044227 -0.039602 -0.035673 -0.030911 -0.026675 -0.023871 -0.022571 -0.025410 -0.028960 -0.032740 ... 1.128877 1.148342 1.189116 1.223242 1.253306 1.282889 1.215065 1.225211 1.227985 1.198851
2 -0.046867 -0.041260 -0.036979 -0.031458 -0.026520 -0.023346 -0.021392 -0.024993 -0.029309 -0.033920 ... 1.147964 1.167798 1.198287 1.237383 1.260979 1.276677 1.218871 1.223132 1.230321 1.208742
3 -0.046705 -0.042240 -0.038561 -0.034513 -0.030206 -0.027680 -0.026042 -0.028280 -0.030920 -0.034012 ... 1.160089 1.169350 1.201066 1.233299 1.262966 1.272709 1.211068 1.215044 1.232655 1.206696
4 -0.050859 -0.045145 -0.041025 -0.036357 -0.032747 -0.031498 -0.031415 -0.034611 -0.037781 -0.040752 ... 1.252712 1.238013 1.259616 1.273713 1.296524 1.299507 1.226448 1.230718 1.232864 1.202926
5 -0.048094 -0.042739 -0.038812 -0.034017 -0.030143 -0.027690 -0.026387 -0.028811 -0.031481 -0.034124 ... 1.214046 1.210217 1.241090 1.262138 1.288401 1.291118 1.229769 1.227615 1.227630 1.207576
6 -0.049906 -0.044558 -0.040543 -0.035716 -0.031844 -0.029581 -0.027915 -0.030292 -0.033590 -0.037184 ... 1.234174 1.226153 1.245143 1.265648 1.274731 1.292441 1.218317 1.218147 1.222273 1.200446
7 -0.049293 -0.043788 -0.039429 -0.034193 -0.029588 -0.026455 -0.025104 -0.028102 -0.031801 -0.036157 ... 1.187996 1.192901 1.222581 1.245782 1.260020 1.290305 1.221264 1.220265 1.227947 1.188174
8 -0.049885 -0.044279 -0.040158 -0.034954 -0.031114 -0.028390 -0.027017 -0.029609 -0.032937 -0.036862 ... 1.219162 1.214365 1.234969 1.251559 1.272416 1.287405 1.211995 1.213263 1.215883 1.196102
9 -0.051054 -0.045678 -0.041673 -0.036761 -0.033078 -0.030466 -0.029295 -0.031736 -0.034843 -0.038419 ... 1.227318 1.224755 1.238409 1.262493 1.272277 1.289548 1.213103 1.212666 1.216313 1.192221
10 -0.052705 -0.047674 -0.043960 -0.039335 -0.035622 -0.033849 -0.032669 -0.035076 -0.037459 -0.040534 ... 1.251647 1.236881 1.252961 1.268144 1.288349 1.303091 1.220515 1.218996 1.218947 1.196750
11 -0.050383 -0.044934 -0.041391 -0.036162 -0.032389 -0.030479 -0.028614 -0.031738 -0.034432 -0.036488 ... 1.236618 1.242923 1.271185 1.284266 1.316014 1.231520 1.242926 1.245499 1.218605 1.222376
12 -0.047866 -0.043572 -0.039234 -0.033831 -0.029675 -0.026899 -0.025460 -0.028133 -0.031637 -0.035215 ... 1.199439 1.216346 1.243973 1.265293 1.283287 1.257243 1.241994 1.245265 1.221914 1.227010
13 -0.046594 -0.041111 -0.036881 -0.031122 -0.026667 -0.023717 -0.021758 -0.024917 -0.029152 -0.033473 ... 1.150171 1.162515 1.196462 1.221030 1.245689 1.255820 1.195502 1.201374 1.217044 1.190482
14 -0.042470 -0.036621 -0.032430 -0.026807 -0.021276 -0.018356 -0.016116 -0.019680 -0.024589 -0.029472 ... 1.107501 1.147547 1.173971 1.213914 1.239907 1.263985 1.236485 1.213466 1.222574 1.194232
15 -0.048503 -0.043850 -0.040052 -0.035608 -0.031709 -0.029417 -0.027995 -0.030589 -0.032894 -0.035546 ... 1.231412 1.247899 1.253355 1.274886 1.295021 1.300102 1.272523 1.237184 1.253393 1.225255
16 -0.052011 -0.046438 -0.042741 -0.037767 -0.033675 -0.031435 -0.030016 -0.032774 -0.035619 -0.039340 ... 1.221685 1.225224 1.241605 1.251055 1.276291 1.257590 1.211339 1.224403 1.196473 1.206524
17 -0.055093 -0.049515 -0.045637 -0.040658 -0.036019 -0.033858 -0.032356 -0.035238 -0.038240 -0.042084 ... 1.208462 1.209117 1.232265 1.255110 1.267071 1.271932 1.208274 1.202633 1.220311 1.192021
18 -0.055002 -0.049353 -0.045749 -0.040881 -0.036641 -0.034485 -0.032852 -0.035723 -0.038415 -0.041081 ... 1.185953 1.187329 1.216092 1.239588 1.259853 1.281106 1.213803 1.216212 1.212221 1.187919
19 -0.053971 -0.048498 -0.044546 -0.039737 -0.035025 -0.032028 -0.030581 -0.032772 -0.036772 -0.040725 ... 1.189496 1.192654 1.222480 1.247473 1.265223 1.279126 1.215044 1.215047 1.224300 1.195850
20 -0.056393 -0.051917 -0.048119 -0.042835 -0.038890 -0.037075 -0.035381 -0.038580 -0.041193 -0.044240 ... 1.209946 1.224277 1.250597 1.262213 1.286013 1.225083 1.224663 1.233307 1.198024 1.208770
21 -0.041806 -0.037138 -0.033330 -0.028394 -0.024088 -0.022220 -0.020429 -0.023380 -0.026519 -0.029953 ... 1.214297 1.223688 1.251757 1.271028 1.299461 1.233623 1.238265 1.239621 1.216958 1.215150
22 -0.056295 -0.050792 -0.047015 -0.041668 -0.037211 -0.035112 -0.033790 -0.036529 -0.039416 -0.043021 ... 1.203184 1.207357 1.237480 1.250352 1.273242 1.212260 1.227581 1.228188 1.200708 1.208866
23 -0.056614 -0.050934 -0.047065 -0.042162 -0.037512 -0.035385 -0.033810 -0.036981 -0.039910 -0.043189 ... 1.211848 1.220556 1.241747 1.260483 1.274427 1.227429 1.232069 1.232610 1.203415 1.206482
24 -0.056634 -0.050985 -0.047449 -0.042544 -0.037751 -0.035638 -0.034164 -0.037250 -0.040238 -0.043446 ... 1.214928 1.216910 1.240362 1.258414 1.278576 1.250313 1.225191 1.234871 1.202831 1.200294
25 -0.053835 -0.048211 -0.043901 -0.039466 -0.034951 -0.032682 -0.031312 -0.033782 -0.037050 -0.040557 ... 1.222141 1.223657 1.243059 1.268803 1.279535 1.292552 1.223150 1.230409 1.234873 1.199130
26 -0.054568 -0.049352 -0.045221 -0.040954 -0.036456 -0.034173 -0.033066 -0.035521 -0.038449 -0.041375 ... 1.203253 1.203405 1.229612 1.244302 1.267760 1.277744 1.209736 1.214851 1.218626 1.185845
27 -0.056343 -0.050790 -0.046753 -0.042718 -0.038384 -0.036067 -0.034715 -0.037117 -0.040004 -0.042628 ... 1.215054 1.214108 1.236832 1.264889 1.278272 1.288166 1.220597 1.222195 1.237040 1.198483
28 -0.055746 -0.050452 -0.046133 -0.042041 -0.037684 -0.035340 -0.034286 -0.036270 -0.039331 -0.042332 ... 1.207200 1.214645 1.232480 1.250810 1.269168 1.284636 1.221910 1.220088 1.225551 1.190114
29 -0.056285 -0.051229 -0.047233 -0.043306 -0.038566 -0.036586 -0.035222 -0.037604 -0.040532 -0.043434 ... 1.229997 1.227048 1.249672 1.267421 1.284605 1.304134 1.228024 1.230893 1.224984 1.209100
30 -0.055856 -0.050983 -0.047003 -0.042624 -0.038003 -0.035975 -0.034708 -0.036853 -0.039795 -0.042890 ... 1.222627 1.222856 1.242992 1.264961 1.278480 1.291149 1.223628 1.232818 1.223925 1.203394
31 -0.054979 -0.049543 -0.045299 -0.041173 -0.036667 -0.034132 -0.033121 -0.035130 -0.037817 -0.040240 ... 1.187338 1.193676 1.215842 1.248764 1.270184 1.282696 1.219395 1.230635 1.218142 1.198047
32 -0.056744 -0.051640 -0.047625 -0.043418 -0.038720 -0.036322 -0.035358 -0.037381 -0.040105 -0.042422 ... 1.186506 1.185219 1.216441 1.240234 1.261010 1.276013 1.210541 1.219328 1.221799 1.188120
33 -0.055116 -0.049883 -0.045198 -0.041241 -0.036557 -0.034154 -0.033351 -0.035328 -0.038093 -0.040302 ... 1.186216 1.191344 1.217995 1.242712 1.265151 1.273448 1.216081 1.219200 1.223790 1.200368
34 -0.055431 -0.049610 -0.046254 -0.041308 -0.037308 -0.034262 -0.034115 -0.035523 -0.038780 -0.040580 ... 1.197016 1.199079 1.230152 1.255975 1.271034 1.287976 1.222506 1.232113 1.233227 1.215170
35 -0.054786 -0.049772 -0.045728 -0.041781 -0.037103 -0.034873 -0.032462 -0.035916 -0.038543 -0.043005 ... 1.208933 1.223582 1.253362 1.270257 1.286046 1.222422 1.236444 1.226974 1.207932 1.208693
36 -0.052696 -0.047364 -0.043219 -0.039882 -0.035381 -0.032813 -0.031885 -0.034334 -0.037243 -0.038527 ... 1.218744 1.225558 1.255617 1.278059 1.289276 1.304098 1.228055 1.248893 1.238919 1.219423
37 -0.051488 -0.045710 -0.041979 -0.037985 -0.034024 -0.030727 -0.029478 -0.031468 -0.036109 -0.038571 ... 1.228883 1.255432 1.259085 1.283364 1.290963 1.303616 1.299003 1.247123 1.242375 1.253576
38 -0.050822 -0.045340 -0.040816 -0.036766 -0.031458 -0.029078 -0.027847 -0.029164 -0.034641 -0.037432 ... 1.191156 1.213603 1.228284 1.252362 1.284587 1.299535 1.308261 1.240087 1.237938 1.249060
39 -0.053711 -0.047820 -0.043375 -0.039730 -0.035277 -0.032090 -0.031238 -0.033153 -0.037076 -0.040173 ... 1.212733 1.242758 1.242135 1.267099 1.291053 1.302889 1.311104 1.242112 1.247460 1.241335
40 -0.052652 -0.046447 -0.043614 -0.040247 -0.035748 -0.033943 -0.031325 -0.035319 -0.037886 -0.040906 ... 1.276561 1.268445 1.291689 1.289337 1.307472 1.249372 1.245525 1.247599 1.228605 1.225721
41 -0.050152 -0.044052 -0.040550 -0.036536 -0.032156 -0.030090 -0.028906 -0.031714 -0.034768 -0.037636 ... 1.237985 1.236775 1.250118 1.267737 1.281177 1.258787 1.230622 1.229405 1.232895 1.204329
42 -0.045382 -0.040226 -0.036527 -0.032673 -0.028697 -0.026225 -0.024899 -0.026252 -0.031305 -0.033514 ... 1.224900 1.261425 1.263564 1.274996 1.292608 1.305140 1.279795 1.254112 1.249215 1.221268
43 -0.050142 -0.044155 -0.040605 -0.036775 -0.032357 -0.029566 -0.028514 -0.029725 -0.033475 -0.035601 ... 1.196784 1.224243 1.231407 1.255565 1.285385 1.300601 1.288432 1.243036 1.241742 1.239797
44 -0.055431 -0.049375 -0.046190 -0.042031 -0.037362 -0.035388 -0.033041 -0.036786 -0.039450 -0.042702 ... 1.223531 1.245309 1.244297 1.270138 1.284427 1.295979 1.228903 1.236879 1.236562 1.200461
45 -0.062839 -0.056232 -0.053075 -0.048133 -0.044493 -0.041588 -0.040467 -0.043202 -0.046477 -0.049311 ... 1.218801 1.242566 1.278722 1.269426 1.288154 1.295843 1.296007 1.234071 1.244711 1.239300
46 -0.060146 -0.054662 -0.051013 -0.046707 -0.042162 -0.040352 -0.038058 -0.041425 -0.044844 -0.047471 ... 1.227396 1.253407 1.283604 1.271473 1.287577 1.313725 1.316089 1.324185 1.251984 1.254192
47 -0.059905 -0.053893 -0.049825 -0.045788 -0.039896 -0.037613 -0.035854 -0.039694 -0.043639 -0.046829 ... 1.197319 1.213938 1.249290 1.244735 1.267019 1.273849 1.284502 1.297106 1.226739 1.219197
48 -0.060446 -0.054912 -0.051417 -0.046888 -0.042582 -0.040267 -0.038564 -0.041482 -0.045056 -0.047595 ... 1.223806 1.258589 1.293267 1.280068 1.289178 1.307505 1.312363 1.290606 1.246904 1.244676
49 -0.060961 -0.056118 -0.052393 -0.048156 -0.043868 -0.041965 -0.040130 -0.043010 -0.046227 -0.048838 ... 1.224987 1.243103 1.274070 1.269862 1.281093 1.287953 1.297078 1.301496 1.237601 1.227748
50 -0.052634 -0.046971 -0.043205 -0.039538 -0.034724 -0.032414 -0.029820 -0.033670 -0.036481 -0.040107 ... 1.205736 1.222295 1.238942 1.245682 1.191846 1.198974 1.180956 1.176291 1.152654 1.170770
51 -0.052700 -0.047331 -0.043577 -0.040344 -0.035613 -0.033652 -0.031383 -0.034717 -0.037500 -0.040242 ... 1.231920 1.238705 1.261821 1.289389 1.218379 1.212102 1.235187 1.215633 1.222602 1.183570
52 -0.053394 -0.047990 -0.044082 -0.040690 -0.036469 -0.033663 -0.031475 -0.034740 -0.037418 -0.040692 ... 1.201475 1.212914 1.228477 1.242877 1.190009 1.196491 1.198856 1.165040 1.189147 1.150035
53 -0.054134 -0.048487 -0.045171 -0.041012 -0.035553 -0.034104 -0.031523 -0.034866 -0.037200 -0.038979 ... 1.170220 1.201926 1.162770 1.173205 1.162726 1.168570 1.148061 1.167755 1.137953 1.145351
54 -0.049623 -0.044263 -0.041154 -0.037335 -0.032926 -0.030602 -0.028713 -0.032058 -0.034363 -0.035440 ... 1.182659 1.202795 1.229580 1.170451 1.189367 1.173706 1.198534 1.162526 1.195273 1.156451
55 -0.046884 -0.042360 -0.038683 -0.035291 -0.030175 -0.027898 -0.026519 -0.029413 -0.032182 -0.034033 ... 1.206536 1.223194 1.247032 1.300765 1.207600 1.231914 1.245133 1.230234 1.264217 1.190865
56 -0.055555 -0.049867 -0.045942 -0.042266 -0.037195 -0.034837 -0.031842 -0.036051 -0.038897 -0.042842 ... 1.167444 1.193289 1.209944 1.175943 1.159782 1.184718 1.155629 1.175611 1.117087 1.095777
57 -0.053693 -0.048020 -0.044677 -0.041021 -0.036254 -0.034531 -0.032428 -0.035264 -0.038362 -0.040816 ... 1.217198 1.222375 1.238392 1.252411 1.195963 1.210064 1.199746 1.173102 1.191871 1.150779
58 -0.056311 -0.051231 -0.047483 -0.044605 -0.039404 -0.037526 -0.034336 -0.037852 -0.041023 -0.044488 ... 1.247442 1.237687 1.246042 1.253986 1.211382 1.203032 1.209177 1.183871 1.175997 1.154696
59 -0.058805 -0.053311 -0.049543 -0.045053 -0.040598 -0.038965 -0.036749 -0.040284 -0.042080 -0.045058 ... 1.211312 1.228345 1.237367 1.203006 1.200348 1.209557 1.182911 1.184077 1.154355 1.163959

60 rows × 401 columns


In [3]:
# Load response data, that is, the octane measurements
y_df = pd.read_csv('gasoline_octane.txt', header=None, sep=r'\s+')
y_df


Out[3]:
0
0 85.30
1 85.25
2 88.45
3 83.40
4 87.90
5 85.50
6 88.90
7 88.30
8 88.70
9 88.45
10 88.75
11 88.25
12 87.30
13 88.00
14 88.70
15 85.50
16 88.65
17 88.75
18 85.40
19 88.60
20 87.00
21 87.15
22 87.05
23 87.25
24 86.85
25 88.65
26 86.60
27 86.00
28 86.10
29 86.50
30 86.30
31 84.40
32 84.70
33 84.60
34 84.50
35 88.10
36 85.25
37 88.40
38 88.20
39 88.40
40 88.55
41 88.35
42 88.20
43 85.30
44 88.50
45 88.25
46 88.00
47 88.85
48 88.45
49 88.70
50 88.10
51 87.60
52 88.35
53 85.10
54 85.10
55 84.70
56 87.20
57 86.60
58 89.60
59 87.10

The nipalsPLS1 class in hoggorm accepts only numpy arrays with numerical values and not pandas data frames. Therefore, each of the two pandas data frames holding the imported data needs to be "taken apart" into three parts, which gives:

  • two numpy arrays holding the numeric values
  • two Python lists holding variable (column) names
  • two Python lists holding object (row) names.

The numpy arrays with values will be used as input for the nipalsPLS1 class for analysis. The Python lists holding the variable and row names will be used later in the plotting function from the hoggormPlot package when visualising the results of the analysis. Below is the code needed to access the data, the variable names and the object names.


In [4]:
# Get the values from the data frame
X = X_df.values
y = y_df.values

# Get the variable or columns names
X_varNames = list(X_df.columns)
y_varNames = list(y_df.columns)

# Get the object or row names
X_objNames = list(X_df.index)
y_objNames = list(y_df.index)
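
As a small self-contained illustration of the same steps, the sketch below reads a tiny whitespace-separated table from an in-memory string and takes the resulting data frame apart in the same way. The table and the `demo_` names are made up for this example and are not part of the gasoline data.

```python
import io

import pandas as pd

# Made-up stand-in for a whitespace-separated text file (3 objects, 4 variables)
text = """\
0.1  0.2  0.3  0.4
1.1  1.2  1.3  1.4
2.1  2.2  2.3  2.4
"""

# header=None: the file has no header row, so pandas assigns integer column labels
demo_df = pd.read_csv(io.StringIO(text), header=None, sep=r'\s+')

demo_values = demo_df.values            # numpy array with the numeric values
demo_varNames = list(demo_df.columns)   # default column labels
demo_objNames = list(demo_df.index)     # default row labels

print(demo_values.shape)
print(demo_varNames)
print(demo_objNames)
```

Because the files carry no header or index column, the variable and object names are simply the integer labels that pandas assigns by default.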

Apply PLSR to our data

Now, let's run PLSR on the data using the nipalsPLS1 class, since we have a univariate response. The documentation provides a description of the input parameters. Using the input parameters arrX and vecy we define which numpy arrays we would like to analyse. vecy is typically considered to be the response vector, while the measurements are typically defined as arrX. By setting the input parameter Xstand=False we make sure that the variables are only mean centered, not scaled to unit variance. This is the default setting and doesn't actually need to be expressed explicitly. By setting the parameter cvType=["loo"] we make sure that the PLS1 model is computed using full cross validation. "loo" means "Leave One Out". By setting the parameter numComp=10 we ask for ten components to be computed.


In [5]:
model = ho.nipalsPLS1(arrX=X, Xstand=False, 
                      vecy=y,
                      cvType=["loo"], 
                      numComp=10)


loo
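
To make the effect of the Xstand setting concrete, here is a minimal numpy sketch of the difference between mean centering only and standardising to unit variance. It uses made-up data, and the use of the sample standard deviation (ddof=1) is an assumption of this example; it is not taken from hoggorm's source.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up data: 6 objects, 3 variables on very different scales
demo_X = rng.normal(size=(6, 3)) * np.array([1.0, 10.0, 100.0]) + 5.0

# Mean centering only: subtract each column's mean
X_centered = demo_X - demo_X.mean(axis=0)

# Standardisation: mean centre, then divide by each column's standard deviation
# (ddof=1 is an assumption made for this sketch)
X_standardised = X_centered / demo_X.std(axis=0, ddof=1)

print(X_centered.mean(axis=0))              # column means after centering
print(X_centered.std(axis=0, ddof=1))       # spread is unchanged by centering
print(X_standardised.std(axis=0, ddof=1))   # spread after standardisation
```

With mean centering only, variables with a large spread dominate the model. Standardisation gives every variable the same weight, which is often not wanted for spectra, where all variables share the same unit.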

That's it, the PLS1 model has been computed. Now we would like to inspect the results by visualising them. We can do this using the plotting functions of the separate hoggormPlot package. If we wish to plot the results for component 1 and component 2, we can do this by setting the input argument comp=[1, 2]. The input argument plots=[1, 6] lets the user define which plots are to be plotted. If this list for example contains value 1, the function will generate the scores plot for the model. If the list contains value 6, the explained variance plot for y will be plotted. The hoggormPlot documentation provides a description of the input parameters.


In [6]:
hop.plot(model, comp=[1, 2], 
         plots=[1, 6], 
         objNames=X_objNames, 
         XvarNames=X_varNames,
         YvarNames=y_varNames)


Plots can also be called separately.


In [7]:
# Plot cumulative explained variance (both calibrated and validated) using a specific function for that.
hop.explainedVariance(model)
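
The difference between calibrated and validated explained variance can be sketched with plain numpy. The example below uses a generic definition, 100 * (1 - SSE / SST), and an ordinary least squares fit as a stand-in for PLSR on made-up data. The helper names are hypothetical, and hoggorm's internal formulas may differ in detail.

```python
import numpy as np

def explained_variance(y_true, y_pred):
    """Percent of the variance around the mean of y_true accounted for by y_pred."""
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - y_true.mean()) ** 2)
    return 100.0 * (1.0 - sse / sst)

def fit_predict(X_train, y_train, X_test):
    """Stand-in model: ordinary least squares with an intercept (not PLSR)."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    return A_test @ coef

rng = np.random.default_rng(0)
demo_X = rng.normal(size=(20, 2))
demo_y = 3.0 * demo_X[:, 0] - 1.0 * demo_X[:, 1] + rng.normal(scale=0.3, size=20)

# Calibrated: predict the same objects that were used to fit the model
cal_ev = explained_variance(demo_y, fit_predict(demo_X, demo_y, demo_X))

# Validated (leave one out): predict each object from a model fitted without it
loo_pred = np.empty_like(demo_y)
for i in range(len(demo_y)):
    keep = np.arange(len(demo_y)) != i
    loo_pred[i] = fit_predict(demo_X[keep], demo_y[keep], demo_X[i:i + 1])[0]
val_ev = explained_variance(demo_y, loo_pred)

print(cal_ev, val_ev)
```

The validated value is the more honest estimate of predictive ability, since each object is predicted by a model that has not seen it.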



In [8]:
# Plot cumulative validated explained variance in X.
hop.explainedVariance(model, which=['X'])



In [9]:
hop.scores(model)



In [10]:
# Plot X loadings in line plot
hop.loadings(model, weights=True, line=True)



In [11]:
# Plot regression coefficients
hop.coefficients(model, comp=[3])



Accessing numerical results

Now that we have visualised the PLSR results, we may also want to access the numerical results. Below are some examples. For a complete list of accessible results, please see this part of the documentation.


In [12]:
# Get X scores and store in numpy array
X_scores = model.X_scores()

# Get scores and store in pandas dataframe with row and column names
X_scores_df = pd.DataFrame(model.X_scores())
X_scores_df.index = X_objNames
X_scores_df.columns = ['Comp {0}'.format(x+1) for x in range(model.X_scores().shape[1])]
X_scores_df


Out[12]:
Comp 1 Comp 2 Comp 3 Comp 4 Comp 5 Comp 6 Comp 7 Comp 8 Comp 9 Comp 10
0 -0.057240 -0.090090 -0.016732 0.091040 0.013031 -0.034006 0.003680 0.021503 0.004683 -0.011914
1 -0.486129 -0.001680 -0.000860 0.062807 0.021144 0.021133 0.009727 0.039915 0.004400 -0.005882
2 -0.341384 0.138518 0.020314 0.041364 0.005027 0.007527 -0.002081 -0.004655 -0.008124 -0.001083
3 -0.378394 -0.094977 -0.018236 0.066672 -0.002121 0.004055 -0.009803 0.006046 -0.003197 -0.020651
4 0.171442 0.017964 0.016884 0.120622 -0.055998 -0.020644 -0.016979 -0.017157 -0.010593 -0.000496
5 -0.126172 -0.065053 -0.015081 0.115024 -0.013342 0.001092 0.004512 -0.000920 -0.000514 -0.006383
6 0.121343 0.050029 0.039788 0.083844 -0.017277 0.013293 0.008127 0.014991 0.001873 0.000811
7 -0.145441 0.102092 0.019177 0.019986 -0.012579 0.005683 0.007354 0.002640 -0.007553 0.008735
8 0.056069 0.064077 0.046319 0.064288 -0.023545 0.004753 0.003547 0.011940 0.002338 -0.002152
9 0.127782 0.030740 0.046912 0.076654 -0.009413 0.025236 0.000550 0.014510 -0.002370 -0.003798
10 0.283917 -0.025611 0.037586 0.105342 0.008823 0.055495 -0.008976 0.028765 0.002213 -0.007174
11 0.132823 -0.004744 0.012583 0.118708 0.024965 0.015015 0.008671 -0.035222 0.010408 -0.004303
12 -0.146030 0.049516 -0.018497 0.045019 0.025097 0.009829 0.008813 -0.035832 -0.011225 -0.003336
13 -0.294127 0.112916 0.057625 0.004442 -0.000603 0.000696 -0.005702 0.003505 0.000959 0.009543
14 -0.484048 0.198994 0.039234 -0.002232 0.013751 -0.023892 -0.018458 0.007179 -0.010533 0.000220
15 -0.024532 -0.072044 -0.062595 0.091360 -0.002514 -0.017144 -0.006661 -0.005209 -0.003710 -0.000025
16 0.087239 0.018951 0.041797 0.051972 0.006826 0.027721 0.009518 -0.000653 0.007880 -0.010206
17 0.064074 0.050233 0.034363 -0.001429 -0.006952 0.028045 0.001014 0.005736 0.009154 0.011074
18 -0.175658 -0.031660 -0.022259 -0.051654 -0.009812 0.016842 -0.007928 -0.000900 -0.003654 0.000715
19 -0.118583 0.094621 0.002621 -0.040202 -0.003955 0.023443 -0.001712 -0.012833 -0.005950 0.008221
20 0.017487 -0.016868 -0.016008 -0.049640 -0.001150 0.001089 0.005482 -0.039279 0.008824 -0.007078
21 -0.055181 0.019534 0.022359 0.111406 -0.019072 -0.056374 0.009340 -0.026926 0.006666 -0.001631
22 -0.064150 0.012390 -0.003453 -0.052621 0.000012 -0.008672 0.006181 -0.038191 0.013566 -0.004884
23 -0.016391 0.027685 -0.010194 -0.051421 -0.005671 -0.007287 0.012703 -0.038833 0.007603 -0.004020
24 -0.034584 0.004758 -0.022130 -0.050043 -0.011241 0.005975 0.011260 -0.029492 0.002184 -0.006574
25 0.078784 0.057225 -0.000607 0.013622 -0.012585 0.020744 0.009094 0.003405 -0.002316 0.008422
26 -0.026699 -0.026554 -0.000362 -0.022051 -0.004145 0.027242 -0.005203 0.003113 0.001824 0.008822
27 0.011586 -0.052377 -0.037467 -0.026987 -0.013350 0.020434 -0.001479 0.000560 0.002034 0.010439
28 -0.031070 -0.023235 -0.027503 -0.049017 -0.023176 0.003589 -0.000396 0.001178 0.003202 0.009975
29 0.077652 -0.040042 -0.046276 -0.023969 -0.018739 0.021650 0.003696 0.004051 0.003042 0.005030
30 0.028432 -0.032889 -0.040399 -0.033557 -0.020945 0.011621 0.005313 -0.001489 -0.000883 0.003274
31 -0.203131 -0.082005 -0.056790 -0.055040 -0.007364 0.014242 -0.001922 -0.000860 -0.001597 0.002010
32 -0.166530 -0.082196 -0.048255 -0.098160 -0.007547 0.018499 -0.010616 0.002018 0.000532 0.010185
33 -0.211403 -0.085282 -0.046087 -0.046889 -0.006093 0.012115 -0.006618 -0.004361 0.001505 0.003703
34 -0.181405 -0.089423 -0.071121 -0.037834 -0.004776 0.016990 0.000507 -0.005531 -0.001940 0.005264
35 0.084050 0.025503 0.008988 -0.012449 0.021442 0.010335 -0.006182 -0.035635 0.005572 -0.001211
36 -0.023333 -0.086071 -0.065047 0.026411 -0.008683 0.012375 0.004227 0.005002 -0.007071 0.003166
37 0.111238 0.034462 -0.063162 0.052194 0.020542 -0.005429 -0.002415 -0.000091 0.000958 -0.001189
38 -0.083889 0.091520 -0.065973 0.001635 0.022613 -0.012033 -0.006220 0.011930 0.020789 0.010105
39 0.073678 0.053115 -0.067604 0.019737 0.022597 -0.004273 -0.008880 0.007210 0.009353 0.001120
40 0.389533 -0.022513 0.011400 0.087437 0.010283 0.002617 -0.013347 -0.018383 -0.004476 -0.015529
41 0.210720 0.017797 0.032465 0.057376 -0.016125 -0.020262 -0.006735 0.008942 0.003591 -0.001612
42 0.095287 0.028884 -0.026179 0.118673 0.000927 -0.024172 0.000944 0.010437 -0.011649 -0.007057
43 -0.128704 -0.056362 -0.084836 0.033274 0.006978 -0.024459 -0.004423 0.018743 0.016799 0.006014
44 0.220443 0.018592 -0.004791 0.005978 -0.010956 0.016994 -0.001134 0.024886 0.000331 0.000523
45 0.219291 0.041352 -0.091219 -0.105915 0.015803 0.001714 -0.036045 -0.019889 -0.007238 0.001229
46 0.209593 0.037354 -0.134900 -0.056759 0.017195 -0.008188 0.012568 0.015388 -0.019107 -0.028342
47 0.078167 0.097501 -0.076276 -0.126336 -0.024734 -0.051872 0.019710 0.029958 0.015477 0.002435
48 0.235977 0.033361 -0.119406 -0.070017 0.010898 -0.015425 0.000953 0.007787 -0.008169 -0.004561
49 0.284036 0.033558 -0.098485 -0.116645 0.008853 -0.015363 0.001648 0.027408 -0.001831 -0.005951
50 0.080232 0.001823 0.121950 -0.041355 0.002369 -0.011051 0.013891 -0.006345 -0.003725 0.000920
51 0.205620 -0.057630 0.052606 0.022914 0.022101 -0.017622 0.016280 0.005756 0.000442 0.026748
52 0.171579 -0.011562 0.136744 -0.018492 0.024405 0.003082 -0.005239 0.016616 0.002631 0.012430
53 -0.210225 -0.091700 0.129548 -0.109046 0.016759 0.008480 0.011845 0.007887 0.003853 -0.025255
54 -0.054898 -0.119394 0.119140 -0.002232 0.002841 -0.053912 -0.029537 0.005870 0.014698 -0.002829
55 -0.086952 -0.127362 -0.007576 0.057361 0.025239 -0.034217 0.019641 -0.002755 -0.026141 0.029705
56 -0.027305 -0.000624 0.156660 -0.170706 -0.016703 -0.020226 -0.000228 0.014700 -0.014637 -0.018372
57 0.101487 -0.059282 0.087219 -0.060648 0.004720 -0.015955 -0.009277 -0.006570 -0.013957 0.014703
58 0.283434 0.021621 0.105741 -0.080168 0.017565 0.007559 -0.000634 0.000637 0.000963 0.013142
59 0.070593 -0.037455 0.086341 -0.103647 -0.001638 0.005275 0.004029 -0.002201 0.001811 -0.015185

In [13]:
help(ho.nipalsPLS1.X_scores)


Help on function X_scores in module hoggorm.plsr1:

X_scores(self)
    Returns array holding scores of array X. First column holds scores
    for component 1, second column holds scores for component 2, etc.


In [14]:
# Dimension of the X_scores
np.shape(model.X_scores())


Out[14]:
(60, 10)

We see that the numpy array holds the scores for all 60 gasoline samples across the ten components requested when computing the PLSR model.
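
To show where a scores array of this shape comes from, here is a minimal sketch of a textbook NIPALS PLS1 algorithm in numpy. The function pls1_nipals is a hypothetical helper written for this example and run on made-up data. It is not hoggorm's implementation, and the scaling conventions for scores and loadings may differ from those used in hoggorm.

```python
import numpy as np

def pls1_nipals(X, y, num_comp):
    """Textbook NIPALS PLS1 sketch (hypothetical helper, not hoggorm's code)."""
    Xr = X - X.mean(axis=0)            # mean centre X (no scaling)
    yr = y - y.mean()                  # mean centre y
    n_obj, n_var = Xr.shape
    T = np.zeros((n_obj, num_comp))    # X scores, one column per component
    P = np.zeros((n_var, num_comp))    # X loadings, one column per component
    W = np.zeros((n_var, num_comp))    # X loading weights
    q = np.zeros(num_comp)             # y loadings
    for a in range(num_comp):
        w = Xr.T @ yr                  # direction in X most covariant with y
        w /= np.linalg.norm(w)
        t = Xr @ w                     # scores for this component
        tt = t @ t
        p = Xr.T @ t / tt              # X loadings for this component
        q[a] = yr @ t / tt             # y loading for this component
        Xr = Xr - np.outer(t, p)       # deflate X
        yr = yr - q[a] * t             # deflate y
        T[:, a], P[:, a], W[:, a] = t, p, w
    # Regression coefficients for mean centred X
    b = W @ np.linalg.solve(P.T @ W, q)
    return T, P, W, q, b

rng = np.random.default_rng(1)
demo_X = rng.normal(size=(12, 4))
demo_y = demo_X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=12)

T, P, W, q, b = pls1_nipals(demo_X, demo_y, num_comp=2)
print(T.shape)   # (number of objects, number of components)
print(P.shape)   # (number of variables, number of components)
```

In this sketch a prediction for new, equally centred data would be computed as (X_new - mean of X) @ b + mean of y.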


In [15]:
# Get X loadings and store in numpy array
X_loadings = model.X_loadings()

# Get X loadings and store in pandas dataframe with row and column names
X_loadings_df = pd.DataFrame(model.X_loadings())
X_loadings_df.index = X_varNames
X_loadings_df.columns = ['Comp {0}'.format(x+1) for x in range(model.X_loadings().shape[1])]
X_loadings_df


Out[15]:
Comp 1 Comp 2 Comp 3 Comp 4 Comp 5 Comp 6 Comp 7 Comp 8 Comp 9 Comp 10
0 -0.011918 0.007621 0.019273 0.047304 0.019683 -0.045471 0.000507 0.013446 -0.041192 0.024507
1 -0.011278 0.009904 0.019503 0.045656 0.015488 -0.045307 -0.012627 0.014525 -0.036827 0.024944
2 -0.012010 0.011793 0.018865 0.045780 0.011099 -0.042409 -0.006870 0.015516 -0.034478 0.034811
3 -0.013388 0.016378 0.017969 0.047444 0.004525 -0.036736 -0.004848 0.005382 -0.029365 0.020440
4 -0.014011 0.017712 0.019152 0.043833 0.007695 -0.041951 0.005582 0.008492 -0.028519 0.028966
5 -0.015115 0.019959 0.017798 0.044347 0.013455 -0.040313 -0.001962 0.011815 -0.031038 0.038807
6 -0.014493 0.021493 0.022897 0.041778 0.026043 -0.044737 0.006565 0.012279 -0.038584 0.026981
7 -0.015248 0.018863 0.017406 0.044447 0.025468 -0.039919 -0.000440 0.018211 -0.022834 0.039156
8 -0.014415 0.011660 0.020497 0.043304 0.019171 -0.033132 0.003315 0.011121 -0.020662 0.028141
9 -0.013795 0.002562 0.016927 0.043338 0.024038 -0.041173 -0.001926 0.015344 -0.016366 0.027566
10 -0.013790 -0.003899 0.015857 0.042066 0.013236 -0.038216 -0.005971 0.017316 -0.033056 0.027413
11 -0.013687 -0.008479 0.016984 0.042490 0.012955 -0.040393 -0.007965 0.013970 -0.028006 0.036972
12 -0.013461 -0.008926 0.018095 0.043683 0.013601 -0.043129 -0.009765 0.013411 -0.026968 0.030471
13 -0.012640 -0.008807 0.019135 0.042502 0.010309 -0.044501 -0.009716 0.016124 -0.034152 0.028230
14 -0.012441 -0.006556 0.017637 0.043464 0.006883 -0.041923 -0.011984 0.015757 -0.029139 0.029123
15 -0.012013 -0.005794 0.013648 0.042672 0.005461 -0.042046 -0.015081 0.022979 -0.031200 0.031476
16 -0.011423 -0.005236 0.014871 0.042514 0.005769 -0.042289 -0.021372 0.021102 -0.028704 0.036060
17 -0.011547 -0.003946 0.015183 0.042366 0.007537 -0.045050 -0.021718 0.023865 -0.032366 0.024337
18 -0.011145 -0.002299 0.014665 0.043410 0.001490 -0.041701 -0.020530 0.022133 -0.041722 0.032589
19 -0.010674 0.000046 0.014342 0.042640 0.004416 -0.042562 -0.019600 0.021508 -0.035705 0.026248
20 -0.010629 0.000876 0.013993 0.042652 0.006213 -0.042727 -0.022673 0.020701 -0.031429 0.029811
21 -0.010299 0.001906 0.014061 0.043225 0.003777 -0.042738 -0.014787 0.022905 -0.034910 0.026876
22 -0.009597 0.002628 0.015718 0.042848 0.003187 -0.040629 -0.019261 0.018771 -0.039871 0.025547
23 -0.009332 0.002630 0.014744 0.044176 0.002936 -0.041377 -0.021044 0.021359 -0.040188 0.030496
24 -0.009256 0.003219 0.015878 0.045761 0.002463 -0.038715 -0.022125 0.021466 -0.035265 0.029375
25 -0.009123 0.003242 0.015716 0.044611 0.003426 -0.038458 -0.019327 0.020956 -0.038139 0.022171
26 -0.008415 0.003621 0.015806 0.043653 0.008996 -0.039841 -0.007806 0.022211 -0.031418 0.013794
27 -0.008701 0.003035 0.012569 0.044802 0.009448 -0.037943 -0.018915 0.026775 -0.026545 0.022653
28 -0.008507 0.003281 0.013535 0.045041 0.005423 -0.035869 -0.017230 0.028404 -0.034419 0.023682
29 -0.008373 0.004293 0.014244 0.045612 0.009357 -0.035782 -0.017362 0.025154 -0.031636 0.018838
... ... ... ... ... ... ... ... ... ... ...
371 0.058714 -0.138665 0.003552 0.071332 0.050123 0.070982 -0.079044 0.004860 -0.021090 -0.072115
372 0.075047 -0.119426 0.003188 0.068639 0.039583 0.045919 -0.077109 0.020342 -0.011821 -0.062951
373 0.084855 -0.105782 0.003808 0.071787 0.033679 0.032464 -0.066627 0.041360 -0.014921 -0.069034
374 0.090239 -0.096044 -0.001161 0.070656 -0.001297 0.043575 -0.059182 0.030187 0.020288 -0.089983
375 0.095792 -0.094667 -0.003714 0.070969 -0.011166 0.040796 -0.046323 0.027112 0.022741 -0.086770
376 0.098699 -0.100842 0.000317 0.071957 -0.020062 0.057465 -0.044506 0.024234 0.080379 -0.068341
377 0.108262 -0.102346 -0.002461 0.073821 -0.011112 0.063105 -0.037656 0.013802 0.050518 -0.088610
378 0.126057 -0.107993 0.004563 0.078986 -0.011223 0.054047 -0.054342 0.022670 0.026558 -0.051869
379 0.149611 -0.118343 -0.000500 0.070456 0.008814 0.024194 -0.052003 0.024639 -0.064112 -0.108326
380 0.173272 -0.125030 0.021623 0.075129 -0.045754 0.035763 -0.048247 0.041362 0.085941 -0.039957
381 0.200316 -0.125798 0.033334 0.084225 -0.073860 0.033662 -0.041959 0.053171 0.114114 -0.050683
382 0.241332 -0.130074 0.039180 0.081087 -0.027109 0.025764 -0.053117 0.055647 -0.016971 0.043849
383 0.267735 -0.125140 0.018419 0.072000 -0.042601 -0.062039 -0.013058 0.104007 -0.041720 0.067759
384 0.284640 -0.133377 0.039743 0.075344 -0.063367 -0.106423 -0.001096 0.098335 0.064889 -0.085466
385 0.289782 -0.123593 -0.003505 0.054858 -0.087871 -0.087095 0.075877 0.035098 -0.016161 -0.119966
386 0.272392 -0.134044 0.031182 0.079940 -0.159108 -0.041620 0.190766 -0.010411 0.231032 0.012578
387 0.264021 -0.146889 0.050151 0.088004 -0.179262 0.029731 0.267434 -0.137610 0.001129 0.121097
388 0.238007 -0.128394 -0.026578 0.055632 -0.136168 -0.013453 0.154353 -0.149939 0.016601 -0.178122
389 0.203524 -0.130341 -0.015576 0.072225 -0.109685 -0.039921 0.272150 -0.213191 0.118227 0.181658
390 0.175706 -0.109369 -0.033290 0.089568 0.000557 -0.050968 0.317260 -0.330860 0.096929 0.380165
391 0.140161 -0.080590 -0.062353 0.099477 -0.245879 0.117496 0.222701 -0.280111 0.012480 0.088650
392 0.123478 -0.046574 -0.094875 0.062064 0.144741 -0.169050 0.193390 -0.152980 -0.102499 -0.209529
393 0.110848 -0.017989 -0.155051 0.059228 0.062057 -0.173169 0.119081 -0.274974 -0.261745 -0.010477
394 0.071245 0.008760 -0.217507 0.157241 0.088852 0.090935 0.488841 -0.318447 -0.480456 0.774179
395 0.034641 0.118304 -0.403385 0.222151 -0.389310 0.338697 -0.196545 -0.437624 0.686514 -0.276956
396 0.002153 0.128216 -0.448989 0.160797 -0.592549 0.503738 -0.433831 0.656736 -0.501059 0.190095
397 0.054906 0.127084 -0.448479 0.079261 0.658194 -0.555266 -0.235754 0.163366 0.194632 0.012960
398 0.036269 0.093630 -0.402423 0.043285 0.048608 -0.228573 0.677732 0.088261 -0.226257 -0.553841
399 0.002972 0.031740 -0.329555 0.180450 0.065504 -0.076121 -0.302295 0.207710 -0.350270 0.687404
400 0.017259 0.088132 -0.373250 0.138630 0.300312 -0.077109 -0.014528 -0.157472 0.452258 -0.253576

401 rows × 10 columns


In [16]:
help(ho.nipalsPLS1.X_loadings)


Help on function X_loadings in module hoggorm.plsr1:

X_loadings(self)
    Returns array holding loadings of array X. Rows represent variables
    and columns represent components. First column holds loadings for
    component 1, second column holds scores for component 2, etc.


In [17]:
np.shape(model.X_loadings())


Out[17]:
(401, 10)

Here we see that the array holds the loadings for the 401 variables in the data across ten components.


In [19]:
# Get Y loadings and store in numpy array
Y_loadings = model.Y_loadings()

# Get Y loadings and store in pandas dataframe with row and column names
Y_loadings_df = pd.DataFrame(model.Y_loadings())
Y_loadings_df.index = y_varNames
Y_loadings_df.columns = ['Comp {0}'.format(x+1) for x in range(model.Y_loadings().shape[1])]
Y_loadings_df


Out[19]:
Comp 1 Comp 2 Comp 3 Comp 4 Comp 5 Comp 6 Comp 7 Comp 8 Comp 9 Comp 10
0 4.65396 18.228837 4.161681 1.186265 7.689862 3.58874 5.272935 1.773517 5.300957 3.147329

In [20]:
# Get X correlation loadings and store in numpy array
X_corrloadings = model.X_corrLoadings()

# Get X correlation loadings and store in pandas dataframe with row and column names
X_corrloadings_df = pd.DataFrame(model.X_corrLoadings())
X_corrloadings_df.index = X_varNames
X_corrloadings_df.columns = ['Comp {0}'.format(x+1) for x in range(model.X_corrLoadings().shape[1])]
X_corrloadings_df


Out[20]:
Comp 1 Comp 2 Comp 3 Comp 4 Comp 5 Comp 6 Comp 7 Comp 8 Comp 9 Comp 10
0 -0.492342 0.112736 0.275011 0.747324 0.071349 -0.216682 0.001182 0.053511 -0.079134 0.057469
1 -0.479435 0.150768 0.286394 0.742271 0.057778 -0.222187 -0.030289 0.059486 -0.072808 0.060195
2 -0.502574 0.176706 0.272678 0.732633 0.040754 -0.204713 -0.016221 0.062550 -0.067095 0.082691
3 -0.536563 0.235041 0.248755 0.727172 0.015913 -0.169837 -0.010964 0.020779 -0.054730 0.046503
4 -0.565895 0.256156 0.267199 0.677062 0.027271 -0.195453 0.012720 0.033044 -0.053567 0.066410
5 -0.588316 0.278169 0.239277 0.660098 0.045954 -0.180994 -0.004308 0.044300 -0.056180 0.085740
6 -0.565967 0.300539 0.308849 0.623901 0.089239 -0.201518 0.014465 0.046194 -0.070068 0.059809
7 -0.587579 0.260279 0.231680 0.655009 0.086117 -0.177445 -0.000957 0.067604 -0.040919 0.085653
8 -0.588198 0.170372 0.288906 0.675770 0.068644 -0.155955 0.007632 0.043717 -0.039208 0.065185
9 -0.581043 0.038642 0.246265 0.698065 0.088841 -0.200041 -0.004578 0.062259 -0.032056 0.065908
10 -0.594093 -0.060147 0.235965 0.693070 0.050038 -0.189919 -0.014516 0.071867 -0.066227 0.067041
11 -0.580394 -0.128751 0.248767 0.689041 0.048203 -0.197584 -0.019058 0.057067 -0.055228 0.088994
12 -0.563905 -0.133900 0.261845 0.699831 0.049997 -0.208418 -0.023080 0.054123 -0.052537 0.072460
13 -0.546685 -0.136404 0.285871 0.703018 0.039126 -0.222028 -0.023710 0.067185 -0.068693 0.069311
14 -0.541917 -0.102256 0.265354 0.724006 0.026308 -0.210640 -0.029452 0.066118 -0.059024 0.072008
15 -0.539551 -0.093182 0.211738 0.732961 0.021523 -0.217840 -0.038218 0.099428 -0.065167 0.080251
16 -0.520392 -0.085414 0.233998 0.740665 0.023060 -0.222230 -0.054934 0.092610 -0.060810 0.093251
17 -0.523367 -0.064035 0.237691 0.734327 0.029976 -0.235530 -0.055539 0.104200 -0.068218 0.062614
18 -0.505800 -0.037356 0.229898 0.753426 0.005935 -0.218313 -0.052572 0.096769 -0.088055 0.083956
19 -0.496709 0.000758 0.230539 0.758837 0.018030 -0.228471 -0.051463 0.096420 -0.077268 0.069335
20 -0.495125 0.014618 0.225153 0.759842 0.025397 -0.229595 -0.059595 0.092901 -0.068085 0.078830
21 -0.480005 0.031810 0.226368 0.770473 0.015449 -0.229783 -0.038889 0.102847 -0.075669 0.071106
22 -0.455784 0.044686 0.257847 0.778236 0.013283 -0.222584 -0.051615 0.085883 -0.088059 0.068872
23 -0.437635 0.044164 0.238834 0.792272 0.012080 -0.223832 -0.055684 0.096494 -0.087644 0.081182
24 -0.425852 0.053023 0.252324 0.805149 0.009942 -0.205467 -0.057433 0.095139 -0.075451 0.076715
25 -0.429266 0.054623 0.255427 0.802737 0.014147 -0.208739 -0.051310 0.094990 -0.083453 0.059217
26 -0.406557 0.062641 0.263783 0.806603 0.038140 -0.222049 -0.021281 0.103384 -0.070592 0.037832
27 -0.415282 0.051872 0.207219 0.817740 0.039568 -0.208900 -0.050938 0.123107 -0.058918 0.061372
28 -0.406148 0.056088 0.223213 0.822394 0.022718 -0.197549 -0.046416 0.130644 -0.076422 0.064183
29 -0.397845 0.073042 0.233776 0.828779 0.039013 -0.196114 -0.046545 0.115135 -0.069902 0.050807
... ... ... ... ... ... ... ... ... ... ...
371 0.710942 -0.601220 0.014856 0.330315 0.053256 0.099145 -0.054003 0.005670 -0.011876 -0.049567
372 0.825925 -0.470635 0.012119 0.288891 0.038226 0.058296 -0.047882 0.021567 -0.006050 -0.039327
373 0.870515 -0.388587 0.013496 0.281640 0.030318 0.038418 -0.038567 0.040876 -0.007118 -0.040202
374 0.894801 -0.341021 -0.003977 0.267939 -0.001128 0.049842 -0.033112 0.028837 0.009355 -0.050650
375 0.906630 -0.320830 -0.012143 0.256873 -0.009274 0.044540 -0.024738 0.024720 0.010009 -0.046618
376 0.903546 -0.330565 0.001002 0.251920 -0.016116 0.060684 -0.022989 0.021372 0.034219 -0.035514
377 0.914938 -0.309716 -0.007185 0.238588 -0.008241 0.061520 -0.017956 0.011237 0.019855 -0.042509
378 0.928162 -0.284730 0.011606 0.222412 -0.007251 0.045905 -0.022576 0.016080 0.009094 -0.021679
379 0.943891 -0.267350 -0.001090 0.169993 0.004880 0.017608 -0.018512 0.014975 -0.018810 -0.038795
380 0.953180 -0.246286 0.041088 0.158055 -0.022086 0.022694 -0.014975 0.021920 0.021986 -0.012477
381 0.959943 -0.215864 0.055178 0.154356 -0.031058 0.018608 -0.011345 0.024547 0.025431 -0.013787
382 0.970720 -0.187347 0.054436 0.124733 -0.009568 0.011954 -0.012055 0.021563 -0.003175 0.010012
383 0.978598 -0.163785 0.023254 0.100643 -0.013663 -0.026157 -0.002693 0.036622 -0.007091 0.014059
384 0.976619 -0.163865 0.047102 0.098862 -0.019078 -0.042121 -0.000212 0.032503 0.010354 -0.016646
385 0.983128 -0.150144 -0.004107 0.071176 -0.026159 -0.034085 0.014525 0.011471 -0.002550 -0.023104
386 0.973296 -0.171505 0.038486 0.109237 -0.049887 -0.017155 0.038460 -0.003584 0.038390 0.002551
387 0.963281 -0.191903 0.063202 0.122792 -0.057391 0.012513 0.055055 -0.048367 0.000192 0.025080
388 0.966050 -0.186609 -0.037262 0.086355 -0.048498 -0.006299 0.035350 -0.058629 0.003134 -0.041040
389 0.950396 -0.217946 -0.025124 0.128982 -0.044945 -0.021504 0.071707 -0.095906 0.025674 0.048153
390 0.925504 -0.206283 -0.060568 0.180425 0.000258 -0.030968 0.094291 -0.167889 0.023743 0.113670
391 0.890165 -0.183273 -0.136788 0.241612 -0.137027 0.086079 0.079804 -0.171380 0.003686 0.031960
392 0.882143 -0.119144 -0.234124 0.169568 0.090737 -0.139315 0.077955 -0.105286 -0.034054 -0.084972
393 0.824393 -0.047906 -0.398315 0.168457 0.040499 -0.148563 0.049970 -0.197008 -0.090527 -0.004423
394 0.508311 0.022381 -0.536034 0.429035 0.055626 0.074840 0.196790 -0.218876 -0.159412 0.313544
395 0.181091 0.221452 -0.728396 0.444123 -0.178583 0.204243 -0.057973 -0.220389 0.166896 -0.082186
396 0.010479 0.223425 -0.754729 0.299254 -0.253033 0.282780 -0.119123 0.307885 -0.113395 0.052513
397 0.279139 0.231349 -0.787569 0.154103 0.293627 -0.325638 -0.067628 0.080011 0.046016 0.003740
398 0.221578 0.204825 -0.849210 0.101129 0.026058 -0.161081 0.233619 0.051945 -0.064281 -0.192068
399 0.020141 0.077018 -0.771391 0.467638 0.038950 -0.059503 -0.115583 0.135596 -0.110382 0.264421
400 0.113374 0.207306 -0.846919 0.348264 0.173106 -0.058430 -0.005385 -0.099652 0.138158 -0.094556

401 rows × 10 columns


In [21]:
help(ho.nipalsPLS1.X_corrLoadings)


Help on function X_corrLoadings in module hoggorm.plsr1:

X_corrLoadings(self)
    Returns array holding correlation loadings of array X. First column
    holds correlation loadings for component 1, second column holds
    correlation loadings for component 2, etc.
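Correlation loadings are usually defined as the Pearson correlation between each original variable and each score vector, which is why all values above lie between -1 and 1. The sketch below computes them by hand under that assumption. It uses synthetic data and scores from an SVD as stand-ins, the function name `correlation_loadings` is hypothetical, and hoggorm's own implementation may differ in detail.

```python
import numpy as np

def correlation_loadings(X, scores):
    """Pearson correlation between every column of X and every score vector.

    Returns an array of shape (n_variables, n_components).
    """
    Xc = X - X.mean(axis=0)
    Tc = scores - scores.mean(axis=0)
    # Cross products divided by the product of the column norms
    numerator = Xc.T @ Tc
    denominator = np.outer(np.linalg.norm(Xc, axis=0), np.linalg.norm(Tc, axis=0))
    return numerator / denominator

# Synthetic data: 20 objects, 6 variables (stand-in for the NIR spectra)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 6))

# Stand-in scores: the first two principal component scores from an SVD
Xc_demo = X_demo - X_demo.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc_demo, full_matrices=False)
scores_demo = U[:, :2] * s[:2]

corr_demo = correlation_loadings(X_demo, scores_demo)
print(corr_demo.shape)  # (6, 2)
```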


In [23]:
# Get Y correlation loadings and store in numpy array
Y_corrloadings = model.Y_corrLoadings()

# Get Y correlation loadings and store in pandas dataframe with row and column names
Y_corrloadings_df = pd.DataFrame(model.Y_corrLoadings())
Y_corrloadings_df.index = y_varNames
Y_corrloadings_df.columns = ['Comp {0}'.format(x+1) for x in range(model.Y_corrLoadings().shape[1])]
Y_corrloadings_df


Out[23]:
Comp 1 Comp 2 Comp 3 Comp 4 Comp 5 Comp 6 Comp 7 Comp 8 Comp 9 Comp 10
0 0.564836 0.792202 0.174467 0.05506 0.081895 0.050243 0.036109 0.020736 0.029919 0.021683

In [24]:
help(ho.nipalsPLS1.Y_corrLoadings)


Help on function Y_corrLoadings in module hoggorm.plsr1:

Y_corrLoadings(self)
    Returns an array holding correlation loadings of vector y. Columns
    represent components. First column for component 1, second columns for
    component 2, etc.


In [25]:
# Get calibrated explained variance of each component in X
X_calExplVar = model.X_calExplVar()

# Get calibrated explained variance in X and store in pandas dataframe with row and column names
X_calExplVar_df = pd.DataFrame(model.X_calExplVar())
X_calExplVar_df.columns = ['calibrated explained variance in X']
X_calExplVar_df.index = ['Comp {0}'.format(x+1) for x in range(model.X_loadings().shape[1])]
X_calExplVar_df


Out[25]:
calibrated explained variance in X
Comp 1 70.965644
Comp 2 7.594396
Comp 3 7.587184
Comp 4 9.253793
Comp 5 0.720196
Comp 6 0.847295
Comp 7 0.353865
Comp 8 0.781099
Comp 9 0.218476
Comp 10 0.387837

In [26]:
help(ho.nipalsPLS1.X_calExplVar)


Help on function X_calExplVar in module hoggorm.plsr1:

X_calExplVar(self)
    Returns a list holding the calibrated explained variance for
    each component. First number in list is for component 1, second number
    for component 2, etc.


In [27]:
# Get calibrated explained variance of each component in Y
Y_calExplVar = model.Y_calExplVar()

# Get calibrated explained variance in Y and store in pandas dataframe with row and column names
Y_calExplVar_df = pd.DataFrame(model.Y_calExplVar())
Y_calExplVar_df.columns = ['calibrated explained variance in Y']
Y_calExplVar_df.index = ['Comp {0}'.format(x+1) for x in range(model.Y_loadings().shape[1])]
Y_calExplVar_df


Out[27]:
calibrated explained variance in Y
Comp 1 31.903929
Comp 2 62.758430
Comp 3 3.043863
Comp 4 0.303157
Comp 5 0.670684
Comp 6 0.252434
Comp 7 0.130385
Comp 8 0.042997
Comp 9 0.089514
Comp 10 0.047016

In [28]:
help(ho.nipalsPLS1.Y_calExplVar)


Help on function Y_calExplVar in module hoggorm.plsr1:

Y_calExplVar(self)
    Returns list holding calibrated explained variance for each component
    in vector y.


In [29]:
# Get cumulative calibrated explained variance in X
X_cumCalExplVar = model.X_cumCalExplVar()

# Get cumulative calibrated explained variance in X and store in pandas dataframe with row and column names
X_cumCalExplVar_df = pd.DataFrame(model.X_cumCalExplVar())
X_cumCalExplVar_df.columns = ['cumulative calibrated explained variance in X']
X_cumCalExplVar_df.index = ['Comp {0}'.format(x) for x in range(model.X_loadings().shape[1] + 1)]
X_cumCalExplVar_df


Out[29]:
cumulative calibrated explained variance in X
Comp 0 0.000000
Comp 1 70.965644
Comp 2 78.560039
Comp 3 86.147224
Comp 4 95.401016
Comp 5 96.121212
Comp 6 96.968507
Comp 7 97.322372
Comp 8 98.103471
Comp 9 98.321947
Comp 10 98.709784

In [30]:
help(ho.nipalsPLS1.X_cumCalExplVar)


Help on function X_cumCalExplVar in module hoggorm.plsr1:

X_cumCalExplVar(self)
    Returns a list holding the cumulative calibrated explained variance
    for array X after each component.
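The cumulative values are the running sum of the per-component values, with a leading zero for the model with zero components. This can be checked directly with the numbers printed in Out[25] and Out[29]; the variable names below are chosen for this illustration only.

```python
import numpy as np

# Per-component calibrated explained variance in X, copied from Out[25]
x_cal_expl_var = np.array([
    70.965644, 7.594396, 7.587184, 9.253793, 0.720196,
    0.847295, 0.353865, 0.781099, 0.218476, 0.387837,
])

# Cumulative values start at zero components, hence the leading 0
x_cum_from_sum = np.concatenate(([0.0], np.cumsum(x_cal_expl_var)))

# Cumulative calibrated explained variance in X, copied from Out[29]
x_cum_reported = np.array([
    0.000000, 70.965644, 78.560039, 86.147224, 95.401016, 96.121212,
    96.968507, 97.322372, 98.103471, 98.321947, 98.709784,
])

# The two agree up to the rounding of the displayed digits
print(np.allclose(x_cum_from_sum, x_cum_reported, atol=1e-5))
```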


In [31]:
# Get cumulative calibrated explained variance in Y
Y_cumCalExplVar = model.Y_cumCalExplVar()

# Get cumulative calibrated explained variance in Y and store in pandas dataframe with row and column names
Y_cumCalExplVar_df = pd.DataFrame(model.Y_cumCalExplVar())
Y_cumCalExplVar_df.columns = ['cumulative calibrated explained variance in Y']
Y_cumCalExplVar_df.index = ['Comp {0}'.format(x) for x in range(model.Y_loadings().shape[1] + 1)]
Y_cumCalExplVar_df


Out[31]:
cumulative calibrated explained variance in Y
Comp 0 0.000000
Comp 1 31.903929
Comp 2 94.662359
Comp 3 97.706221
Comp 4 98.009378
Comp 5 98.680062
Comp 6 98.932496
Comp 7 99.062881
Comp 8 99.105879
Comp 9 99.195393
Comp 10 99.242409

In [32]:
help(ho.nipalsPLS1.Y_cumCalExplVar)


Help on function Y_cumCalExplVar in module hoggorm.plsr1:

Y_cumCalExplVar(self)
    Returns a list holding the calibrated explained variance for
    each component. First number represent zero components, second number
    one component, etc.


In [33]:
# Get cumulative calibrated explained variance for each variable in X
X_cumCalExplVar_ind = model.X_cumCalExplVar_indVar()

# Get cumulative calibrated explained variance for each variable in X and store in pandas dataframe with row and column names
X_cumCalExplVar_ind_df = pd.DataFrame(model.X_cumCalExplVar_indVar())
X_cumCalExplVar_ind_df.columns = X_varNames
X_cumCalExplVar_ind_df.index = ['Comp {0}'.format(x) for x in range(model.X_loadings().shape[1] + 1)]
X_cumCalExplVar_ind_df


Out[33]:
0 1 2 3 4 5 6 7 8 9 ... 391 392 393 394 395 396 397 398 399 400
Comp 0 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
Comp 1 24.240097 22.985802 25.258067 28.789986 32.023734 34.611605 32.031841 34.524952 34.597667 33.761071 ... 79.239438 77.817650 67.962352 25.837968 3.279380 0.010982 7.791849 4.909695 0.040567 1.285368
Comp 2 25.511034 25.258894 28.380579 34.314390 38.585346 42.349406 41.064193 41.299445 37.500323 33.910389 ... 82.598350 79.237177 68.191847 25.888058 8.183483 5.002835 13.144104 9.105016 0.633741 5.582935
Comp 3 33.074134 33.461040 35.815924 40.502320 45.724896 48.074751 50.602961 46.666995 45.847003 39.975023 ... 84.469433 84.718603 84.057319 54.621346 61.239540 61.964389 75.170550 81.220840 60.138183 77.310195
Comp 4 88.923436 88.557633 89.491000 93.380201 91.566153 91.647704 89.528204 89.570618 91.513537 88.704562 ... 90.307089 87.593934 86.895080 73.028459 80.964050 70.919706 77.545338 82.243548 82.006754 89.438978
Comp 5 89.432504 88.891468 89.657090 93.405524 91.640524 91.858881 90.324558 90.312233 91.984741 89.493828 ... 92.184729 88.417252 87.059095 73.337889 84.153236 77.322255 86.167037 82.311449 82.158466 92.435542
Comp 6 94.127622 93.828156 93.847850 96.289999 95.460724 95.134746 94.385516 93.460903 94.416941 93.495458 ... 92.925688 90.358127 89.266186 73.897996 88.324744 85.318688 96.771037 84.906173 82.512527 92.776951
Comp 7 94.127762 93.919900 93.874162 96.302019 95.476905 95.136602 94.406438 93.460995 94.422766 93.497554 ... 93.562563 90.965832 89.515888 77.770636 88.660831 86.737709 97.228385 90.363961 83.848474 92.779850
Comp 8 94.414103 94.273759 94.265408 96.345197 95.586095 95.332847 94.619826 93.918021 94.613882 93.885166 ... 96.499660 92.074354 93.397105 82.561301 93.517974 96.217036 97.868564 90.633792 85.687090 93.772908
Comp 9 95.040323 94.803860 94.715581 96.644737 95.873042 95.648461 95.110780 94.085459 94.767608 93.987924 ... 96.501018 92.190319 94.216617 85.102512 96.303396 97.502875 98.080311 91.046997 86.905507 95.681681
Comp 10 95.370587 95.166204 95.399360 96.860987 96.314073 96.383588 95.468493 94.819100 95.192511 94.422305 ... 96.603162 92.912346 94.218573 94.933475 96.978844 97.778633 98.081710 94.736023 93.897371 96.575763

11 rows × 401 columns


In [34]:
help(ho.nipalsPLS1.X_cumCalExplVar_indVar)


Help on function X_cumCalExplVar_indVar in module hoggorm.plsr1:

X_cumCalExplVar_indVar(self)
    Returns an array holding the cumulative calibrated explained variance
    for each variable in X after each component. First row represents zero
    components, second row represents one component, third row represents
    two components, etc. Columns represent variables.
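One common way to obtain explained variance per variable is to compare, column by column, the residual sum of squares after a given number of components with the total sum of squares of the centred data. The sketch below illustrates that idea on synthetic data. It assumes this definition, uses truncated SVD reconstructions as stand-ins for the PLSR reconstructions, and the function name `cum_expl_var_per_variable` is hypothetical; hoggorm's internal computation is not reproduced here.

```python
import numpy as np

def cum_expl_var_per_variable(X, X_hat_by_comp):
    """Cumulative explained variance (in percent) for each variable.

    X             : array of shape (n_objects, n_variables)
    X_hat_by_comp : dict {number of components: reconstruction of centred X}
    Returns an array with one row per model size (row 0 = zero components).
    """
    Xc = X - X.mean(axis=0)
    ss_total = np.sum(Xc ** 2, axis=0)
    rows = [np.zeros(X.shape[1])]  # zero components explain nothing
    for n_comp in sorted(X_hat_by_comp):
        residual = Xc - X_hat_by_comp[n_comp]
        ss_resid = np.sum(residual ** 2, axis=0)
        rows.append(100.0 * (1.0 - ss_resid / ss_total))
    return np.vstack(rows)

# Synthetic data: 15 objects, 4 variables
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(15, 4))
Xc_demo = X_demo - X_demo.mean(axis=0)

# Stand-in reconstructions from 1 to 4 components via truncated SVD
U, s, Vt = np.linalg.svd(Xc_demo, full_matrices=False)
X_hat_demo = {a: (U[:, :a] * s[:a]) @ Vt[:a, :] for a in range(1, 5)}

expl_demo = cum_expl_var_per_variable(X_demo, X_hat_demo)
print(expl_demo.shape)  # (5, 4)
```

As in the table above, the first row is all zeros and each column can only grow as components are added.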


In [36]:
# Get calibrated predicted Y for a given number of components

# Predicted Y from calibration using 1 component
Y_from_1_component = model.Y_predCal()[1]

# Predicted Y from calibration using 1 component stored in pandas data frame with row and column names
Y_from_1_component_df = pd.DataFrame(model.Y_predCal()[1])
Y_from_1_component_df.index = y_objNames
Y_from_1_component_df.columns = y_varNames
Y_from_1_component_df


Out[36]:
0
0 86.911106
1 84.915076
2 85.588714
3 85.416471
4 87.975384
5 86.590302
6 87.742228
7 86.500625
8 87.438444
9 87.772194
10 88.498839
11 87.795651
12 86.497881
13 85.808644
14 84.924761
15 87.063327
16 87.583506
17 87.475696
18 86.359993
19 86.625620
20 87.258882
21 86.920691
22 86.878948
23 87.101215
24 87.016549
25 87.544156
26 87.053244
27 87.231422
28 87.032902
29 87.538890
30 87.309822
31 86.232136
32 86.402475
33 86.193637
34 86.333249
35 87.568666
36 87.068910
37 87.695195
38 86.787085
39 87.520396
40 88.990372
41 88.158184
42 87.620961
43 86.578516
44 88.203435
45 88.198074
46 88.152938
47 87.541284
48 88.275727
49 88.499393
50 87.550895
51 88.134445
52 87.976021
53 86.199121
54 86.922008
55 86.772831
56 87.050422
57 87.649815
58 88.496590
59 87.506037

In [38]:
# Get calibrated predicted Y for a given number of components

# Predicted Y from calibration using 4 components
Y_from_4_component = model.Y_predCal()[4]

# Predicted Y from calibration using 4 components stored in pandas data frame with row and column names
Y_from_4_component_df = pd.DataFrame(model.Y_predCal()[4])
Y_from_4_component_df.index = y_objNames
Y_from_4_component_df.columns = y_varNames
Y_from_4_component_df


Out[38]:
0
0 85.307228
1 84.955385
2 88.247353
3 83.688355
4 88.516198
5 85.478144
6 88.919243
7 88.465154
8 88.875516
9 88.618707
10 88.313361
11 87.902363
12 87.376918
13 88.112066
14 88.712829
15 85.597926
16 88.164550
17 88.532691
18 85.628950
19 88.313662
20 86.825890
21 87.501976
22 87.028014
23 87.502452
24 86.951810
25 88.600936
26 86.541542
27 86.088715
28 86.436752
29 86.587957
30 86.502362
31 84.435637
32 84.586871
33 84.391630
34 84.362303
35 88.056192
36 85.260570
37 88.122447
38 88.182776
39 88.230685
40 88.731161
41 88.685780
42 88.179312
43 85.237506
44 88.529503
45 88.446611
46 88.205114
47 88.851315
48 88.303868
49 88.562883
50 88.042584
51 87.330023
52 88.312404
53 84.937315
54 85.238764
55 84.487684
56 87.488520
57 86.860206
58 89.235679
59 87.059654

In [39]:
help(ho.nipalsPLS1.X_predCal)


Help on function X_predCal in module hoggorm.plsr1:

X_predCal(self)
    Returns a dictionary holding the predicted arrays Xhat from
    calibration after each computed component. Dictionary key represents
    order of component.
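Since the predictions come as a dictionary keyed by the number of components, it is straightforward to compute an error measure for every model size. The sketch below computes the root mean square error for each key. The data are made up for illustration, and `rmse_by_component` is a hypothetical helper, not a hoggorm function.

```python
import numpy as np

def rmse_by_component(y_true, y_pred_by_comp):
    """Root mean square error for each key of a {n_components: predictions} dict."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    return {
        n_comp: float(np.sqrt(np.mean((y_true - np.asarray(y_pred).ravel()) ** 2)))
        for n_comp, y_pred in sorted(y_pred_by_comp.items())
    }

# Made-up stand-ins: four reference values and predictions for two model sizes
y_demo = np.array([85.0, 87.0, 88.0, 86.0])
y_pred_demo = {
    1: np.array([[86.0], [86.0], [87.0], [87.0]]),  # every prediction is off by 1
    2: np.array([[85.0], [87.0], [88.0], [86.0]]),  # perfect fit
}

rmse_demo = rmse_by_component(y_demo, y_pred_demo)
print(rmse_demo)  # {1: 1.0, 2: 0.0}
```

With the real model, the measured octane values and `model.Y_predCal()` would take the place of `y_demo` and `y_pred_demo`.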


In [40]:
# Get validated explained variance of each component in X
X_valExplVar = model.X_valExplVar()

# Get validated explained variance in X and store in pandas dataframe with row and column names
X_valExplVar_df = pd.DataFrame(model.X_valExplVar())
X_valExplVar_df.columns = ['validated explained variance in X']
X_valExplVar_df.index = ['Comp {0}'.format(x+1) for x in range(model.X_loadings().shape[1])]
X_valExplVar_df


Out[40]:
validated explained variance in X
Comp 1 70.005018
Comp 2 6.590901
Comp 3 6.777896
Comp 4 11.174024
Comp 5 0.695758
Comp 6 0.784435
Comp 7 0.615478
Comp 8 0.663176
Comp 9 0.158600
Comp 10 0.356236

In [41]:
help(ho.nipalsPLS1.X_valExplVar)


Help on function X_valExplVar in module hoggorm.plsr1:

X_valExplVar(self)
    Returns a list holding the validated explained variance for X after
    each component. First number in list is for component 1, second number
    for component 2, third number for component 3, etc.


In [42]:
# Get validated explained variance of each component in Y
Y_valExplVar = model.Y_valExplVar()

# Get validated explained variance in Y and store in pandas dataframe with row and column names
Y_valExplVar_df = pd.DataFrame(model.Y_valExplVar())
Y_valExplVar_df.columns = ['validated explained variance in Y']
Y_valExplVar_df.index = ['Comp {0}'.format(x+1) for x in range(model.Y_loadings().shape[1])]
Y_valExplVar_df


Out[42]:
validated explained variance in Y
Comp 1 25.906615
Comp 2 67.986391
Comp 3 3.313440
Comp 4 0.350933
Comp 5 -0.000068
Comp 6 0.231423
Comp 7 0.194256
Comp 8 -0.165933
Comp 9 -0.280262
Comp 10 -0.038578

In [43]:
help(ho.nipalsPLS1.Y_valExplVar)


Help on function Y_valExplVar in module hoggorm.plsr1:

Y_valExplVar(self)
    Returns list holding validated explained variance for each component in
    vector y.
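Unlike calibrated explained variance, validated explained variance can be negative for a single component, as for components 5, 8, 9 and 10 above. This happens when adding the component makes the cross-validated predictions worse. The sketch below shows the mechanism with made-up numbers. It is a simplification: the baseline is the sum of squares around the overall mean, which may differ from what hoggorm uses, and `val_expl_var` is a hypothetical helper.

```python
import numpy as np

def val_expl_var(y_true, y_val_pred_by_comp):
    """Cumulative and per-component validated explained variance (percent).

    y_val_pred_by_comp : dict {number of components: cross-validated predictions}
    """
    y_true = np.asarray(y_true, dtype=float).ravel()
    ss_total = np.sum((y_true - y_true.mean()) ** 2)
    cumulative = [0.0]
    for n_comp in sorted(y_val_pred_by_comp):
        # PRESS: sum of squared cross-validated prediction errors
        press = np.sum((y_true - np.ravel(y_val_pred_by_comp[n_comp])) ** 2)
        cumulative.append(100.0 * (1.0 - press / ss_total))
    cumulative = np.array(cumulative)
    # Per-component values are the increments of the cumulative curve
    return cumulative, np.diff(cumulative)

# Made-up reference values and cross-validated predictions
y_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_val_demo = {
    1: np.array([2.0, 2.0, 3.0, 4.0, 4.0]),  # PRESS = 2
    2: np.array([2.0, 2.0, 3.0, 4.0, 3.0]),  # PRESS = 5, worse than with 1 component
}

cum_demo, per_comp_demo = val_expl_var(y_demo, y_val_demo)
print(cum_demo, per_comp_demo)
```

In this toy case the second component lowers the cumulative value, so its own contribution is negative.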


In [44]:
# Get cumulative validated explained variance in X
X_cumValExplVar = model.X_cumValExplVar()

# Get cumulative validated explained variance in X and store in pandas dataframe with row and column names
X_cumValExplVar_df = pd.DataFrame(model.X_cumValExplVar())
X_cumValExplVar_df.columns = ['cumulative validated explained variance in X']
X_cumValExplVar_df.index = ['Comp {0}'.format(x) for x in range(model.X_loadings().shape[1] + 1)]
X_cumValExplVar_df


Out[44]:
cumulative validated explained variance in X
Comp 0 0.000000
Comp 1 70.005018
Comp 2 76.595919
Comp 3 83.373815
Comp 4 94.547839
Comp 5 95.243597
Comp 6 96.028032
Comp 7 96.643510
Comp 8 97.306686
Comp 9 97.465286
Comp 10 97.821522

In [45]:
help(ho.nipalsPLS1.X_cumValExplVar)


Help on function X_cumValExplVar in module hoggorm.plsr1:

X_cumValExplVar(self)
    Returns a list holding the cumulative validated explained variance
    for array X after each component. First number represents zero
    components, second number represents component 1, etc.


In [46]:
# Get cumulative validated explained variance in Y
Y_cumValExplVar = model.Y_cumValExplVar()

# Get cumulative validated explained variance in Y and store in pandas dataframe with row and column names
Y_cumValExplVar_df = pd.DataFrame(model.Y_cumValExplVar())
Y_cumValExplVar_df.columns = ['cumulative validated explained variance in Y']
Y_cumValExplVar_df.index = ['Comp {0}'.format(x) for x in range(model.Y_loadings().shape[1] + 1)]
Y_cumValExplVar_df


Out[46]:
cumulative validated explained variance in Y
Comp 0 0.000000
Comp 1 25.906615
Comp 2 93.893006
Comp 3 97.206446
Comp 4 97.557378
Comp 5 97.557310
Comp 6 97.788734
Comp 7 97.982990
Comp 8 97.817057
Comp 9 97.536794
Comp 10 97.498216

In [47]:
help(ho.nipalsPLS1.Y_cumValExplVar)


Help on function Y_cumValExplVar in module hoggorm.plsr1:

Y_cumValExplVar(self)
    Returns list holding cumulative validated explained variance in
    vector y.
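The cumulative validated explained variance is a natural guide for choosing the number of components. The sketch below applies two simple rules to the values printed in Out[46]: the model size with the highest value, and the smallest model size that comes within a chosen tolerance of that maximum. The tolerance of 1.0 percentage point is an arbitrary assumption made for illustration, not a recommendation from hoggorm.

```python
import numpy as np

# Cumulative validated explained variance in Y, copied from Out[46]
y_cum_val = np.array([
    0.000000, 25.906615, 93.893006, 97.206446, 97.557378, 97.557310,
    97.788734, 97.982990, 97.817057, 97.536794, 97.498216,
])

# Index position equals the number of components because entry 0 is "Comp 0"
n_comp_max = int(np.argmax(y_cum_val))

# Hypothetical parsimony rule: smallest model within `tolerance` percentage
# points of the maximum. The value 1.0 is an arbitrary choice for illustration.
tolerance = 1.0
n_comp_parsimonious = int(np.argmax(y_cum_val >= y_cum_val.max() - tolerance))

print(n_comp_max, n_comp_parsimonious)  # 7 3
```

The maximum is reached at seven components, but three components already come within one percentage point of it.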


In [48]:
help(ho.nipalsPLS1.X_cumValExplVar_indVar)


Help on function X_cumValExplVar_indVar in module hoggorm.plsr1:

X_cumValExplVar_indVar(self)
    Returns an array holding the cumulative validated explained variance
    for each variable in X after each component. First row represents
    zero components, second row represents component 1, third row for
    component 2, etc. Columns represent variables.


In [50]:
# Get validated predicted Y for a given number of components

# Predicted Y from validation using 1 component
Y_from_1_component_val = model.Y_predVal()[1]

# Predicted Y from validation using 1 component stored in pandas data frame with row and column names
Y_from_1_component_val_df = pd.DataFrame(model.Y_predVal()[1])
Y_from_1_component_val_df.index = y_objNames
Y_from_1_component_val_df.columns = y_varNames
Y_from_1_component_val_df


Out[50]:
0
0 86.961086
1 84.799394
2 85.360086
3 85.562317
4 87.970243
5 86.633192
6 87.706994
7 86.446235
8 87.407556
9 87.751783
10 88.486322
11 87.776305
12 86.475237
13 85.672061
14 84.335593
15 87.109199
16 87.559086
17 87.448141
18 86.388366
19 86.571099
20 87.263892
21 86.916732
22 86.876119
23 87.098383
24 87.020274
25 87.519029
26 87.061913
27 87.255444
28 87.050991
29 87.560862
30 87.328909
31 86.302304
32 86.467681
33 86.255305
34 86.400057
35 87.556554
36 87.113383
37 87.672371
38 86.748813
39 87.495482
40 89.056119
41 88.153773
42 87.597954
43 86.621355
44 88.197323
45 88.190028
46 88.152326
47 87.488923
48 88.263224
49 88.483869
50 87.529054
51 88.154940
52 87.951980
53 86.296864
54 86.999657
55 86.849740
56 87.046918
57 87.679685
58 88.420948
59 87.515158

In [52]:
# Get validated predicted Y for a given number of components

# Predicted Y from validation using 3 components
Y_from_3_component_val = model.Y_predVal()[3]

# Predicted Y from validation using 3 components stored in pandas data frame with row and column names
Y_from_3_component_val_df = pd.DataFrame(model.Y_predVal()[3])
Y_from_3_component_val_df.index = y_objNames
Y_from_3_component_val_df.columns = y_varNames
Y_from_3_component_val_df


Out[52]:
0
0 85.225526
1 84.817206
2 88.156217
3 83.672440
4 88.437696
5 85.321255
6 88.819473
7 88.441818
8 88.804853
9 88.529461
10 88.082473
11 87.700380
12 87.316608
13 88.105773
14 88.613540
15 85.507916
16 88.067734
17 88.509267
18 85.710661
19 88.332718
20 86.884617
21 87.386167
22 87.091597
23 87.578450
24 87.022201
25 88.572870
26 86.570678
27 86.138316
28 86.515033
29 86.634382
30 86.559881
31 84.517724
32 84.686033
33 84.435133
34 84.413813
35 88.059026
36 85.253905
37 88.036239
38 88.149563
39 88.177964
40 88.633635
41 88.632028
42 88.042969
43 85.226057
44 88.515674
45 88.648215
46 88.329862
47 89.030301
48 88.385225
49 88.732544
50 88.066707
51 87.281823
52 88.308413
53 84.855800
54 85.340004
55 84.507763
56 87.729642
57 86.952488
58 89.246400
59 87.161736

In [53]:
help(ho.nipalsPLS1.Y_predVal)


Help on function Y_predVal in module hoggorm.plsr1:

Y_predVal(self)
    Returns dictionary holding arrays of predicted yhat after each
    component from validation. Dictionary key represents order of component.


In [54]:
# Get predicted scores for new measurements (objects) of X

# First pretend that we acquired new X data by using part of the existing data and overlaying some noise
import numpy.random as npr
new_X = X[0:4, :] + npr.rand(4, np.shape(X)[1])

# Now insert the new data into the existing model and compute scores for two components (numComp=2)
pred_X_scores = model.X_scores_predict(new_X, numComp=2)

# Same as above, but results stored in a pandas dataframe with row names and column names
pred_X_scores_df = pd.DataFrame(model.X_scores_predict(new_X, numComp=2))
pred_X_scores_df.columns = ['Comp {0}'.format(x+1) for x in range(2)]
pred_X_scores_df.index = ['new object {0}'.format(x+1) for x in range(np.shape(new_X)[0])]
pred_X_scores_df


Out[54]:
Comp 1 Comp 2
new object 1 -1.618189 -1.009423
new object 2 -2.039836 -0.667632
new object 3 -2.103844 -0.466693
new object 4 -2.668126 -0.443593

In [55]:
help(ho.nipalsPLS1.X_scores_predict)


Help on function X_scores_predict in module hoggorm.plsr1:

X_scores_predict(self, Xnew, numComp=None)
    Returns array of X scores from new X data using the existing model.
    Rows represent objects and columns represent components.


In [56]:
# Predict Y from new X data
pred_Y = model.Y_predict(new_X, numComp=2)

# Predict Y from new X data and store results in a pandas dataframe with row names and column names
pred_Y_df = pd.DataFrame(model.Y_predict(new_X, numComp=2))
pred_Y_df.columns = y_varNames
pred_Y_df.index = ['new object {0}'.format(x+1) for x in range(np.shape(new_X)[0])]
pred_Y_df

