Example on the use of correspondence tables

This simple example shows how a vector expressed in one classification is converted into another classification.

The first classification has four categories: A, B, C, D
The second classification has three categories: 1, 2, 3



In [30]:

    
import numpy as np
import pandas as pd

Let's create an arbitrary classification matrix (CM)



In [59]:

    
# Rows: categories of the first classification; columns: categories of the second
CM = pd.DataFrame.from_dict({'A': [1, 0, 0], 'B': [0, 1, 0], 'C': [0, 0, 1], 'D': [0, 0, 1]},
                            orient='index', columns=['1', '2', '3'])
display(CM)

Notice that moving from the first classification to the second one is possible because every row total equals 1, i.e. each category of the first classification is allocated in full to the second (the reverse direction is examined further below).



In [54]:

    
# Row totals; work on a copy so the extra 'total' column is not added to CM itself
CM_tot2 = CM.sum(axis=1)
C2 = CM.copy()
C2['total'] = CM_tot2
display(C2)
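
The row-total condition can also be checked programmatically rather than by eye. The following is a minimal sketch; the helper name `rows_sum_to_one` and the lowercase `cm` are assumptions introduced for illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd


def rows_sum_to_one(cm: pd.DataFrame) -> bool:
    """Return True if every row of the correspondence matrix sums to 1."""
    return bool(np.allclose(cm.sum(axis=1), 1.0))


# Same correspondence matrix as above: rows A-D, columns 1-3.
cm = pd.DataFrame(
    [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]],
    index=list("ABCD"),
    columns=["1", "2", "3"],
)

print(rows_sum_to_one(cm))    # True: A-D each map fully to one target category
print(rows_sum_to_one(cm.T))  # False: in the transposed matrix, row "3" sums to 2
```

Applying the same check to the transposed matrix anticipates the problem with the reverse direction discussed later.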

Let's create an arbitrary vector classified according to the first classification



In [56]:

    
# Four random integer amounts in [0, 10), one per category of the first classification
V1 = np.random.randint(0, 10, size=(4, 1))
Class_A = list('ABCD')
V1_A = pd.DataFrame(V1, index=Class_A, columns=['amount'])
display(V1_A)

This vector is converted into the second classification by multiplying its transpose by CM



In [60]:

    
# Row vector (1 x 4) times CM (4 x 3) gives the amounts in the second classification (1 x 3)
V1_A_transp = V1_A.T
# Use only the three category columns of CM
V1_B = V1_A_transp.dot(CM[['1', '2', '3']])
display(V1_B)
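
Since the vector above is random, the result changes on every run. The same conversion can be sketched with fixed numbers so that it can be followed by hand. The amounts in `v_first` are illustrative assumptions, and the names `cm`, `v_first` and `v_second` are chosen only for this sketch.

```python
import pandas as pd

# Correspondence matrix: rows = first classification, columns = second.
cm = pd.DataFrame(
    [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]],
    index=list("ABCD"),
    columns=["1", "2", "3"],
)

# Illustrative amounts (assumed values) in the first classification.
v_first = pd.Series([5, 2, 4, 3], index=list("ABCD"), name="amount")

# Each target category collects the amounts of the source categories mapped to it:
# "1" <- A, "2" <- B, "3" <- C + D.
v_second = v_first.dot(cm)
print(v_second.tolist())  # [5, 2, 7]

# Because every row of cm sums to 1, the grand total is preserved.
print(v_first.sum() == v_second.sum())  # True
```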

Moving from the second classification to the first one may cause problems, since the "totals" of the columns are not always 1.



In [61]:

    
# Column totals of CM, added as an extra 'Total' row
sum_rCM = CM.sum().to_frame('Total').T
CM_tot = pd.concat([CM, sum_rCM])
display(CM_tot)

In this case, moving from the second classification to the first one will duplicate the value of category "3", because it corresponds to both C and D.
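
The duplication can be made concrete with a short sketch. The amounts in `v_second` are illustrative assumptions, and `cm`, `v_second` and `v_back` are names introduced only for this example; reusing the unmodified matrix for the reverse direction is exactly the naive approach being illustrated, not a recommended method.

```python
import pandas as pd

# Correspondence matrix: rows = first classification, columns = second.
cm = pd.DataFrame(
    [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]],
    index=list("ABCD"),
    columns=["1", "2", "3"],
)

# Illustrative amounts (assumed values) in the second classification.
v_second = pd.Series([5, 2, 7], index=["1", "2", "3"], name="amount")

# Naive reverse conversion: category "3" is copied in full to both C and D.
v_back = cm.dot(v_second)
print(v_back.tolist())  # [5, 2, 7, 7]

# The grand total is no longer preserved: 14 becomes 21.
print(v_second.sum(), v_back.sum())  # 14 21
```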

Here is an example of a correspondence table



In [ ]: