Instructions
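For fast processing, you only need to change a few variables before running: the input file path, the population intensity parameters (bandwidth and weightmethod), and the output name fname.
Make sure fname is not a name already in use, or the existing file will be replaced.
With the previous steps in mind, click on the Cell menu and select Run All.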
In [1]:
# Imports
import numpy as np
np.seterr(all='ignore')
import pandas as pd
from decimal import Decimal
import time
# Import python script with Pysegreg functions
from segregationMetrics import Segreg
# Instantiate segreg as cc
cc = Segreg()
In [2]:
cc.readAttributesFile('/Users/sandrofsousa/Downloads/valid/Segreg sample.csv')
Compute Population Intensity
For a non-spatial result, comment out the function call that starts with "cc.locality = ...".
The distance matrix is calculated at this step. Change the population intensity parameters
(bandwidth and weightmethod in the call below) according to your needs.
In [3]:
start_time = time.time()
cc.locality = cc.cal_localityMatrix(bandwidth=700, weightmethod=1)
print("--- %s seconds for processing ---" % (time.time() - start_time))
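To illustrate what the bandwidth controls, here is a minimal sketch of a kernel-weighted population intensity. It assumes a Gaussian kernel and Euclidean distances. The function name gaussian_intensity and the sample data are hypothetical, and the actual cal_localityMatrix in segregationMetrics may compute its weights differently.

```python
import numpy as np

def gaussian_intensity(coords, population, bandwidth):
    """Kernel-weighted population per location (illustrative, not Segreg's code)."""
    coords = np.asarray(coords, dtype=float)          # shape (n, 2)
    population = np.asarray(population, dtype=float)  # shape (n, n_group)
    # Pairwise Euclidean distances between all locations, shape (n, n)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Gaussian weights (assumed kernel): 1 at distance 0, decaying with dist / bandwidth
    weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
    # Each row adds up the weighted population of every location, per group
    return weights @ population

# Hypothetical sample: three locations, two groups
coords = [[0.0, 0.0], [500.0, 0.0], [5000.0, 0.0]]
population = [[100, 0], [0, 100], [50, 50]]
intensity = gaussian_intensity(coords, population, bandwidth=700)
print(intensity.round(2))
```

Under these assumptions, a larger bandwidth lets distant locations contribute more to each row, while a very small bandwidth returns values close to the original population counts.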
For validation only
Remove the comment marker (#) from the lines below if you want to print the values and validate them
In [4]:
# np.set_printoptions(threshold=np.inf)
# print('Location (coordinates from data):\n', cc.location)
# print()
# print('Population intensity for all groups:\n', cc.locality)
# To select the locality for a specific line (validation), use the index [x, :],
# where x is the number of the desired line
# cc.locality[5,:]
Compute local Dissimilarity
In [5]:
diss_local = cc.cal_localDissimilarity()
diss_local = np.asmatrix(diss_local).transpose()
Compute global Dissimilarity
In [6]:
diss_global = cc.cal_globalDissimilarity()
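For intuition about the measure, the sketch below computes the classic two-group dissimilarity index, D = 0.5 * sum(|a_i/A - b_i/B|). This is a textbook, non-spatial version and not necessarily the formula used by cal_globalDissimilarity; the function name dissimilarity_index and the sample counts are hypothetical.

```python
import numpy as np

def dissimilarity_index(group_a, group_b):
    """Classic two-group dissimilarity index over areal units (illustrative)."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    # Compare each unit's share of group A with its share of group B
    return 0.5 * np.abs(a / a.sum() - b / b.sum()).sum()

separated = dissimilarity_index([100, 0], [0, 100])  # groups never share a unit
mixed = dissimilarity_index([50, 50], [30, 30])      # same distribution in every unit
print(separated, mixed)
```

The index ranges from 0 (both groups are distributed identically) to 1 (the groups never share a unit).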
Compute local Exposure/Isolation
expo is a matrix of n_group * n_group; therefore, exposure (m,n) = rs[m,n].
The columns are the exposure of m1 to n1, n2, ... n5, then m2 to n1, ... n5, and so on.
The result holds all combinations of local group exposure/isolation.
To select a specific line of m to n, use the index [x].
Each value is the result of one combination m,n,
e.g.: g1xg1, g1xg2, g2xg1, g2xg2 = isolation, exposure, exposure, isolation
In [7]:
expo_local = cc.cal_localExposure()
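The column order described above can be sketched as a row-major flattening of the (m, n) pairs. The example assumes zero-based group indices and one row per location; the array expo is a hypothetical stand-in for expo_local, and pair_column is an illustrative helper that is not part of Segreg.

```python
import numpy as np

n_group = 3
# Stand-in for expo_local: one row per location, one column per (m, n) pair
expo = np.arange(2 * n_group * n_group, dtype=float).reshape(2, n_group * n_group)

def pair_column(m, n, n_group):
    """Column of the (m, n) pair when pairs are ordered m0n0, m0n1, ..., m1n0, ..."""
    return m * n_group + n

# Isolation of group 1 (m == n) and exposure of group 1 to group 2
iso_11 = expo[:, pair_column(1, 1, n_group)]
exp_12 = expo[:, pair_column(1, 2, n_group)]

# Equivalent view: reshape each row into an n_group x n_group matrix
as_matrix = expo.reshape(-1, n_group, n_group)
print(iso_11, exp_12)
```

With this layout, the diagonal entries of each reshaped row are the isolation values and the off-diagonal entries are the exposure values.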
Compute global Exposure/Isolation
In [18]:
expo_global = cc.cal_globalExposure()
Compute local Entropy
In [19]:
entro_local = cc.cal_localEntropy()
Compute global Entropy
In [20]:
entro_global = cc.cal_globalEntropy()
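As a reference point, the sketch below computes the Shannon entropy of a group composition, E = -sum(p_m * ln(p_m)). It is a generic, non-spatial illustration; whether cal_localEntropy and cal_globalEntropy use this exact form (for example, the same logarithm base) is an assumption, and shannon_entropy is a hypothetical name.

```python
import numpy as np

def shannon_entropy(counts):
    """Shannon entropy (natural log) of group proportions (illustrative)."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum())

even = shannon_entropy([50, 50])    # two equally sized groups
single = shannon_entropy([100, 0])  # only one group present
print(even, single)
```

Entropy is highest when all groups are equally represented and drops to zero when only one group is present.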
Compute local Index H
In [21]:
idxh_local = cc.cal_localIndexH()
Compute global Index H
In [22]:
idxh_global = cc.cal_globalIndexH()
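Index H is commonly defined as the information theory index (Theil's H), which compares the entropy of each unit with the entropy of the whole area. The sketch below shows that generic, non-spatial definition; it is an assumption that cal_globalIndexH follows the same idea, and the names entropy and theil_h are hypothetical.

```python
import numpy as np

def entropy(counts):
    """Shannon entropy (natural log) of group proportions."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum())

def theil_h(population):
    """Information theory index H for a (n_units, n_groups) count matrix.

    Assumes every unit has a non-zero population and that at least
    two groups are present overall (otherwise the global entropy is 0).
    """
    pop = np.asarray(population, dtype=float)
    unit_totals = pop.sum(axis=1)
    total = unit_totals.sum()
    e_global = entropy(pop.sum(axis=0))
    e_units = np.array([entropy(row) for row in pop])
    # Population-weighted average deviation of unit entropy from global entropy
    return float((unit_totals * (e_global - e_units)).sum() / (e_global * total))

h_separated = theil_h([[100, 0], [0, 100]])
h_mixed = theil_h([[50, 50], [30, 30]])
print(h_separated, h_mixed)
```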
In [23]:
# Concatenate local values from measures
if len(cc.locality) == 0:
    results = np.concatenate((expo_local, diss_local, entro_local, idxh_local), axis=1)
else:
    results = np.concatenate((cc.locality, expo_local, diss_local, entro_local, idxh_local), axis=1)
# Concatenate the results with original data
output = np.concatenate((cc.tract_id, cc.attributeMatrix, results),axis = 1)
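Joining along axis=1 requires every piece to be two-dimensional with the same number of rows, which is presumably why the local dissimilarity is turned into a column earlier in the notebook. The small example below shows the shape requirement with made-up arrays; the variable names are hypothetical.

```python
import numpy as np

n_units = 4
ids = np.arange(n_units).reshape(-1, 1)  # shape (4, 1)
groups = np.ones((n_units, 2))           # shape (4, 2)
diss = np.array([0.1, 0.2, 0.3, 0.4])    # shape (4,), one-dimensional

# A 1-D vector must become a column before joining along axis=1
diss_col = diss.reshape(-1, 1)           # shape (4, 1)
table = np.concatenate((ids, groups, diss_col), axis=1)
print(table.shape)
```

Passing the one-dimensional diss directly would raise a ValueError, because the inputs would not have the same number of dimensions.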
In [24]:
names = ['id','x','y']
for i in range(cc.n_group):
    names.append('group_'+str(i))
if len(cc.locality) == 0:
    for i in range(cc.n_group):
        for j in range(cc.n_group):
            if i == j:
                names.append('iso_' + str(i) + str(j))
            else:
                names.append('exp_' + str(i) + str(j))
    names.append('dissimil')
    names.append('entropy')
    names.append('indexh')
else:
    for i in range(cc.n_group):
        names.append('intens_'+str(i))
    for i in range(cc.n_group):
        for j in range(cc.n_group):
            if i == j:
                names.append('iso_' + str(i) + str(j))
            else:
                names.append('exp_' + str(i) + str(j))
    names.append('dissimil')
    names.append('entropy')
    names.append('indexh')
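The two branches above differ only in the intens_ columns, so the same list can be built by one helper. This is a sketch of an equivalent, more compact form; build_column_names and its with_intensity flag are hypothetical and not part of the notebook or of Segreg.

```python
def build_column_names(n_group, with_intensity=True):
    """Column names in the same order as the notebook's output matrix."""
    names = ['id', 'x', 'y']
    names += ['group_%d' % i for i in range(n_group)]
    if with_intensity:
        # Only present when the population intensity was computed
        names += ['intens_%d' % i for i in range(n_group)]
    # iso_ for a group paired with itself, exp_ for every other pair
    names += ['iso_%d%d' % (i, j) if i == j else 'exp_%d%d' % (i, j)
              for i in range(n_group) for j in range(n_group)]
    names += ['dissimil', 'entropy', 'indexh']
    return names

spatial_names = build_column_names(2)
plain_names = build_column_names(2, with_intensity=False)
print(spatial_names)
```

In the notebook this would be called with cc.n_group and with_intensity set according to whether cc.locality is empty.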
Save Local and global results to a file
The parameter fname corresponds to the folder/filename; change it as needed.
To save in a different folder, use "/" to pass the directory.
The local results are saved under the name defined, with the "_local" postfix added to the file name.
The global results are automatically saved under the same name with the postfix "_global".
It's recommended to save in a different folder from the code, e.g. a folder named result.
The fname value should be changed for every new execution, or the previous results will be overwritten!
In [34]:
fname = "/Users/sandrofsousa/Downloads/valid/result"
output = pd.DataFrame(output, columns=names)
output.to_csv("%s_local.csv" % fname, sep=",", index=False)
with open("%s_global.txt" % fname, "w") as f:
    f.write('Global dissimilarity: ' + str(diss_global))
    f.write('\nGlobal entropy: ' + str(entro_global))
    f.write('\nGlobal Index H: ' + str(idxh_global))
    f.write('\nGlobal isolation/exposure: \n')
    f.write(str(expo_global))
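Since an existing result is replaced without warning, a small guard can be run before saving. The helper below is a sketch under the assumption that the two output files follow the "_local.csv" and "_global.txt" naming used in this cell; check_output_prefix is a hypothetical name.

```python
from pathlib import Path

def check_output_prefix(fname):
    """Raise if either result file for this prefix already exists."""
    targets = [Path("%s_local.csv" % fname), Path("%s_global.txt" % fname)]
    existing = [str(p) for p in targets if p.exists()]
    if existing:
        raise FileExistsError("Would overwrite: " + ", ".join(existing))
    return targets
```

Calling check_output_prefix(fname) right after fname is defined stops the cell before anything is written when a previous run used the same name.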
In [85]:
# code to save data as a continuous string - Marcus request for R use
# names2 = ['dissimil', 'entropy', 'indexh']
# for i in range(cc.n_group):
#     for j in range(cc.n_group):
#         if i == j:
#             names2.append('iso_' + str(i) + str(j))
#         else:
#             names2.append('exp_' + str(i) + str(j))
# values = [diss_global, entro_global, idxh_global]
# for i in expo_global: values.append(i)
# file2 = "/Users/sandrofsousa/Downloads/"
# with open("%s_global.csv" % file2, "w") as f:
#     f.write(', '.join(names2) + '\n')
#     f.write(', '.join(str(i) for i in values))