In this file we use a more complex approach. Each path comes with several variables (4 in total; each variable represents a router at the moment the user captures the information).

This raises the question:

How can we make better use of this data?

Here we present a strategy for extracting information from a time series using wavelets. We simply try to extract as much information as possible that can reveal the similarity or dissimilarity between series. To do so, we decompose each series, which exposes more of its detail.

Here we use the DWT (Discrete Wavelet Transform) with a Haar filter. Haar filters are the simplest ones, essentially unit steps, and are also used in image processing, for example in the Viola-Jones algorithm for face detection. As already mentioned, this is only one of several possible strategies. We could use the DFT (Discrete Fourier Transform), although I believe it is not a good choice, since the DWT outperforms the DFT in several papers that work with temporal data (sound, video, etc.), as in our case. We could also use other filters, or even design new ones.
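To make the Haar decomposition concrete, here is a minimal base-R sketch that does not depend on the `wavelets` package. The helper names `haar_step` and `haar_dwt` and the toy series are hypothetical, and the sign and ordering conventions may differ from those used by `dwt()`; it is only meant to show the idea of splitting a series into detail and approximation coefficients.

```r
# One level of the Haar DWT: split a series of even length into
# scaled pairwise sums (approximation, V) and differences (detail, W).
haar_step <- function(x) {
  stopifnot(length(x) %% 2 == 0)
  odd  <- x[seq(1, length(x), by = 2)]
  even <- x[seq(2, length(x), by = 2)]
  list(V = (odd + even) / sqrt(2),   # smooth / low-pass part
       W = (even - odd) / sqrt(2))   # detail / high-pass part
}

# Full decomposition: keep applying the step to the approximation
# for as long as its length can be halved.
haar_dwt <- function(x) {
  W <- list()
  while (length(x) >= 2 && length(x) %% 2 == 0) {
    s <- haar_step(x)
    W[[length(W) + 1]] <- s$W
    x <- s$V
  }
  list(W = W, V = x)
}

x <- c(4, 6, 10, 12, 8, 6, 5, 5)   # hypothetical toy series
dec <- haar_dwt(x)

# Feature vector: detail coefficients of every level, then the final approximation
features <- c(unlist(dec$W), dec$V)
print(features)
```

Because the transform is orthonormal, the sum of squares of `features` equals the sum of squares of `x`, so no information is lost; the coefficients simply describe the series at different scales.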


In [12]:
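# Wavelet transforms, model training, SVM backend and ROC analysis
library(wavelets)
library(caret)
library(kernlab)
library(pROC)

print("Target")
# Mapping of each sequence to its dataset group, and the class label of each sequence
groups <- read.csv(file="./MovementAAL/groups/MovementAAL_DatasetGroup.csv",header=TRUE,sep=",")
targetAll <- read.csv(file="./MovementAAL/dataset/MovementAAL_target.csv",header=TRUE,sep=",")
head(targetAll)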


[1] "Target"
Out[12]:
  X.sequence_ID class_label
1             1           1
2             2           1
3             3           1
4             4           1
5             5           1
6             6           1

In [2]:
#Group 1
allDataGroup1<-list()
allDataGroup1Target<-list()
groups1 <- groups[groups$dataset_ID==1, ]

# Read the RSS file of every sequence in the group and collect its class label
# (targetAll is indexed by position, so its rows must be ordered by sequence ID)
index<-1
for (id in groups1$X.sequence_ID){
    caminho <-paste("./MovementAAL/dataset/MovementAAL_RSS_",id,".csv",sep="")
    allDataGroup1[[index]]<-read.csv(file=caminho,header=TRUE,sep=",")
    allDataGroup1Target[index]<-targetAll[[2]][id]
    index<-index+1
}

# Haar DWT of the first minStepsBack time steps of each sequence;
# each sequence becomes one row of wavelet and scaling coefficients
wtData <- NULL
minStepsBack <- 17
for (i in seq_along(allDataGroup1)){
     aMatrix <- data.matrix(allDataGroup1[[i]], rownames.force = NA)
     wt <- dwt(aMatrix[1:minStepsBack,], filter="haar", boundary="periodic")
     wtData <- rbind(wtData, unlist(c(wt@W,wt@V[[wt@level]])))
}
wtData <- as.data.frame(wtData)
data <- unlist(allDataGroup1Target)
target <- factor(data,labels=c("No","Yes"))   # lower label value -> "No", higher -> "Yes"
frameDataFinal <- data.frame(cbind(target, wtData))
tail(frameDataFinal)


Out[2]:
    target           W11           W12           W13           W14           W15           W16           W17           W18           W19   ...           W37           W38           W41           W42           W43           W44           W51           W52            W6           V67
 99     No     0.1010173     -0.235707    -0.1010173      0.134689     0.1010102     0.1346897    0.06734485     0.3703896   0.006731657   ...   -0.08838835     0.1414214    -0.3004702    -0.3395225     -0.079165        0.2625     0.1111164    -0.2390728     -3.645356     -3.041544
100     No     0.1010173    -0.3703903    -0.3030448   4.98733e-18     0.1010102   4.98733e-18     0.3030448   -0.03367172     0.1979899   ...     -0.265165    0.05303301   -0.06095275    -0.1728575    -0.3958275          -0.6    -0.4427866   -0.09638396     -3.447086     -2.766606
101     No  1.303753e-17    -0.3367179     0.2020345     0.1010173   -0.06734485  6.396793e-18    -0.4713998    0.03367242     0.3676955   ...     -0.265165    -0.1414214       -0.4019     -0.049047    -0.1583325       -0.2375    -0.9936526    -0.2689534     -3.505893      -2.23875
102     No -1.637145e-17    -0.2020275     0.1010102    0.03367242     0.2020345  4.911436e-17    -0.3030518    0.03367384     0.2788051   ...    -0.6187184    -0.3712311    -0.0038105    -0.4528575    -0.1065425         0.375    -0.6071021    0.07029172     -4.076132     -1.323275
103     No    -0.2357006  6.396793e-18    0.03367242   -0.06734485     0.1010173    -0.1010173     0.1683558     0.3030448     0.4363839   ...    -0.1414214   -0.08838835   -0.01523525    -0.1733325    -0.1845275        0.3625     -0.226944    -0.3202999     -4.378155     -1.443873
104     No     0.1683551     0.6060895  9.730714e-18   4.98733e-18  6.396793e-18 -1.637145e-17  6.396793e-18     0.1010173     0.7555931   ...    -0.1237437      0.212132    -0.0447625     -1.259047      -0.37976       -0.7125     0.2175237     0.4541499     -3.905772     -1.211492
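The feature-building loop above turns each multivariate sequence into one row of coefficients. The following base-R sketch illustrates the same idea on simulated data. Everything in it is an assumption made for illustration: the `haar_features` helper, the random stand-in sequences, and the 16-step window (a power of two, chosen so that repeated halving works cleanly, whereas the notebook uses `minStepsBack = 17` with the `wavelets` package).

```r
set.seed(1)

# Haar decomposition of one series (length must be a power of two here)
haar_features <- function(x) {
  out <- c()
  while (length(x) > 1) {
    odd  <- x[seq(1, length(x), by = 2)]
    even <- x[seq(2, length(x), by = 2)]
    out  <- c(out, (even - odd) / sqrt(2))  # detail coefficients of this level
    x    <- (odd + even) / sqrt(2)          # approximation passed to the next level
  }
  c(out, x)                                 # all details, then the final approximation
}

# Hypothetical stand-in for the RSS files: 5 sequences of varying length,
# each with 4 columns (one per router)
sequences <- lapply(c(20, 31, 17, 45, 26), function(n) {
  matrix(rnorm(n * 4), ncol = 4)
})

n_steps <- 16  # common window length; a simplifying assumption for this sketch

# Transform every column of the truncated sequence and flatten into one row
rows <- lapply(sequences, function(m) {
  as.vector(apply(m[1:n_steps, ], 2, haar_features))
})
feature_matrix <- do.call(rbind, rows)
dim(feature_matrix)
```

Truncating every sequence to a common window is what makes the rows the same width, which a classifier such as an SVM requires. Collecting the rows in a list and binding them once with `do.call(rbind, ...)` also avoids growing a matrix inside a loop.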

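Group 1, evaluated over 10 repeated stratified 70/30 train/test splits (repeated hold-out, not a true 10-fold cross-validation).

The two outputs are the mean and the standard deviation of the accuracy, respectively.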

In [3]:
# 10 stratified random splits, each with 70% of the rows for training
inTraining <- createDataPartition(frameDataFinal$target, p = .7, list = TRUE,times=10)
allAccuracyGroup1 <- c()

for( i in seq_along(inTraining)){

    training <- frameDataFinal[ inTraining[[i]],]
    testing  <- frameDataFinal[-inTraining[[i]],]
    # method = "none": fit a single model without internal resampling or tuning
    fitControl <- trainControl(method = "none", classProbs = TRUE)

    svmLinearFit <- train(target ~ ., data = training,
                     method = "svmLinear",
                     trControl = fitControl)
    preds<- predict(svmLinearFit, newdata = testing)
    # Accuracy on the held-out 30%
    cm <- confusionMatrix(preds,testing$target)
    allAccuracyGroup1 <- c(allAccuracyGroup1,cm$overall[["Accuracy"]])
}

mean(allAccuracyGroup1)
sd(allAccuracyGroup1)


Out[3]:
0.686666666666667
Out[3]:
0.0723503137604432
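The evaluation above relies on `createDataPartition(..., p = .7, times = 10)`, which draws repeated random splits that preserve the class proportions. As a rough illustration of that idea in base R, the sketch below samples about 70% of the indices within each class and repeats this 10 times. The `stratified_split` helper, the label counts, and the majority-class placeholder "model" are all hypothetical, and the exact sampling rule used by caret may differ.

```r
set.seed(42)

# Hypothetical labels: 50 "No" and 55 "Yes"
y <- factor(rep(c("No", "Yes"), times = c(50, 55)))

# Draw one stratified split: sample roughly a fraction p of the indices
# within each class, so both classes appear in training and test sets
stratified_split <- function(y, p = 0.7) {
  idx <- unlist(lapply(split(seq_along(y), y), function(i) {
    i[sample.int(length(i), size = ceiling(p * length(i)))]
  }))
  sort(unname(idx))
}

# Placeholder "model": always predict the majority class of the training set
acc <- sapply(1:10, function(r) {
  train_idx <- stratified_split(y)
  majority  <- names(which.max(table(y[train_idx])))
  mean(y[-train_idx] == majority)
})

# Summary over the 10 repetitions
c(mean = mean(acc), sd = sd(acc))
```

Unlike k-fold cross-validation, the test sets of different repetitions can overlap, so the reported standard deviation describes the variability across random splits rather than across disjoint folds.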

In [4]:
#Group 2
allDataGroup2<-list()
allDataGroup2Target<-list()
groups2 <- groups[groups$dataset_ID==2, ]

# Read the RSS file of every sequence in the group and collect its class label
# (targetAll is indexed by position, so its rows must be ordered by sequence ID)
index<-1
for (id in groups2$X.sequence_ID){
    caminho <-paste("./MovementAAL/dataset/MovementAAL_RSS_",id,".csv",sep="")
    allDataGroup2[[index]]<-read.csv(file=caminho,header=TRUE,sep=",")
    allDataGroup2Target[index]<-targetAll[[2]][id]
    index<-index+1
}

# Haar DWT of the first minStepsBack time steps of each sequence;
# each sequence becomes one row of wavelet and scaling coefficients
wtData <- NULL
minStepsBack <- 17
for (i in seq_along(allDataGroup2)){
     aMatrix <- data.matrix(allDataGroup2[[i]], rownames.force = NA)
     wt <- dwt(aMatrix[1:minStepsBack,], filter="haar", boundary="periodic")
     wtData <- rbind(wtData, unlist(c(wt@W,wt@V[[wt@level]])))
}
wtData <- as.data.frame(wtData)
data <- unlist(allDataGroup2Target)
target <- factor(data,labels=c("No","Yes"))   # lower label value -> "No", higher -> "Yes"
frameDataFinal <- data.frame(cbind(target, wtData))
tail(frameDataFinal)


Out[4]:
    target           W11           W12           W13           W14           W15           W16           W17           W18           W19   ...           W37           W38           W41           W42           W43           W44           W51           W52            W6           V67
101     No -1.147899e-17     0.0314309    -0.1571403 -1.645277e-17 -9.215718e-19 -9.215718e-19    -0.3771213     0.1257094    -0.5656854   ...   -0.05050864     0.0505051     -0.576065     0.5938225    0.01623375    -0.3095275    -0.8734413     0.8042903       3.13925     -1.514803
102     No -9.215718e-19    -0.0314309  3.225501e-18     0.1257094   -0.06285472 -1.629014e-17 -1.629014e-17    0.09427855    -0.2151585   ...     0.2188708    0.06734485     -0.382055      0.455712      0.082791      0.083335     -0.847215      1.059506      3.500972      -1.92793
103     No  2.209062e-18     -0.219988 -3.252607e-18  2.607506e-17   -0.09427855    0.09428562    -0.1571403    -0.1257024     0.3239398   ...    -0.7071068     0.2525432   -0.00683675     0.5011672   -0.08117025     -0.345235    -0.4228329     0.8394929      3.484434    -0.5050059
104     No -2.209062e-18 -2.209062e-18     0.0314309 -1.629014e-17     0.1885571   -0.09427855    -0.1885642  6.505213e-18      0.261085   ...    -0.1010137    -0.1010137      0.060685    0.05565475     0.3165595     -0.523805    -0.2743187      1.526326      3.828612   -0.08240287
105     No -1.065567e-18 -1.065567e-18     0.3142665 -3.293264e-18  1.065567e-18  1.065567e-18  1.631724e-17   -0.09427855     0.3722846   ...    0.08417399   -0.06734485     -1.232483    -0.1083872     0.0091985     -0.535715    -0.3522323      1.818657      2.775723    -0.2762629
106     No -1.132991e-17     0.1571319 -2.209062e-18  3.252607e-18  3.293264e-18 -2.209062e-18  2.209062e-18    -0.3142665    0.04351535   ...     0.1178464   -0.03366535     -1.016239    -0.1550083     0.0946965      0.250005    -0.5925023      1.504903      3.070284     0.1109921

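Group 2, evaluated over 10 repeated stratified 70/30 train/test splits (repeated hold-out, not a true 10-fold cross-validation).

The two outputs are the mean and the standard deviation of the accuracy, respectively.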

In [5]:
# 10 stratified random splits, each with 70% of the rows for training
inTraining <- createDataPartition(frameDataFinal$target, p = .7, list = TRUE,times=10)
allAccuracyGroup2 <- c()

for( i in seq_along(inTraining)){

    training <- frameDataFinal[ inTraining[[i]],]
    testing  <- frameDataFinal[-inTraining[[i]],]
    # method = "none": fit a single model without internal resampling or tuning
    fitControl <- trainControl(method = "none", classProbs = TRUE)

    svmLinearFit <- train(target ~ ., data = training,
                     method = "svmLinear",
                     trControl = fitControl)
    preds<- predict(svmLinearFit, newdata = testing)
    # Accuracy on the held-out 30%
    cm <- confusionMatrix(preds,testing$target)
    allAccuracyGroup2 <- c(allAccuracyGroup2,cm$overall[["Accuracy"]])
}

mean(allAccuracyGroup2)
sd(allAccuracyGroup2)


Out[5]:
0.625806451612903
Out[5]:
0.0793079093100463

In [6]:
#Group 3
allDataGroup3<-list()
allDataGroup3Target<-list()
groups3 <- groups[groups$dataset_ID==3, ]

# Read the RSS file of every sequence in the group and collect its class label
# (targetAll is indexed by position, so its rows must be ordered by sequence ID)
index<-1
for (id in groups3$X.sequence_ID){
    caminho <-paste("./MovementAAL/dataset/MovementAAL_RSS_",id,".csv",sep="")
    allDataGroup3[[index]]<-read.csv(file=caminho,header=TRUE,sep=",")
    allDataGroup3Target[index]<-targetAll[[2]][id]
    index<-index+1
}

# Haar DWT of the first minStepsBack time steps of each sequence;
# each sequence becomes one row of wavelet and scaling coefficients
wtData <- NULL
minStepsBack <- 17
for (i in seq_along(allDataGroup3)){
     aMatrix <- data.matrix(allDataGroup3[[i]], rownames.force = NA)
     wt <- dwt(aMatrix[1:minStepsBack,], filter="haar", boundary="periodic")
     wtData <- rbind(wtData, unlist(c(wt@W,wt@V[[wt@level]])))
}
wtData <- as.data.frame(wtData)
data <- unlist(allDataGroup3Target)
target <- factor(data,labels=c("No","Yes"))   # lower label value -> "No", higher -> "Yes"
frameDataFinal <- data.frame(cbind(target, wtData))
tail(frameDataFinal)


Out[6]:
    target           W11           W12           W13           W14           W15           W16           W17           W18           W19   ...           W37           W38           W41           W42           W43           W44           W51           W52            W6           V67
 99     No -2.482823e-17   -0.03214507  2.168404e-19   -0.09642815     -0.032138  2.797242e-17     0.1607042  2.526191e-17    0.05642712   ...     0.5867392    -0.1203566    -0.0752525     0.6333325   -0.03451375   -0.05318275    0.05035307     0.3545596      3.351799     -1.363856
100     No  1.398621e-17    -0.1928492   -0.09642108  2.797242e-17  2.797242e-17  2.797242e-17  2.797242e-17             0     0.3771213   ...    -0.4513441    -0.2557641     -0.040155       0.24445     -0.105911      0.627659     0.6915681   -0.03760571       3.11207     -2.399295
101     No     0.1285662  1.398621e-17  2.168404e-19    -0.2571323     0.1928492    -0.1285662  2.526191e-17  2.526191e-17    -0.2935554   ...     0.1053094    -0.2858479    -0.2886375     0.0222225     0.7621752       0.08511     0.1076676      0.459365       3.98536    -0.1669201
102     No    0.09642815  2.482823e-17     0.1285662 -5.319367e-18    0.06428308  2.526191e-17  2.526191e-17     0.2571323    -0.1335654   ...   -0.04513816     0.3159388    -0.5275275      1.25e-06     0.5158377     -0.191485    0.02017994     0.9040274      3.547129     -0.570297
103     No  2.168404e-19  2.168404e-19    -0.1607042  2.753874e-17 -8.348357e-18  2.526191e-17  2.526191e-17  2.482823e-17     0.1207102   ...   -0.06017832    0.03008386    -0.0762625      0.144445     0.4423137     -0.042555    0.06214585       1.09342      3.537199    -0.5729086
104     No     0.1928422   -0.06428308 -1.669671e-17 -1.669671e-17 -1.669671e-17  2.753874e-17  2.753874e-17   -0.09642108    -0.1285662   ...     0.7823253     0.1955822      -0.62853     0.5111075     0.4215105      0.585105     0.3937365      1.047117      3.329233    -0.9164797

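Group 3, evaluated over 10 repeated stratified 70/30 train/test splits (repeated hold-out, not a true 10-fold cross-validation).

The two outputs are the mean and the standard deviation of the accuracy, respectively.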

In [7]:
# 10 stratified random splits, each with 70% of the rows for training
inTraining <- createDataPartition(frameDataFinal$target, p = .7, list = TRUE,times=10)
allAccuracyGroup3 <- c()

for( i in seq_along(inTraining)){

    training <- frameDataFinal[ inTraining[[i]],]
    testing  <- frameDataFinal[-inTraining[[i]],]
    # method = "none": fit a single model without internal resampling or tuning
    fitControl <- trainControl(method = "none", classProbs = TRUE)

    svmLinearFit <- train(target ~ ., data = training,
                     method = "svmLinear",
                     trControl = fitControl)
    preds<- predict(svmLinearFit, newdata = testing)
    # Accuracy on the held-out 30%
    cm <- confusionMatrix(preds,testing$target)
    allAccuracyGroup3 <- c(allAccuracyGroup3,cm$overall[["Accuracy"]])
}

mean(allAccuracyGroup3)
sd(allAccuracyGroup3)


Out[7]:
0.480645161290323
Out[7]:
0.0635228193021158

In [8]:
#All Groups
allData<-list()
allDataTarget<-list()
targetAll <- read.csv(file="./MovementAAL/dataset/MovementAAL_target.csv",header=TRUE,sep=",")

# Read the RSS file of every sequence and collect its class label
# (targetAll is indexed by position, so its rows must be ordered by sequence ID)
index<-1
for (id in targetAll$X.sequence_ID){
    caminho <-paste("./MovementAAL/dataset/MovementAAL_RSS_",id,".csv",sep="")
    allData[[index]]<-read.csv(file=caminho,header=TRUE,sep=",")
    allDataTarget[index]<-targetAll[[2]][id]
    index<-index+1
}

# Haar DWT of the first minStepsBack time steps of each sequence;
# each sequence becomes one row of wavelet and scaling coefficients
wtData <- NULL
minStepsBack <- 17
for (i in seq_along(allData)){
     aMatrix <- data.matrix(allData[[i]], rownames.force = NA)
     wt <- dwt(aMatrix[1:minStepsBack,], filter="haar", boundary="periodic")
     wtData <- rbind(wtData, unlist(c(wt@W,wt@V[[wt@level]])))
}
wtData <- as.data.frame(wtData)
data <- unlist(allDataTarget)
target <- factor(data,labels=c("No","Yes"))   # lower label value -> "No", higher -> "Yes"
frameDataFinal <- data.frame(cbind(target, wtData))
tail(frameDataFinal)


Out[8]:
    target           W11           W12           W13           W14           W15           W16           W17           W18           W19   ...           W37           W38           W41           W42           W43           W44           W51           W52            W6           V67
309     No -2.482823e-17   -0.03214507  2.168404e-19   -0.09642815     -0.032138  2.797242e-17     0.1607042  2.526191e-17    0.05642712   ...     0.5867392    -0.1203566    -0.0752525     0.6333325   -0.03451375   -0.05318275    0.05035307     0.3545596      3.351799     -1.363856
310     No  1.398621e-17    -0.1928492   -0.09642108  2.797242e-17  2.797242e-17  2.797242e-17  2.797242e-17             0     0.3771213   ...    -0.4513441    -0.2557641     -0.040155       0.24445     -0.105911      0.627659     0.6915681   -0.03760571       3.11207     -2.399295
311     No     0.1285662  1.398621e-17  2.168404e-19    -0.2571323     0.1928492    -0.1285662  2.526191e-17  2.526191e-17    -0.2935554   ...     0.1053094    -0.2858479    -0.2886375     0.0222225     0.7621752       0.08511     0.1076676      0.459365       3.98536    -0.1669201
312     No    0.09642815  2.482823e-17     0.1285662 -5.319367e-18    0.06428308  2.526191e-17  2.526191e-17     0.2571323    -0.1335654   ...   -0.04513816     0.3159388    -0.5275275      1.25e-06     0.5158377     -0.191485    0.02017994     0.9040274      3.547129     -0.570297
313     No  2.168404e-19  2.168404e-19    -0.1607042  2.753874e-17 -8.348357e-18  2.526191e-17  2.526191e-17  2.482823e-17     0.1207102   ...   -0.06017832    0.03008386    -0.0762625      0.144445     0.4423137     -0.042555    0.06214585       1.09342      3.537199    -0.5729086
314     No     0.1928422   -0.06428308 -1.669671e-17 -1.669671e-17 -1.669671e-17  2.753874e-17  2.753874e-17   -0.09642108    -0.1285662   ...     0.7823253     0.1955822      -0.62853     0.5111075     0.4215105      0.585105     0.3937365      1.047117      3.329233    -0.9164797

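All groups combined into a single dataset, evaluated over 10 repeated stratified 70/30 train/test splits (repeated hold-out, not a true 10-fold cross-validation).

The two outputs are the mean and the standard deviation of the accuracy, respectively.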

In [9]:
# 10 stratified random splits, each with 70% of the rows for training
inTraining <- createDataPartition(frameDataFinal$target, p = .7, list = TRUE,times=10)
allAccuracy <- c()

for( i in seq_along(inTraining)){

    training <- frameDataFinal[ inTraining[[i]],]
    testing  <- frameDataFinal[-inTraining[[i]],]
    # method = "none": fit a single model without internal resampling or tuning
    fitControl <- trainControl(method = "none", classProbs = TRUE)

    svmLinearFit <- train(target ~ ., data = training,
                     method = "svmLinear",
                     trControl = fitControl)
    preds<- predict(svmLinearFit, newdata = testing)
    # Accuracy on the held-out 30%
    cm <- confusionMatrix(preds,testing$target)
    allAccuracy <- c(allAccuracy,cm$overall[["Accuracy"]])
}

mean(allAccuracy)
sd(allAccuracy)


Out[9]:
0.617204301075269
Out[9]:
0.0453939503913622

Confusion matrix

All groups combined into a single dataset


In [10]:
#All groups datasets Confusion Matrix 
# A single stratified 70/30 split, used to inspect one confusion matrix in detail
inTraining <- createDataPartition(frameDataFinal$target, p = .7, list = TRUE,times=1)
training <- frameDataFinal[ inTraining[[1]],]
testing  <- frameDataFinal[-inTraining[[1]],]
fitControl <- trainControl(method = "none", classProbs = TRUE)

svmLinearFit <- train(target ~ ., data = training,
                     method = "svmLinear",
                     trControl = fitControl)
preds<- predict(svmLinearFit, newdata = testing)
cm <- confusionMatrix(preds,testing$target)
cm


Out[10]:
Confusion Matrix and Statistics

          Reference
Prediction No Yes
       No  22  12
       Yes 24  35
                                          
               Accuracy : 0.6129          
                 95% CI : (0.5062, 0.7122)
    No Information Rate : 0.5054          
    P-Value [Acc > NIR] : 0.02405         
                                          
                  Kappa : 0.2236          
 Mcnemar's Test P-Value : 0.06675         
                                          
            Sensitivity : 0.4783          
            Specificity : 0.7447          
         Pos Pred Value : 0.6471          
         Neg Pred Value : 0.5932          
             Prevalence : 0.4946          
         Detection Rate : 0.2366          
   Detection Prevalence : 0.3656          
      Balanced Accuracy : 0.6115          
                                          
       'Positive' Class : No              
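The headline statistics in this report can be recomputed by hand from the four counts of the confusion matrix. The base-R sketch below does so using the counts printed above; the variable names are my own, and only a subset of the statistics is reproduced.

```r
# Counts copied from the confusion matrix above (rows = prediction, columns = reference)
cm <- matrix(c(22, 24, 12, 35), nrow = 2,
             dimnames = list(Prediction = c("No", "Yes"),
                             Reference  = c("No", "Yes")))

n <- sum(cm)
accuracy    <- sum(diag(cm)) / n
sensitivity <- cm["No", "No"] / sum(cm[, "No"])       # positive class is "No"
specificity <- cm["Yes", "Yes"] / sum(cm[, "Yes"])
expected    <- sum(rowSums(cm) * colSums(cm)) / n^2   # agreement expected by chance
cohen_kappa <- (accuracy - expected) / (1 - expected)

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, kappa = cohen_kappa), 4)
```

Note that sensitivity refers to the "No" class here because the report treats "No" as the positive class. Kappa stays well below the raw accuracy because roughly half of the agreement would be expected by chance with two nearly balanced classes.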
                                          

ROC curve and AUC

All groups combined into a single dataset


In [11]:
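#ROC CURVE AND AUC
# Class probabilities on the test set; column 2 holds the probability of "Yes"
predsProb<- predict(svmLinearFit, newdata = testing,type="prob")
outcome<- predsProb[,2]
classes <- frameDataFinal$target[-inTraining[[1]]]
# levels = c(control, case): "No" is the control class and "Yes" the case class
rocobj <- roc(classes, outcome,levels=c("No","Yes"))
plot(rocobj)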


Out[11]:
Call:
roc.default(response = classes, predictor = outcome, levels = c("No",     "Yes"))

Data: outcome in 46 controls (classes No) < 47 cases (classes Yes).
Area under the curve: 0.7072
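The AUC can be read as the probability that a randomly chosen "Yes" case receives a higher score than a randomly chosen "No" case. A minimal base-R sketch of that interpretation, using the rank-sum (Mann-Whitney) formula, is shown below. The scores, labels, and the `auc_rank` helper are hypothetical toy values and are not taken from the model above.

```r
# Hypothetical scores: predicted probability of "Yes" for 4 "No" and 4 "Yes" cases
scores <- c(0.10, 0.35, 0.40, 0.80,   0.30, 0.55, 0.70, 0.90)
labels <- factor(rep(c("No", "Yes"), each = 4), levels = c("No", "Yes"))

# AUC from ranks: fraction of (case, control) pairs ranked in the right order
auc_rank <- function(scores, labels, positive = "Yes") {
  r  <- rank(scores)                      # average ranks handle ties
  n1 <- sum(labels == positive)           # number of cases
  n0 <- sum(labels != positive)           # number of controls
  (sum(r[labels == positive]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc_rank(scores, labels)   # 0.6875: 11 of the 16 pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect separation, which gives some context for the value of about 0.71 reported above.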