Deep Learning

Keras


In [1]:
library(keras)

In [2]:
df <- read.csv('evasao.csv')

Splitting the data into training and test sets:


In [3]:
n <- nrow(df)
set.seed(42) # Without this seed you will see different values from mine
limite <- sample(1:n, size = round(0.75*n), replace = FALSE)
train_df <- df[limite,]
test_df <- df[-limite,]

In [4]:
head(test_df)


   periodo bolsa repetiu ematraso disciplinas faltas desempenho abandonou
3        4  0.10       0        1           1      0   8.000000         0
4        4  0.20       8        1           1      0   4.000000         1
7        9  0.10       6        1           1      1   2.000000         0
9        9  0.15       7        1           5     10   2.800000         0
11       6  0.25       5        1           3      6   2.666667         0
17      10  0.05       0        0           2      0   9.000000         0

Let's create the Sequential model:


In [5]:
modelo <- keras_model_sequential()
modelo %>%
  layer_dense(units = 128, activation = 'relu', input_shape = c(7), kernel_initializer = "normal") %>%
  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 512, activation = 'relu') %>%
  layer_dropout(rate = 0.3) %>%
  # two output units to match the one-hot labels; softmax is the more
  # conventional pairing with categorical_crossentropy, but sigmoid also works
  layer_dense(units = 2, activation = 'sigmoid')

In [6]:
summary(modelo)


________________________________________________________________________________
Layer (type)                        Output Shape                    Param #     
================================================================================
dense_1 (Dense)                     (None, 128)                     1024        
________________________________________________________________________________
dropout_1 (Dropout)                 (None, 128)                     0           
________________________________________________________________________________
dense_2 (Dense)                     (None, 512)                     66048       
________________________________________________________________________________
dropout_2 (Dropout)                 (None, 512)                     0           
________________________________________________________________________________
dense_3 (Dense)                     (None, 2)                       1026        
================================================================================
Total params: 68,098
Trainable params: 68,098
Non-trainable params: 0
________________________________________________________________________________

Compiling and training the model:


In [7]:
modelo %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = optimizer_rmsprop(),
  metrics = 'accuracy'
)

In [8]:
library(dplyr)
preditores_treino <- data.matrix(select(train_df, c('periodo','bolsa','repetiu','ematraso','disciplinas','faltas','desempenho')))


Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


In [9]:
head(preditores_treino)


    periodo bolsa repetiu ematraso disciplinas faltas desempenho
275       2  0.15       0        1           1      0       10.0
281       9  0.00       2        1           5      0        0.8
86        9  0.10       3        0           1      2        6.0
247       8  0.15       8        0           0      0        0.0
190       9  0.25       6        0           3      0        1.0
154      10  0.20       3        1           2      2        5.0

In [10]:
rotulos_treino <- to_categorical(data.matrix(select(train_df,c('abandonou'))))
head(rotulos_treino)


     [,1] [,2]
[1,]    1    0
[2,]    1    0
[3,]    1    0
[4,]    1    0
[5,]    0    1
[6,]    1    0
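The `to_categorical()` call turns the 0/1 `abandonou` column into two one-hot columns, one per class. A quick standalone illustration:

```r
library(keras)

# Each label becomes a row with a single 1: 0 -> (1, 0) and 1 -> (0, 1).
# The number of classes is inferred from the largest label (here 1, so 2 columns).
to_categorical(c(0, 1, 1, 0))
#      [,1] [,2]
# [1,]    1    0
# [2,]    0    1
# [3,]    0    1
# [4,]    1    0
```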

In [11]:
historico <- modelo %>% fit(
  preditores_treino, rotulos_treino, 
  epochs = 30, batch_size = 128, 
  validation_split = 0.2
)

In [12]:
plot(historico)


Evaluating the model:


In [14]:
rotulos_teste <- to_categorical(data.matrix(select(test_df,c('abandonou'))))
head(rotulos_teste)
preditores_teste <- data.matrix(select(test_df, c('periodo','bolsa','repetiu','ematraso','disciplinas','faltas','desempenho')))
modelo %>% evaluate(preditores_teste, rotulos_teste)


     [,1] [,2]
[1,]    1    0
[2,]    0    1
[3,]    1    0
[4,]    1    0
[5,]    1    0
[6,]    1    0

$loss
0.714963535467784

$acc
0.653333331743876

In [15]:
modelo


Model
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #     
================================================================================
dense_1 (Dense)                     (None, 128)                     1024        
________________________________________________________________________________
dropout_1 (Dropout)                 (None, 128)                     0           
________________________________________________________________________________
dense_2 (Dense)                     (None, 512)                     66048       
________________________________________________________________________________
dropout_2 (Dropout)                 (None, 512)                     0           
________________________________________________________________________________
dense_3 (Dense)                     (None, 2)                       1026        
================================================================================
Total params: 68,098
Trainable params: 68,098
Non-trainable params: 0
________________________________________________________________________________


In [19]:
resultado <- modelo %>% predict_classes(preditores_teste)

In [24]:
str(resultado)


 num [1:75(1d)] 0 1 1 1 1 0 1 1 1 1 ...
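Note that `predict_classes()` has since been removed from newer releases of the keras R package. A sketch of the equivalent using plain `predict()`, assuming the `modelo` and `preditores_teste` objects from the cells above:

```r
# Class probabilities, one row per test example and one column per class
probs <- modelo %>% predict(preditores_teste)

# which.max() returns the 1-based column of the winning class;
# subtract 1 to get back to the 0/1 'abandonou' coding
resultado <- apply(probs, 1, which.max) - 1
```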

In [25]:
test_df$abandonou


 [1] 0 1 0 0 0 0 0 1 1 0 1 1 1 0 1 0 0 1 0 1 0 1 0 1 0
[26] 0 1 0 1 0 1 1 1 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0
[51] 1 0 0 0 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 1 0 0 1 1

In [26]:
resultado


 [1] 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 0 0 1 1 1 0 1 1 1 0
[26] 0 1 0 0 0 1 1 1 0 0 0 0 0 1 0 1 1 1 0 1 0 1 0 0 0
[51] 1 0 1 0 1 1 1 1 1 1 1 0 0 0 1 0 1 1 0 1 1 0 0 1 1
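Scanning the two vectors by eye is tedious; a confusion matrix summarizes where the model gets it right and wrong. A sketch using base R's `table()`, assuming `resultado` and `test_df` from the cells above:

```r
# Cross-tabulate true labels (rows) against predicted classes (columns)
matriz <- table(real = test_df$abandonou, previsto = resultado)
matriz

# Overall accuracy recomputed from the table's diagonal; this should
# agree with the ~0.653 reported by evaluate()
sum(diag(matriz)) / sum(matriz)
```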
