by Scott Josephson
Driving while distracted, fatigued or drowsy may lead to accidents. Activities that divert the driver's attention from the road ahead, such as engaging in a conversation with other passengers in the car, making or receiving phone calls, sending or receiving text messages, eating while driving or events outside the car may cause driver distraction. Fatigue and drowsiness can result from driving long hours or from lack of sleep.
The data for this Kaggle challenge shows the results of a number of "trials", each one representing about 2 minutes of sequential data that are recorded every 100 ms during a driving session on the road or in a driving simulator. The trials are samples from some 100 drivers of both genders, and of different ages and ethnic backgrounds. The files are structured as follows:
The first column is the Trial ID: each period of around 2 minutes of sequential data has a unique trial ID. For instance, the first 1210 observations represent sequential observations every 100 ms, and therefore all have the same trial ID.
The second column is the observation number: a sequentially increasing number within one trial ID.
The third column has a value X for each row, where
X = 1 if the driver is alert
X = 0 if the driver is not alert
The next 8 columns, with headers P1, P2, ..., P8, represent physiological data.
The next 11 columns, with headers E1, E2, ..., E11, represent environmental data.
The next 11 columns, with headers V1, V2, ..., V11, represent vehicular data.
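The column layout described above can be written down as a short sketch, which is handy for sanity-checking a loaded file. The names `TrialID` and `ObsNum` are assumed here for the first two columns (the text only describes them); `IsAlert` is the label name used by the code later in this notebook.

```python
# Build the expected header from the layout described above.
# "TrialID" and "ObsNum" are ASSUMED names for the first two columns;
# "IsAlert" is the label column used later in this notebook.
id_and_label = ["TrialID", "ObsNum", "IsAlert"]
physiological = [f"P{i}" for i in range(1, 9)]    # P1 .. P8
environmental = [f"E{i}" for i in range(1, 12)]   # E1 .. E11
vehicular = [f"V{i}" for i in range(1, 12)]       # V1 .. V11

expected_columns = id_and_label + physiological + environmental + vehicular

# A trial of exactly 2 minutes sampled every 100 ms would hold this many rows.
samples_per_trial = (2 * 60 * 1000) // 100

print(len(expected_columns))  # 33
print(samples_per_trial)      # 1200
```

Trials last "around" 2 minutes, so actual row counts per trial can differ slightly from 1200 (the first trial is described as having 1210 observations). Once the file is loaded, comparing `list(ford_train.columns)` against `expected_columns` would be one way to confirm the layout, provided the assumed names match the file.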
In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
Read the training data from fordtrain.csv into a DataFrame. It is split into a training set and a testing set with train_test_split after a quick inspection.
In [2]:
ford_train = pd.read_csv('fordtrain.csv')
Check the head of ford_train
In [3]:
ford_train.head()
In [4]:
ford_train.info()
In [19]:
# Features are every column except the IsAlert label; hold out 30% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    ford_train.drop('IsAlert', axis=1), ford_train['IsAlert'],
    test_size=0.30, random_state=101)
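To see what this split does without needing the real file, here is a minimal sketch on a synthetic 10-row frame. The frame `demo` and its values are made up for illustration; only the `train_test_split` arguments mirror the cell above.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for ford_train: 10 rows, two feature columns and a label.
# The values are invented purely to show the mechanics of the split.
demo = pd.DataFrame({
    "P1": [0.1 * i for i in range(10)],
    "V1": [float(i % 3) for i in range(10)],
    "IsAlert": [1, 0, 1, 1, 0, 1, 0, 0, 1, 1],
})

X_tr, X_te, y_tr, y_te = train_test_split(
    demo.drop("IsAlert", axis=1),  # features: everything except the label
    demo["IsAlert"],               # label
    test_size=0.30,                # 30% of the rows go to the test set
    random_state=101,              # fixed seed so the split is repeatable
)

print(len(X_tr), len(X_te))  # 7 3
```

The rows are shuffled before splitting, so features and labels keep matching indices but no longer appear in their original order.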
Train and fit a logistic regression model on the training set.
In [21]:
logmodel = LogisticRegression()
In [22]:
logmodel.fit(X_train, y_train)
In [23]:
predictions = logmodel.predict(X_test)
Create a classification report for the model.
In [25]:
print(classification_report(y_test,predictions))
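The report lists precision, recall, f1-score and support for each class. As a rough illustration of what those numbers mean, the sketch below computes them by hand on made-up labels (not taken from the dataset); `binary_metrics` is a hypothetical helper written for this example, not part of scikit-learn.

```python
# Toy labels: 1 = alert, 0 = not alert (invented values, not from the dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1, 0, 1]


def binary_metrics(y_true, y_pred, positive=1):
    """Hypothetical helper: precision, recall, F1 and support for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "support": tp + fn}


# Accuracy is the share of rows where the prediction matches the true label.
accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

alert = binary_metrics(y_true, y_pred, positive=1)
not_alert = binary_metrics(y_true, y_pred, positive=0)

print(f"accuracy: {accuracy:.2f}")
print(f"alert:     {alert}")
print(f"not alert: {not_alert}")
```

On the real data, `accuracy_score(y_test, predictions)` (already imported at the top of the notebook) gives the overall accuracy directly.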