Intro to Data Science: Final Project 1

Analyzing the NYC Subway Dataset

Section 4. Conclusion

Import Data


In [393]:
import operator

import numpy as np
import pandas as pd
import scipy as sp
import scipy.stats as st
import statsmodels.api as sm
import scipy.optimize as op

from sklearn import linear_model
from sklearn.metrics import r2_score
from sklearn.linear_model import Ridge
from sklearn.linear_model import SGDClassifier
from sklearn.svm import SVC

import matplotlib.pyplot as plt
%matplotlib inline

filename = '/Users/excalibur/py/nanodegree/intro_ds/final_project/improved-dataset/turnstile_weather_v2.csv'

# load the improved turnstile/weather dataset into a pandas DataFrame
data = pd.read_csv(filename)

Functions for Basic Statistics and Learning

Extract Relevant Data

Class for Creating Data Samples

Formulas Implemented (i.e., not included in modules/packages)

Class for Creating Learners

4.1 From your analysis and interpretation of the data, do more people ride the NYC subway when it is raining or when it is not raining?

4.2 What analyses lead you to this conclusion?
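
One analysis that could inform questions 4.1 and 4.2 is to compare hourly entries on rainy and non-rainy observations, first with summary statistics and then with a Mann-Whitney U test, which does not assume normally distributed ridership. The block below is only a minimal sketch, not necessarily the analysis used in this project. The column names `ENTRIESn_hourly` and `rain` (a 0/1 flag) and the helper `compare_rain_ridership` are assumptions for illustration, and the toy data is synthetic, so its numbers say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
import scipy.stats as st


def compare_rain_ridership(df, entries_col="ENTRIESn_hourly", rain_col="rain"):
    """Summarize hourly entries by rain flag and run a two-sided Mann-Whitney U test.

    The column names are assumptions; adjust them to match the actual dataset.
    """
    rainy = df.loc[df[rain_col] == 1, entries_col]
    dry = df.loc[df[rain_col] == 0, entries_col]

    # Two-sided test: is either group stochastically larger than the other?
    u_stat, p_value = st.mannwhitneyu(rainy, dry, alternative="two-sided")

    return {
        "n_rain": len(rainy),
        "n_no_rain": len(dry),
        "mean_rain": rainy.mean(),
        "mean_no_rain": dry.mean(),
        "median_rain": rainy.median(),
        "median_no_rain": dry.median(),
        "U": u_stat,
        "p_value": p_value,
    }


# Synthetic, skewed toy data standing in for the real turnstile records.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "rain": np.repeat([0, 1], 200),
    "ENTRIESn_hourly": np.concatenate([
        rng.exponential(1800, 200),  # hypothetical non-rainy entries
        rng.exponential(2000, 200),  # hypothetical rainy entries
    ]),
})

summary = compare_rain_ridership(toy)
for name, value in summary.items():
    print(name, value)
```

With the real data, the same call would be `compare_rain_ridership(data)`, provided the assumed column names exist. A small p-value together with a higher mean or median for one group would suggest a ridership difference in that direction, though it would not by itself rule out confounders such as time of day or day of week.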