Data science is not a single-step process
Data Science Goal:
NumPy is the fundamental package for scientific computing with Python.
It contains, among other things:
a powerful N-dimensional array object
sophisticated (broadcasting) functions
tools for integrating C/C++ and Fortran code
useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
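The capabilities listed above can be illustrated with a small sketch. The array values, the variable names, and the `person` record type are made up for illustration and are not part of the original notebook.

```python
import numpy as np

# N-dimensional array object: a 2 x 3 array of integers
a = np.arange(6).reshape(2, 3)

# Broadcasting: the 1-D row is added to every row of the 2-D array
row = np.array([10, 20, 30])
b = a + row

# Linear algebra: invert a small matrix
m = np.array([[2.0, 0.0], [0.0, 4.0]])
m_inv = np.linalg.inv(m)

# Random numbers: draw five samples from a standard normal distribution
rng = np.random.default_rng(0)
samples = rng.normal(size=5)

# Arbitrary data types: a structured dtype holding a name and an age
person = np.dtype([("name", "U10"), ("age", "i4")])
people = np.array([("Ada", 36), ("Linus", 28)], dtype=person)

print(b)
print(m_inv)
print(samples.shape)
print(people["age"])
```

The structured dtype is what lets a NumPy array act as a container for record-like data, such as rows read from a database table.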
In [4]:
import numpy as np
In [5]:
import pandas as pd
In [427]:
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# Example
plt.plot([1, 2, 4, 3, 2, 1, 3], [5, 4, 5, 6, 6, 5, 6], 'r')
In [2]:
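# fits a straight line (or hyperplane) to the data
from sklearn import linear_model
# also linear: fits a line using an epsilon-insensitive loss
# (sklearn.svm.SVR with a non-linear kernel can fit curves)
from sklearn.svm import LinearSVR
# can approximate arbitrary curves with piecewise-constant predictions
from sklearn.ensemble import RandomForestRegressor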
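As a rough illustration of what each model can represent, the sketch below fits all three to a synthetic parabola and compares their R^2 scores on the training data. The data set and the hyperparameters (`max_iter`, `n_estimators`, `random_state`) are assumptions chosen for the example, not values from the original notebook.

```python
import numpy as np
from sklearn import linear_model
from sklearn.svm import LinearSVR
from sklearn.ensemble import RandomForestRegressor

# Synthetic curved data: y = x^2 on a symmetric interval
X = np.linspace(-3, 3, 61).reshape(-1, 1)
y = X.ravel() ** 2

models = {
    "LinearRegression": linear_model.LinearRegression(),
    "LinearSVR": LinearSVR(max_iter=10000, random_state=0),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=50, random_state=0),
}

# R^2 on the training data: 1.0 is a perfect fit, 0.0 is no better than the mean
scores = {}
for name, model in models.items():
    model.fit(X, y)
    scores[name] = model.score(X, y)
    print(name, round(scores[name], 3))
```

The two linear models can only draw a straight line through the parabola, so they score poorly, while the random forest follows the curve closely. Scores on training data overstate real performance; a held-out test set would be needed for a fair comparison.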
In [ ]:
# the private sklearn.metrics.regression module was removed in newer scikit-learn releases
from sklearn.metrics import mean_squared_error, mean_absolute_error
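A minimal sketch of how these two metrics behave, using hand-picked values (`y_true` and `y_pred` are invented for illustration):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Made-up targets and predictions; the errors are 0.5, 0.0, -1.0, -0.5
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.5])

# Mean of squared errors: (0.25 + 0 + 1 + 0.25) / 4
mse = mean_squared_error(y_true, y_pred)
# Mean of absolute errors: (0.5 + 0 + 1 + 0.5) / 4
mae = mean_absolute_error(y_true, y_pred)

print("MSE:", mse)
print("MAE:", mae)
```

Because the errors are squared, MSE penalizes the single large miss more heavily than MAE does.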
Data Analytics & Feature Engineering (Seaborn, PCA)
Test-Train Data Split (Test Train Validation splits)
Model Understanding & working intuitions (Model Tuning)
Scoring Methods (Truth Tables, Precision, Recall)
Some other time for these models
Basics of Sketching
There are only 3 major shapes in sketching - Lines, Curves and Oval shapes
Three basic principles on which SVM focuses
Optimal Separation/Boundary Region
Find a linear equation that represents our solution using hyperplanes.
Maximum Margin Boundary Distances
Separate the regions so that there is enough safety space on both sides of the boundary.
Kernel Transformations
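The first two principles can be sketched with scikit-learn's `SVC` and a linear kernel. The six data points and the large `C` value (which approximates a hard margin) are assumptions made for illustration. For a linear SVM the separating hyperplane is w . x + b = 0, and the margin width is 2 / ||w||.

```python
import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters (synthetic data)
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
              [5.0, 5.0], [6.0, 5.5], [5.5, 6.5]])
y = np.array([0, 0, 0, 1, 1, 1])

# A large C leaves almost no tolerance for points inside the margin
clf = SVC(kernel="linear", C=1e3)
clf.fit(X, y)

# Hyperplane parameters: w . x + b = 0
w = clf.coef_[0]
b = clf.intercept_[0]

# Distance between the two margin boundaries
margin_width = 2.0 / np.linalg.norm(w)

print("w:", w, "b:", b)
print("margin width:", margin_width)
print("support vectors:")
print(clf.support_vectors_)
```

Only the support vectors, the points closest to the boundary, determine where the hyperplane lies; moving any other point slightly leaves the solution unchanged.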
The kernel function can be any of the following: