To the uninitiated, the whole pile of Python stuff looks terribly complicated.
To some extent it is.
But a ton of work has been done to bring order out of the apparent chaos!
In [1]:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

# Load the iris data and hold out 25% of it for testing
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data.astype(np.float64), iris.target.astype(np.float64),
    train_size=0.75, test_size=0.25)

# Evolve pipelines for 7 generations with 100 candidates per generation
tpot = TPOTClassifier(generations=7, population_size=100, verbosity=2, random_state=2)
tpot.fit(X_train, y_train)

# Score the best pipeline on the held-out data
print(tpot.score(X_test, y_test))

# Write the best pipeline out as a standalone Python script
tpot.export('tpot_iris_pipeline.py')
In [2]:
# Collect what TPOT recorded for every pipeline it evaluated (one column per pipeline)
proc = pd.DataFrame(tpot.evaluated_individuals_)
proc.head()
Out[2]:
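Since `evaluated_individuals_` is keyed by pipeline string, it can be handy to rank the pipelines by score. Here is a minimal sketch of that idea, assuming the dict-of-dicts layout with an `internal_cv_score` field (older TPOT versions may store these values differently). The `evaluated` dict, its pipeline strings and its scores are made up for illustration, and `top_pipelines` is a hypothetical helper, not part of TPOT.

```python
# Hypothetical stand-in for tpot.evaluated_individuals_:
# pipeline string -> dict of stats. Strings and scores are invented.
evaluated = {
    "GaussianNB(input_matrix)": {
        "generation": 0, "internal_cv_score": 0.94},
    "KNeighborsClassifier(input_matrix, n_neighbors=5)": {
        "generation": 1, "internal_cv_score": 0.97},
    "DecisionTreeClassifier(input_matrix, max_depth=3)": {
        "generation": 1, "internal_cv_score": 0.95},
}


def top_pipelines(individuals, n=2):
    """Return the n (pipeline, score) pairs with the highest internal CV score."""
    ranked = sorted(
        individuals.items(),
        key=lambda item: item[1]["internal_cv_score"],
        reverse=True,
    )
    return [(name, stats["internal_cv_score"]) for name, stats in ranked[:n]]


best = top_pipelines(evaluated)
for name, score in best:
    print(f"{score:.2f}  {name}")
```

If you would rather stay in pandas, `pd.DataFrame.from_dict(tpot.evaluated_individuals_, orient="index")` should give one row per pipeline instead of one column per pipeline, which is easier to sort.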
+ Initial data exploration: http://localhost:8889/notebooks/Documents/InfluenceH/Working_copies/Cond_fcast_wkg/ccsProfileInitialanalyis.ipynb
+ Current model: http://localhost:8889/notebooks/Documents/InfluenceH/Working_copies/Cond_fcast_wkg/WIPNNModelonehottarget2.ipynb#
This notebook on nbviewer: http://nbviewer.jupyter.org/github/dartdog/ML-lunch/blob/master/ML_resources.ipynb
R vs Python (2 pages): http://www.kdnuggets.com/2017/06/ecosystem-data-science-machine-learning-software.html
In [2]:
%load_ext watermark
In [3]:
%watermark -a "Tom Brander" -u -n -t -z -v -m -p pandas,numpy,scipy,sklearn,tpot,tensorflow -g
In [8]:
!nvidia-smi
In [10]:
!nvcc --version
In [11]:
!cat /proc/driver/nvidia/version
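Outside of Jupyter the `%watermark` magic is not available, so here is a rough stand-in for the package-version part of it, using only the standard library (Python 3.8 or later for `importlib.metadata`). `report_versions` is a hypothetical helper name. Note that the installed distribution is called `scikit-learn`, not `sklearn`.

```python
import platform
from importlib import metadata


def report_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


info = report_versions(
    ["pandas", "numpy", "scipy", "scikit-learn", "tpot", "tensorflow"]
)
print("Python", platform.python_version(), "on", platform.platform())
for name, version in info.items():
    print(f"{name:14s} {version or 'not installed'}")
```

This does not cover the GPU driver and CUDA details that the `nvidia-smi` and `nvcc` cells report.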
Best basic book, mainly scikit-learn: https://www.packtpub.com/big-data-and-business-intelligence/python-machine-learning Very useful for all Python ML stuff: the algorithms, and what to use when and where.
Still an early release, but the best book for Keras (written by the guy who also conceived and wrote the library itself): https://www.manning.com/books/deep-learning-with-python
Best book for TensorFlow: http://shop.oreilly.com/product/0636920052289.do It also uses scikit-learn as a way to explain some of the concepts in TF. Highly recommended.
The most accessible code comes as Jupyter examples, so you want to get that set up on your machine: http://jupyter.org/
The easiest way to get everything you need and keep it up to date is Anaconda: https://www.continuum.io/downloads It includes Jupyter, mentioned above, as well as Spyder, a Python IDE (Windows, Linux, and Mac). Don't even think of doing it another way (you will thank me!).
Up-front data exploration: csvkit https://csvkit.readthedocs.io/en/1.0.2/ specifically csvstat (lots more there, though), including some good transition and report stuff.
I like https://github.com/JosPolfliet/pandas-profiling
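To give a feel for the kind of per-column summary these exploration tools produce, here is a small standard-library sketch. It is not how csvstat or pandas-profiling are implemented, and it reports far less than either. The sample data and the `summarize` helper are made up for illustration.

```python
import csv
import io
import statistics

# Made-up sample data standing in for a real CSV file.
SAMPLE = """name,age,city
Ann,34,Austin
Bob,,Boston
Cy,29,Austin
"""


def summarize(csv_file):
    """Per column: row count, missing count, unique count, and mean if numeric."""
    reader = csv.DictReader(csv_file)
    rows = list(reader)
    summary = {}
    for column in reader.fieldnames:
        values = [row[column] for row in rows]
        present = [v for v in values if v != ""]
        stats = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "unique": len(set(present)),
        }
        try:
            stats["mean"] = statistics.mean(float(v) for v in present)
        except ValueError:
            pass  # non-numeric (or empty) column: no mean reported
        summary[column] = stats
    return summary


report = summarize(io.StringIO(SAMPLE))
for column, stats in report.items():
    print(column, stats)
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper.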
Oh yes, nowadays just start with Python 3.6, not 2.7.