Overview of Python Data Analysis Tools

&

Getting Started With Python Data Analysis



Generally

1) Ask questions! I may not be able to give you an answer off the top of my head, but I can probably make something up.

2) Join our internal Python Users Group and download Anaconda3 from the HP Service Manager if you haven't already.

3) Check the hyperlinks. We've got limited time and there are some really good resources (from people smarter than me).

4) Don't worry if you don't understand the code. You're not supposed to.

5) The "Additional Learning Materials" at the bottom of each section are just the things that I can vouch for. If you type in "How to use X" on YouTube or Google, you can easily find more.

Goals

a) Two separate sessions. The idea is to burn through notebooks 1 through 9 as quickly as possible, and then do the exercises as we have time.

b) The first of these sessions is intended to give you a high-level view of what's out there with respect to Python data analysis tools. For the purposes of demonstration, we will be using Anaconda3, which is available in the bank free of cost, instead of a PowerPoint deck.

c) The second of these sessions is intended to give you a more hands-on approach to how you might tackle a particular problem or set of problems.


Python Data Science Tools Overview


1) Introduction

  • #### (You are here.)

2) Jupyter

  • #### An HTML-based GUI for interacting with IPython.

3) Python

  • #### The foundational language on which the stack is built.

4) Matplotlib

  • #### Graphing toolkit tied into the Jupyter Notebook.

5) NumPy

  • #### Fast array library, implemented largely in C, to speed things up.

6) SciPy

  • #### Scientific stuff for people smarter than me.

7) Pandas

  • #### The Python data science workhorse.

8) Scikit-Learn

  • #### Machine learning, model validation, and associated packages.

9) Other

  • #### Database connectors, email, and other useful tidbits.

Getting Started With Python Data Science Tools


10) Python Basics

  • #### Getting started with Python.

11) Pandas Basics

  • #### Series, DataFrames, etc.

12) Pandas Selection and Slicing

  • #### How to sort, select, and slice as needed.

13) Pandas Munging

  • #### Cleaning and prepping your data for analysis.

14) Pandas Aggregation

  • #### Manipulating and aggregating data.

15) Pandas Strings and Time Series

  • #### The .str and .dt accessor namespaces for specialized string and datetime operations.

16) Pandas Apply and Map

  • #### Using custom functions across the DataFrame.

17) Matplotlib Basics

  • #### The basics of plotting.

18) Scikit-Learn Basics

  • #### Machine learning for people with deadlines.
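
For a rough taste of what the Pandas notebooks (11 through 16) cover, here is a minimal sketch. The toy data and every column name in it are invented for illustration and do not come from the notebooks themselves. As noted above, don't worry if the code doesn't make sense yet.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame(
    {
        "branch": [" north", "South ", "north", "south"],
        "opened": ["2020-01-15", "2020-02-20", "2021-03-05", "2021-04-10"],
        "balance": [100.0, 250.0, None, 400.0],
    }
)

# Munging: tidy the text column with the .str namespace and fill the missing value.
df["branch"] = df["branch"].str.strip().str.lower()
df["balance"] = df["balance"].fillna(0.0)

# Time series: parse the dates, then use the .dt namespace to pull out the year.
df["opened"] = pd.to_datetime(df["opened"])
df["year"] = df["opened"].dt.year

# Selection and slicing: keep rows matching a condition, sorted by balance.
large = df.loc[df["balance"] > 50].sort_values("balance", ascending=False)

# Aggregation: total balance per branch.
totals = df.groupby("branch")["balance"].sum()

# Apply: run a custom function over one column.
df["tier"] = df["balance"].apply(lambda b: "high" if b >= 250 else "low")

print(totals)
```

Each commented step lines up loosely with one of the notebooks listed above; the later sessions go through them properly.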