Piscataway Machine Learning Meetup

Introduction to Python 01

Python is a scripting language that is easy to learn. Its syntax is simple to grasp, which makes it popular with programming novices. Thanks to the SciPy project, it is also a convenient language for solving machine learning problems!

Data Types

All programming languages classify data into types. Understanding the differences between these data types will make programming MUCH easier.

Integer: Just like in arithmetic, integers are whole numbers, written without a decimal point.
Float: Short for "floating-point number". Floats are numbers written WITH a decimal point.
String: Short for "a string of characters". Strings are any kind of text.
Boolean: Named after the mathematician George Boole, Boolean values are either True or False.
List: This is an ordered collection of values. A list may contain any mix of data types (integers, strings, even other lists), although in practice most lists hold values of a single type. Lists can be altered after they are created.
Tuple: This is also an ordered collection of values. It's like a list and can contain different data types, but it cannot be altered once created.
None: This signifies a missing value. It's equivalent to NULL in databases.
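
As a quick illustration of the List and Tuple entries above, here is a minimal sketch. The variable names are made up for this example and are not used elsewhere in the tutorial. It shows a list holding mixed types and being changed after creation, while a tuple refuses the change.

```python
# Hypothetical example values, chosen only for illustration.
mixed_list = [1, "two", 3.0]      # a list may mix integers, strings, floats, ...
mixed_list.append(None)           # lists can be altered after creation
print(mixed_list)

fixed_tuple = (1, "b", False)
try:
    fixed_tuple[0] = 99           # tuples cannot be altered once created
except TypeError as error:
    tuple_error = str(error)
    print("Cannot change a tuple:", tuple_error)
```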



In [1]:

    
# COMMENTS begin with a pound sign (#) and extend to the end of the line.
# Comments are ignored by the computer. They are used to explain what the code is supposed to do.
# It is best practice to use LOTS of comments. 
# That way, when you look back at your code, you can more quickly understand what you meant to do.

# The type function tells us the data type of whatever value is passed to it.
# The print function displays the output of whatever value is passed to it.
# When used together, they will display the data type of whatever value is inside all of those parentheses.
# We'll talk about functions in just a little bit.

# Integer
print(type(0))

# Float
print(type(1.2))

# String
print(type("Hello!"))

# Boolean
print(type(True))

# List
print(type([1,2,3,4]))

# Tuple
print(type((1,"b",False)))

# None
print(type(None))

<class 'int'>
<class 'float'>
<class 'str'>
<class 'bool'>
<class 'list'>
<class 'tuple'>
<class 'NoneType'>

Variables and assignment

Variables are containers that can hold any values. As the name suggests, the value that they hold can change. When you make the variable hold a value, we say that you assign a value to the variable.

Unlike in many other programming languages, variables in Python are "untyped": a variable has no fixed data type, so you are allowed to assign values of different data types to the same variable over the course of your program. Below, we assign an integer to myVariable, then a different integer, then a string.



In [4]:

    
myVariable = 4
print(myVariable)
# Can you take the code from the previous cell and show what the type of "myVariable" is?


myVariable = 5
print(myVariable)
# Let's check the data type of myVariable again


myVariable = "Hello!"
print(myVariable)
# Please check the data type of myVariable one more time


# To run the code you've just written, press the "Play" button on the toolbar.
# If we were not allowed to assign these values to myVariable, we would have seen an error below.

4
5
Hello!
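
One possible way to carry out the type checks suggested in the comments above is sketched below. It uses a separate, made-up variable name (example_value) so that it can be run on its own.

```python
# example_value is a hypothetical name used only for this sketch.
example_value = 4
first_type = type(example_value)
print(first_type)                 # <class 'int'>

example_value = "Hello!"          # the same variable now holds a string
second_type = type(example_value)
print(second_type)                # <class 'str'>
```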

Functions

Functions are pieces of code that are "saved" and can be re-run whenever you need to. They help to organize your code so that you don't have to re-write the same thing over and over again. They can also be used for some advanced techniques such as recursion. You can create a function using the def keyword. You may also see the lambda keyword used to create small one-line functions.

When you use a function in your code, we say that you call the function. To call a function, place a pair of parentheses after the function name. Functions can take values as input. Those values are called parameters, or arguments. When you provide a value to the function, we say that you pass a value to the function. As a rule of thumb, functions should only work with data that are passed in. To pass a parameter into a function, place it in the parentheses. There are rules surrounding what values you can pass and when, but it's easiest to learn these rules by imitating other people's code.
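
The vocabulary above can be seen in a short sketch. The function names describe and double are made up for this illustration. One function is created with def, one with lambda, and both are called by passing a value in parentheses.

```python
# Hypothetical function names, chosen only for this illustration.
def describe(value):
    """Return a short description of a value and its data type."""
    return str(value) + " has type " + type(value).__name__

# lambda creates a small one-line function; this one doubles its argument.
double = lambda n: n * 2

print(describe(3))             # call describe, passing 3 as the argument
print(double(21))              # call double, passing 21
print(describe(double(1.5)))   # the result of one call is passed to another
```

In longer programs, def is usually the better choice because the function gets a proper name and can span several lines.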



In [11]:

    
# Earlier in the tutorial, we covered the print function. Print the String "Hello, world!"


# We also covered the type function. Find the type of the String "Hello, world!"


# Daisy-chaining functions in that manner is extremely useful when working with data.

# We can also define our own functions. Here, I've written a simple 3-number adder.

def ThreeNumberAdder(n1, n2, n3):
    return n1 + n2 + n3

# And now I can test it out:
print(ThreeNumberAdder(1, 2, 3))



In [12]:

    
# Copy that function below, and see if you can modify it to multiply 3 numbers instead.



# You can test your ThreeNumberMultiplier below here.

Classes, modules, and packages

In Python, you are allowed to create new data types. You can do this by creating a class. We won't actually create any classes, but just be aware that we will be using classes that other people have written. That means that we will be using more data types than just the integers, strings, booleans, etc. that we covered earlier.

Packages are large collections of code which are designed to add features to a programming language. Packages are made up of smaller components called modules. We will be using 6 main packages in our exploration of data science: NumPy, pandas, SciPy, Matplotlib, scikit-learn, and TensorFlow. These packages are all interrelated, so it can be hard to figure out which ones you need. Don't worry too much about the details right now. As you gain more experience, you will understand when to use which package. To use a package or module, you need to import it.
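
Here is a minimal sketch of importing. It uses modules from Python's standard library rather than the six packages named above, so nothing extra needs to be installed. It also shows a class that someone else wrote (Fraction) acting as a new data type.

```python
import math                       # import a whole module
import statistics as stats        # import a module under a shorter alias
from fractions import Fraction    # import a single class from a module

print(math.sqrt(16))              # 4.0
print(stats.mean([1, 2, 3, 4]))   # 2.5

half = Fraction(1, 2)             # a value whose data type is a class
print(type(half))                 # <class 'fractions.Fraction'>
print(half + half)                # 1
```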



In [10]:

    
# Usually we put these imports at the top of the file, so that it's easy to tell what packages are required

import numpy as np                # This imports the entire package and lets us refer to it as np.
import matplotlib.pyplot as plt   # This imports just pyplot from matplotlib and calls it plt.

# Most tutorials use these conventional import statements, so it's a good idea to always use these aliases.

# Now try importing the pandas package and calling it pd.


