Python fundamentals

A quick introduction to the Python programming language and Jupyter notebooks. (We're using Python 3, not Python 2.)

Basic data types and the print() function


In [ ]:
# variable assignment
# https://www.digitalocean.com/community/tutorials/how-to-use-variables-in-python-3

# strings -- enclose in single or double quotes, just make sure they match


# numbers



# the print function



# booleans
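
Here's one way the cell above might be filled in. The variable names and values are made up for illustration.

```python
# variable assignment: the name on the left refers to the value on the right
name = 'Frank'      # a string in single quotes
city = "Denver"     # double quotes work too, as long as they match

# numbers: integers and floats
age = 32
height = 5.9

# the print function accepts one or more values, separated by commas
print(name, city, age, height)

# booleans are True or False (note the capital letters)
is_reporter = True
print(is_reporter)
```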

Basic math

You can do basic math with Python. (You can also do more advanced math.)


In [ ]:
# addition


# subtraction


# multiplication


# division


# etc.
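
A short sketch of the operators mentioned above, using arbitrary numbers. The last three lines are a guess at what "etc." might cover.

```python
total = 4 + 3          # addition
difference = 10 - 4    # subtraction
product = 6 * 7        # multiplication
quotient = 9 / 2       # division always returns a float in Python 3

# etc.
floored = 9 // 2       # floor division drops the remainder
remainder = 9 % 2      # modulo returns the remainder
power = 2 ** 5         # exponentiation

print(total, difference, product, quotient, floored, remainder, power)
```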

Lists

A comma-separated collection of items between square brackets: []. Python keeps track of the order of things inside a list.


In [ ]:
# create a list: name, hometown, age
# an item's position in the list is the key thing


# create another list of mixed data


# use len() to get the number of items in the list




# use square brackets [] to access items in a list
# (counting starts at zero in Python)

# get the first item



# you can do negative indexing to get items from the end of your list

# get the last item



# Use colons to get a range of items in a list

# get the first two items
# the last number in a list slice is the first list item that's ~not~ included in the result



# if you leave the last number off, it takes the item at the first number's index and everything afterward
# get everything from the third item onward



# Use append() to add things to a list



# Use pop() to remove items from the end of a list



# use join() to join items from a list into a string with a delimiter of your choosing
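
One possible walk-through of the list operations above. The sample data (a made-up person and a list of fruit) is only for illustration.

```python
# create a list: name, hometown, age
person = ['Jane', 'Omaha', 41]

# create another list of mixed data
mixed = [1, 'two', 3.0, True, None]

# number of items in the list
count = len(mixed)

# first item (index 0) and last item (index -1)
first = person[0]
last = person[-1]

# slices: the first two items, then everything from the third item onward
first_two = mixed[0:2]
third_onward = mixed[2:]

# append() adds to the end of the list; pop() removes the last item and returns it
person.append('editor')
removed = person.pop()

# join() is called on the delimiter string, and every item must be a string
joined = ', '.join(['apples', 'oranges', 'pears'])
print(joined)
```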

Dictionaries

A data structure that maps keys to values inside curly brackets: {}. Items in the dictionary are separated by commas. As of Python 3.7, dictionaries keep track of the order in which items were inserted, so you only need an OrderedDict if you're working with an older version of Python.


In [ ]:
# Access items in a dictionary using square brackets and the key (typically a string)



# You can also use the `get()` method to retrieve values
# you can optionally provide a second argument as the default value
# if the key doesn't exist (otherwise defaults to `None`)



# Use the .keys() method to get the keys of a dictionary


# Use the .values() method to get the values


# add items to a dictionary using square brackets, the name of the key (typically a string)
# and set the value like you'd set a variable, with =



# delete an item from a dictionary with `del`
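
A sketch of the dictionary operations above, using a made-up record. The keys and values are assumptions for illustration.

```python
person = {'name': 'Jane', 'hometown': 'Omaha', 'age': 41}

# access a value with square brackets and the key
name = person['name']

# get() returns None for a missing key, or the default you supply
job = person.get('job')
job_or_default = person.get('job', 'unknown')

# keys() and values() return views; wrap them in list() to get plain lists
keys = list(person.keys())
values = list(person.values())

# add an item by assigning to a new key
person['job'] = 'editor'

# delete an item with del
del person['hometown']
print(person)
```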

Commenting your code

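Python ignores everything on a line after a hash sign (#) -- use it to write comments that help explain the code to others (and to your future self).

Python doesn't have a dedicated syntax for multi-line comments, but a string enclosed in triple quotes (""" """) that isn't assigned to anything is commonly used for that purpose.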
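
A small illustration of both styles (the arithmetic is just filler):

```python
# this whole line is a comment
total = 2 + 2  # a comment can also follow code on the same line

"""
This triple-quoted string isn't assigned to anything,
so Python evaluates it and moves on. In practice it
behaves like a multi-line comment.
"""

print(total)
```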


In [ ]:

Comparison operators

When you want to compare values, you can use these symbols:

  • < means less than
  • > means greater than
  • == means equal
  • >= means greater than or equal
  • <= means less than or equal
  • != means not equal
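
Each comparison evaluates to a boolean. A quick sketch with two arbitrary numbers:

```python
a = 4
b = 5

is_less = a < b          # True
is_greater = a > b       # False
is_equal = a == b        # False (note the double equals sign; a single = assigns)
at_least = a >= 4        # True
at_most = b <= 4         # False
not_equal = a != b       # True

print(is_less, is_greater, is_equal, at_least, at_most, not_equal)
```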

In [ ]:

String methods

Python has a number of built-in methods to work with strings. They're useful if, say, you're using Python to clean data. Here are a few of them:

strip()

Call .strip() on a string to remove whitespace from both ends. It's like using the =TRIM() function in Excel.
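
For example, with a made-up value that has stray spaces and a trailing newline:

```python
raw = '   Omaha, Neb.  \n'

# strip() returns a new string; the original is left unchanged
clean = raw.strip()
print(clean)
```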


In [ ]:

upper() and lower()

Call .upper() on a string to make the characters uppercase. Call .lower() on a string to make the characters lowercase. This can be useful when testing strings for equality.
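
A quick sketch, using two made-up spellings of the same city name:

```python
a = 'Des Moines'
b = 'DES MOINES'

shout = a.upper()
whisper = a.lower()

# the raw strings don't match, but their lowercased versions do
raw_match = a == b
normalized_match = a.lower() == b.lower()
print(shout, whisper, raw_match, normalized_match)
```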


In [ ]:

replace()

Use .replace() to substitute bits of text.
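
The first argument is the text to find and the second is the text to swap in. A sketch with a made-up headline:

```python
headline = 'Mayor vetoes budget'

# replace() returns a new string and swaps every occurrence it finds
updated = headline.replace('vetoes', 'signs')
print(updated)
```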


In [ ]:

split()

Use .split() to split a string on some delimiter and get back a list of the pieces. If you don't specify a delimiter, it splits on any run of whitespace (spaces, tabs, newlines).
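
A few made-up strings to show the difference between calling .split() with and without a delimiter:

```python
# no argument: split on any run of whitespace
words = 'Jane Q.   Reporter'.split()

# explicit delimiter
fields = 'a,b,c'.split(',')

# an explicit single space treats every space as a separator,
# so consecutive spaces produce empty strings
pieces = 'one   two'.split(' ')

print(words, fields, pieces)
```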


In [ ]:

zfill()

Among other things, you can use .zfill() to add zero padding -- for instance, if you're working with ZIP code data that was saved as a number somewhere and you've lost the leading zeroes for that handful of ZIP codes that begin with 0.

Note: .zfill() is a string method, so if you want to apply it to a number, you'll need to first coerce it to a string with str().
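
A sketch of the ZIP code scenario described above, with a made-up value that lost its leading zero:

```python
zip_code = 2134

# coerce to a string first, then pad with zeroes to a width of 5
padded = str(zip_code).zfill(5)
print(padded)
```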


In [ ]:

slicing

Like lists, strings are sequences, so you can use slicing to grab chunks.
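
For instance, pulling the pieces out of a made-up phone number:

```python
phone = '402-555-1212'

area_code = phone[:3]     # the first three characters
exchange = phone[4:7]     # characters at index 4, 5 and 6
last_four = phone[-4:]    # the last four characters
print(area_code, exchange, last_four)
```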


In [ ]:

startswith(), endswith() and in

If you need to test whether a string starts with a series of characters, use .startswith(). If you need to test whether a string ends with a series of characters, use .endswith(). If you need to test whether a string is part of another string -- or an item in a list of strings -- use the in operator (it's a keyword, not a method, so there are no parentheses).

These are case sensitive, so you'd typically .upper() or .lower() the strings you're comparing to ensure an apples-to-apples comparison.
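
A sketch of all three tests, lowercasing a made-up city name first so the comparisons aren't thrown off by capitalization:

```python
city = 'Saint Paul'
city_lower = city.lower()

starts = city_lower.startswith('saint')
ends = city_lower.endswith('paul')

# `in` is an operator: the thing you're looking for goes on the left
has_substring = 'int pa' in city_lower
in_list = 'omaha' in ['omaha', 'lincoln']

# without lowercasing, the case mismatch makes this False
case_sensitive = city.startswith('saint')

print(starts, ends, has_substring, in_list, case_sensitive)
```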


In [ ]:

String formatting

Using curly brackets with the various options available to the .format() method, you can create string templates for your data. Some examples:


In [ ]:
# date in m/d/yyyy format


# split out individual pieces of the date
# using a shortcut method to assign variables to the resulting list


# reshuffle as yyyy-mm-dd using .format()
# use a formatting option (:0>2) to left-pad month/day numbers with a zero
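
One way the cell above might be filled in, using a made-up date:

```python
# date in m/d/yyyy format
date = '3/8/2019'

# split() returns a three-item list, which can be unpacked into three variables
month, day, year = date.split('/')

# :0>2 means: pad with '0', right-align, to a width of 2
iso_date = '{}-{:0>2}-{:0>2}'.format(year, month, day)
print(iso_date)
```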

In [ ]:
# construct a greeting template
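
A sketch of a reusable template with named placeholders. The wording and the names are made up for illustration.

```python
greeting = 'Hello, {name}! Welcome to {city}.'

# the same template can be filled in with different values
message_one = greeting.format(name='Jane', city='Omaha')
message_two = greeting.format(name='Frank', city='Denver')
print(message_one)
print(message_two)
```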

Type coercion

Consider:

# this is a number, can't do string-y things to it
age = 32

# this is a string, can't do number-y things to it
age = '32'

There are several functions you can use to coerce a value of one type to a value of another type. Here are a few of them:

  • int() tries to convert to an integer
  • str() tries to convert to a string
  • float() tries to convert to a float

In [ ]:
# two strings of numbers


# what happens when you add them without coercing?


# coerce to integer, then add them
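
A sketch of the cell above with two arbitrary values:

```python
# two strings of numbers
a = '10'
b = '32'

# adding strings glues them together instead of doing arithmetic
concatenated = a + b

# coerce to integer, then add them
summed = int(a) + int(b)

# the other coercion functions work the same way
as_float = float('3.5')
as_string = str(32)

print(concatenated, summed, as_float, as_string)
```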