02 - Python Basics (10min)

1. Strings

`Strings` are a common data type in programming languages. They are ordered collections of characters, and typically represent human-readable text.

  • In Python, strings are written between double quotes or single quotes. Either style works, but the opening and closing quotes must match.
  • Strings can be joined together (concatenated) with the + operator.
# Assigning a string to a variable
name = "Hello"

# Concatenating and printing strings
message = name + " world!"
print(message)
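The two points above can be sketched together as follows. This is a minimal illustration, and the variable names (`single`, `double`, `quote`, `greeting`) are made up for this example rather than taken from the workshop material.

```python
# Single and double quotes define exactly the same kind of string
single = 'Hello'
double = "Hello"
print(single == double)

# Using one quote style lets the other appear inside the text
quote = "It's a string"
print(quote)

# Concatenation with + joins the strings in the order given
greeting = single + " " + 'world!'
print(greeting)
```

Note that + does not add a space for you, which is why the example joins in `" "` explicitly.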

In [ ]:
# Assigning a string to a variable

# Concatenating and printing strings

It is very common to combine strings into longer text, often including numbers or other values. A widely used approach to string formatting is percent-sign placeholders:

  • %s to insert a string
  • %i to insert an integer number
  • %f to insert a floating point number
# Define a string
name = "Leighton"

# Use string formatting to place a string and an integer into another string
message = "Hello %s, your name has %i letters" % (name, len(name))
print(message)
# Define a numerical value
my_value = 3

# Represent the numeric value in two different formats
print("This is an integer: %i" % my_value)
print("This is a floating point (real) number: %f" % my_value)

(This convention comes from the C programming language, which was enormously influential on later programming language design, so you will see it in many different languages.)
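As a further sketch of how the three placeholders differ, the same number can be passed through each of them. The variable names here are illustrative only.

```python
# One value, formatted three different ways
value = 3.7

as_int = "%i" % value    # %i drops the fractional part
as_float = "%f" % value  # %f shows six decimal places by default
as_text = "%s" % value   # %s uses the value's usual text form

print(as_int)
print(as_float)
print(as_text)
```

The placeholder controls only how the value is displayed; `value` itself is unchanged.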


In [ ]:
# Define a string

# Use string formatting to place a string and an integer into another string

In [ ]:
# Define a numerical value

# Represent the numeric value in two different formats

3. Lists

The Python `list` serves as a general purpose data structure for holding an *ordered* collection of values.

  • This is similar to an array in other languages, and is commonly used in conjunction with a for loop as shown later.
  • You can have a list of strings, a list of integers, etc. - and you can even mix several types of data in the same list.
  • The length of a list is defined as the number of elements in the list.
# Create a list of strings
names = ["Peter", "Sue", "Leighton"]
print(len(names))
# Create a list of several data types
things = ["a name", 3.5, names]
print(things)
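The definition of list length above has one consequence worth seeing: a list stored inside another list counts as a single element. A minimal sketch, reusing the lists from the example and adding an empty list (`empty`) purely for illustration:

```python
# A list of strings
names = ["Peter", "Sue", "Leighton"]

# A list mixing a string, a number and another list
things = ["a name", 3.5, names]

# An empty list is also valid
empty = []

print(len(names))   # three strings
print(len(things))  # the nested list counts as one element
print(len(empty))   # no elements at all
```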

In [ ]:
# Create a list of strings

In [ ]:
# Create a list of several data types

4. for loops

`for` loops are a very common way to iterate over the items in a sequence and do something with each item.

  • Most programming languages, including Python, have several ways to repeat a block of code multiple times.
  • Python's `for` loop works with a loop variable (`letter` in the example below) which takes, in turn, each of the values to be looped over (here the letters in the string variable `message`):
# Create a string
message = "Hello world"

# Loop over each letter in the string and print it
for letter in message:
    print(letter)

In [ ]:
# Create a string

# Loop over each letter in the string and print it

Another common situation is to loop over a list of values:

# Loop over a list and print each element
for value in ["alpha", "beta", "gamma", "delta"]:
    print(value)
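The body of a loop can do more than print. As a minimal sketch, the loop below combines a list with the string formatting shown earlier and keeps a running total. The variable `total` is introduced just for this illustration.

```python
# A list of strings to loop over
names = ["Peter", "Sue", "Leighton"]

# Running total, updated once per pass through the loop
total = 0
for name in names:
    total = total + len(name)
    print("%s has %i letters" % (name, len(name)))

# This line is not indented, so it runs once, after the loop finishes
print("Total letters: %i" % total)
```

Indentation decides what belongs to the loop: the two indented lines run once per name, and the final `print` runs only once.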

In [ ]:
# Loop over a list and print each element

Elsewhere in the workshop you'll see this syntax used with other constructs.

5. Defining Functions

Often, as your Python code gets longer, you will find that you repeat snippets of code.

It is usually best to turn the repeated code into a *function* which can be defined once and then used multiple times.

  • This reduces the amount of typing you have to do
  • It also reduces the opportunity for introducing mistakes through typos and other errors
# The Python keyword def is short for 'define'
# Here you are defining a function taking one argument
def make_message(name):
    length = len(name)
    # Python keyword 'return' exits the function with this value:
    return "Hello %s, your name is %i characters long" % (name, length)

In [ ]:
# The Python keyword def is short for 'define'
# Here you are defining a function taking one argument


# Make a message three times
print(make_message("Peter"))
print(make_message("Sue"))
print(make_message("Leighton"))

`for` loops are very important for reducing duplicated code. In this example, rather than calling our function three times, we could do this:

# Assumes you've already executed the cells above which defined
# the list 'names' and the function 'make_message'
for name in names:
    print(make_message(name))


In [ ]:
# Assumes you've already executed the cells above which defined
# the list 'names' and the function 'make_message'

  • The function shown so far takes a single argument, but functions can take multiple arguments. Even optional arguments are possible (not shown here).
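For the curious, one common way to make an argument optional is to give it a default value in the function definition. The sketch below is an illustration only: `make_greeting` and its `greeting` parameter are hypothetical names, not part of the workshop code.

```python
# 'greeting' has a default value, so callers may leave it out
def make_greeting(name, greeting="Hello"):
    return "%s %s, your name is %i characters long" % (greeting, name, len(name))

# Uses the default greeting
print(make_greeting("Sue"))

# Overrides the default, by position and then by name
print(make_greeting("Sue", "Welcome"))
print(make_greeting("Sue", greeting="Goodbye"))
```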

The example below is a function which requires two arguments, and we combine it with a list to make a rudimentary base frequency calculator:


In [ ]:
# Define a function that returns the frequency with which
# the character 'letter' occurs in the string 'text'
def letter_frequency(text, letter):
    return text.count(letter) / len(text)

# This can be used as a rudimentary base frequency calculator
sequence = "AGTGACACAGGT"
for base in "ACGT":
    print("Frequency of letter %s is %f" % (base, letter_frequency(sequence, base)))

This example also introduced something new for counting the letters in a string: the `count` method. Python strings have lots of *methods*, a special kind of function that acts on the object itself and is called with the `.method(...)` syntax.


In [ ]:
# Examples of some methods of a string
print(message.upper())
print(message.lower())
print(message.count("l"))
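One detail that is easy to miss: string methods such as `upper()` and `lower()` return a new string and leave the original untouched. A minimal sketch, where `lowered` is an illustrative name:

```python
# The original string
sequence = "AGTGACACAGGT"

# lower() returns a new string; 'sequence' itself is unchanged
lowered = sequence.lower()
print(lowered)
print(sequence)

# Methods can be chained: count "g" in the lower-case copy
print(sequence.lower().count("g"))
```

To keep the result of a method, assign it to a variable, as done with `lowered` here.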

6. Resources

We've tried to introduce a minimum of concepts and syntax here.

There are more Python examples throughout these notebooks, some using more advanced features (such as `with` statements). We don't have time to explain every component of the language in detail, as we want to focus on the bioinformatics instead, but some links are provided below as a starting point for learning more about programming.