A string is an immutable, indexable sequence of characters in Python. Strings are used in many programming languages to hold text. We define a string as a sequence because each of its characters can be accessed individually in Python. A string can be composed of any sequence of characters. In this lecture, we will cover:
1. Declaring strings
2. Printing strings
3. Escape Sequences
4. Common string functions and methods
5. Indexing and Slicing strings
6. String properties
7. More on Printing
Assigning a string to a variable is just like assigning any other value. You define a string to the right of the "=" operator by enclosing the text in either single quotes 'string here' or double quotes "string here".
In [3]:
# This is a string
"Hello"
Out[3]:
In [4]:
# This is also a string
'Hey, there!'
Out[4]:
In [5]:
# Assigning strings to variables
my_string = "Well, here I am!"
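The two quote styles are interchangeable, which is handy when the text itself contains a quote character. A minimal sketch (the variable names are only for illustration):

```python
# Double quotes on the outside let us use an apostrophe inside
with_apostrophe = "I'm learning Python"
# Single quotes on the outside let us use double quotes inside
with_quotes = 'She said "hello" to me'
# Both quote styles produce the same kind of object
same = "Hello" == 'Hello'
print(with_apostrophe)
print(with_quotes)
print(same)
```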
Recall that in the Numbers notebook, we decided that printing was preferred over relying on the REPL. This is because real code rarely runs in the REPL alone, so printing is necessary to see results. We print strings by calling the print function.
In [7]:
# Printing a string
print("Hey, there!")
# Printing another string in the same cell, not possible with just the REPL!
print("Hey, there... again!")
You may be wondering if it's possible to print more than one line (or sentence) with a single print function. This is entirely possible with an "escape sequence". An escape sequence is a backslash followed by one or more characters that lets us print characters we normally couldn't type directly. For example, how could you create new lines in your printed text? Let's try writing an email.
In [13]:
print("To whom it may concern,\n\nI'm currently in a Python Bootcamp right now and I'm starting to hone my skills.\n\nBest,\nMohsin")
We just printed a multi-line message with one print function. Here, we used the newline escape sequence, written as "\n". The \n tells the interpreter to start a new line. This allowed us to make our text look like an email. We can use two \n in a row, as we did above, to leave a blank line. The same applies to all other escape sequences. There are a few others you should know, including the tab escape sequence, "\t", which lets you insert tabs in your printed strings.
In [16]:
# Tab escape sequence
print("\t<-- Look at that space!")
There are a few more escape sequences. Note that there is no need to memorize them, as you can just look them up and they will become second nature over time as you need them. View them on Microsoft's Developer Network article: https://msdn.microsoft.com/en-us/library/h21280bw.aspx
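As a rough sketch of a few other escape sequences you are likely to meet first (the sample text and the path are made up for illustration):

```python
# A backslash before a quote prints the quote itself
quoted = "She said \"hi\""
# Two backslashes print a single backslash
path = "C:\\Users\\me"  # hypothetical path, only for illustration
# \t and \n can be mixed in one string
table = "name\tage\nAda\t36"
print(quoted)
print(path)
print(table)
```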
You might be wondering, what's the difference between a function and a method? Technically, all methods ARE functions. Methods are functions that belong to objects (remember, everything is essentially an object in Python). Functions in Python are either defined by you, the programmer, or already defined by the Python developers; the latter are called "Built-in Functions". These are functions you can call at any time and they'll just work. Here is an example of a built-in function.
In [18]:
# The built-in float function takes in an integer and returns a float -> float(integer_here)
my_integer = 4
my_float = float(my_integer)
print(my_float)
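A few other built-in functions behave the same way: you pass a value in and get a result back. This is only a small sample, chosen for illustration:

```python
# int() converts a float to an integer
whole = int(4.0)
# str() converts a number to its text form
as_text = str(4)
# abs() returns the absolute value
distance = abs(-7)
# type() reports what kind of object a value is
kind = type(4.0)
print(whole, as_text, distance, kind)
```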
What we saw above is an example of a Built-in FUNCTION. These come with Python. We can also define our own functions/methods. Below, we define a function that returns True if a number is even and False if it is odd.
In [24]:
def even_or_odd(some_number):
    # True for even numbers, False for odd ones
    if some_number % 2 == 0:
        return True
    return False
So, what kind of function is this? Because it is defined on its own rather than attached to an object, and it works with a built-in object type in Python (the number types), it is called a function. It's not a built-in function, because we made it ourselves. In short, we can classify functions and methods into the following categories:
1. Built-in functions (https://docs.python.org/3/library/functions.html)
> These are already with Python
2. Programmer created functions (even_or_odd which is above)
> These functions are written by you and typically work with Python's built-in objects like numbers, strings, etc.
3. Programmer created methods (an example later)
> These methods are functions that belong to programmer created objects. We haven't created an object yet, but when we do, we will define functions that work with our custom objects. Those functions are known as methods.
Back to the point: strings have a few functions associated with them also. The one you should know is the len() function. It returns the length of a string.
In [29]:
# The len function
len("hello")
Out[29]:
In [30]:
# We can print the result of the len() function
print(len("Hello"))
In [32]:
# Let's store it in variable form
length_of_some_string = len("Hello")
# now print whatever len("Hello") returns
print(length_of_some_string)
String methods are functions that are associated with a specific string instance, whereas string functions take a string as an argument. Note: the word instance refers to a concrete object of a given type, in Python or any other language that has objects.
In [33]:
# Creating a string instance is just the same thing as assigning a variable a value
c = "Hello"
The "c" variable now refers to a string instance created from the string literal "Hello". The term "string literal" means text written inside quotes in your code, so the actual string itself.
Now that we have a string object/instance, let's call methods on it specifically. Methods are called as follows:
object.method(arguments)
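To make the object.method(arguments) pattern concrete, here is a small sketch contrasting a built-in function, a programmer-created function, and a method on a programmer-created object. The names shout and Cat are hypothetical and exist only for illustration; defining your own objects is covered later.

```python
# A programmer-created function: it stands alone and takes the string as an argument
def shout(text):
    return text.upper() + "!"

# A programmer-created object type (hypothetical example class)
class Cat:
    def __init__(self, name):
        self.name = name

    # A method: a function that belongs to Cat objects
    def greet(self):
        return "Meow, I'm " + self.name

built_in_result = len("hello")      # built-in function
function_result = shout("hello")    # programmer-created function
method_result = Cat("Tom").greet()  # programmer-created method
print(built_in_result, function_result, method_result)
```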
Strings have quite a few methods. The ones we'll look at are upper(), lower(), split(), and format().
In [34]:
my_str = "python is great"
In [35]:
# uppercase a string with upper() method
print(my_str.upper())
In [40]:
# lowercase string with lower() method
print(my_str.lower())
Below, we call the split() method on my_str. The split method takes each contiguous sequence of non-whitespace characters, or word, in the string and puts each word inside a data structure called a list. A list is a mutable, indexable sequence of objects in Python. We'll talk about it more later, but, for now, imagine it just holds objects (basically anything) together in a bundle.
In [41]:
# split a string
lst = my_str.split()
print(lst)
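Two details are worth seeing in code: these methods return a new string and leave the original untouched, and split() also accepts an optional separator. A short sketch, where csv_line is made-up sample data:

```python
my_str = "python is great"
# upper() returns a new string; my_str itself is unchanged
shouted = my_str.upper()
# split() can take a separator to split on instead of whitespace
csv_line = "red,green,blue"  # hypothetical sample data
colors = csv_line.split(",")
print(shouted)
print(my_str)
print(colors)
```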
Watch out for the format() method in the printing section at the bottom.
Strings, as we defined before, are indexable. Effectively, this means you can access the individual characters that make up a string. How do we do this? We use bracket notation and specify an index from 0 to the length of the string - 1. Let n be the length of the string; then we can access its characters as such:
my_string_here[0]
my_string_here[1]
.
.
.
my_string_here[n-1]
Let's try declaring some strings and finding the start, end, middle, etc.
In [44]:
# Declare a string
my_str = "crabs"
In [45]:
# The beginning of a string is simply index 0
print(my_str[0])
It printed "c", as we expected, because index 0 is like the number 1 in our everyday counting. Computer systems start counting at 0. How would we get the last character of a string? We could just count the characters in our string and subtract 1.
Thought process: the length of "crabs" is 5. We know that the last character is at 5-1 = 4 because we start with the number 0 in indexing. So, theoretically, my_str[4] should return the last character of "crabs", which would be 's'.
In [46]:
print(my_str[4])
We did it! But imagine you were working with user-inputted strings whose contents you don't know in advance. How would you get the last character of such a string? Simple! Python indexing works backwards, too! You can use the index my_str[-1] to get the character one step before the first (0th), which is the last. Negative indexes count from the end of the string.
In [47]:
print(my_str[-1])
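The relationship between negative and positive indexes can be sketched as follows: my_str[-k] refers to the same character as my_str[len(my_str) - k].

```python
my_str = "crabs"
# Negative indexes count from the end: -1 is the last character
last = my_str[-1]
second_to_last = my_str[-2]
# -1 and len(my_str) - 1 point at the same character
same_char = my_str[-1] == my_str[len(my_str) - 1]
print(last, second_to_last, same_char)
```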
In [48]:
# For fun, let's print each character
print(my_str[0])
print(my_str[1])
print(my_str[2])
print(my_str[3])
print(my_str[4])
What if we wanted more than just one character? How about the first character all the way to the second to last character? This process is called slicing. We can slice strings to get ranges of their characters. Let's get the first to second last.
In general, slicing works like so:
my_str[start:end]
where "start" is inclusive, and "end" is exclusive.
Thought process: because the "end" of a slice is not included, getting the first to second-to-last characters simply means letting start be 0 and end be the length of the string minus 1 (because that is the index of the character we don't want).
In [49]:
# Get the first to second to last character.
my_str[0:len(my_str)-1]
Out[49]:
What just happened? Remember that we said that slicing works like so:
my_str[start(inclusive):end(exclusive)]
Thus, if we want to begin at the very first character, we let start be 0. For the end, we put the index of the first character we don't want included. That means if we don't want the last character, we use its index as end. The index of the last character is n-1, where n is the length of the string. We reduce to this formula:
my_str[0:lengthofstring - 1]
How do we get the length? Recall the len() built-in function. We call it to get the length, and subtract 1 to get
my_str[0:len(my_str)-1]
which yields a satisfying "crab" instead of "crabs".
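Since negative indexes also work inside slices, the same result can be written more briefly. A small sketch comparing the two forms, plus a slice taken from the middle:

```python
my_str = "crabs"
# The long form used above
explicit = my_str[0:len(my_str) - 1]
# Omitting start means "from the beginning"; -1 means "stop before the last character"
shorter = my_str[:-1]
# A slice from the middle: indexes 1, 2 and 3 (index 4 is excluded)
middle = my_str[1:4]
print(explicit, shorter, middle)
```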
There's a shortcut for getting from any start to the very end of a string (ex: get all characters from the 3rd to the very end, or from the 2nd to the very end). Simply leave out the end argument and fill in start with the index you want to start from (it is included).
my_str[start:]
In [51]:
# go from the 3rd character (index 2) to the end
print(my_str[2:])
Who thought you'd get abs from this coding exercise?
Lastly in slicing, we also have the skip argument. Here's the full slicing format:
my_str[start:end:skip]
How would you get every other character, going to the end? Well, we would start at 0 to get the first, we would let end be empty because we want to go all the way anyways, and let skip be 2 to skip every other character.
In [53]:
#Get every other character of my_str
print(my_str[0::2])
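Changing start or skip selects different characters. A brief sketch, with the chosen indexes noted in the comments:

```python
my_str = "crabs"
every_other = my_str[0::2]  # indexes 0, 2, 4
from_second = my_str[1::2]  # indexes 1, 3
every_third = my_str[::3]   # indexes 0, 3
print(every_other, from_second, every_third)
```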
How would we reverse a string? We know that the start would be the end, and the end would be the start. By default, our skip moves forward by "1". Just as we used -1 earlier to index from the end, we can change the skip to "-1" to go backwards. If we go backwards, do we need a start or an end? No! With both left out, the slice covers the whole string. Consider:
my_str[::-1]
to reverse a string.
In [54]:
# Reverse a string
my_str[::-1]
Out[54]:
What properties do strings have? Well, we said earlier strings are indexable. We already talked about that. We also said that strings were "immutable". To be immutable means to not be able to be changed. A string cannot be altered in place by indexing. Take a look:
In [56]:
my_str[2] = "e"
Note how the interpreter raised: TypeError: 'str' object does not support item assignment
Strings cannot be changed this way. You cannot pick an index and change its character. Look, it doesn't work:
In [58]:
my_str[1] = "ee"
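If you want to see the error without stopping your program, one option is to catch it. This sketch uses try/except, which has not been covered yet, so treat it as a preview rather than something you need right now:

```python
my_str = "crabs"
message = None
try:
    # Item assignment is not allowed on strings
    my_str[2] = "e"
except TypeError as error:
    message = str(error)
    print("Could not change the string:", message)
# The string is exactly what it was before
print(my_str)
```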
However, we can do two things with strings: concatenation and reassignment. That's right, while you can't change a single character of your string instance, you can concatenate characters to it or reassign the variable to whatever string you want. This is allowed because both operations create a new string and bind the variable to it; the original string is never changed in place.
In [60]:
# Reassigning a string to whatever we want
my_str = "black labs"
In [61]:
# Reassign again
my_str = "parrots"
We can also concatenate, which means we can append (add to the end) characters.
In [63]:
# adding to our parrots string
my_str = my_str + " that can talk"
print(my_str)
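Slicing and concatenation together give a way to "change" one character: build a new string from the pieces you want to keep. A minimal sketch, assuming we want to replace the character at index 2 of "crabs":

```python
my_str = "crabs"
# Everything before index 2, then the new character, then everything after index 2
new_str = my_str[:2] + "e" + my_str[3:]
print(new_str)
# The original string was not modified
print(my_str)
```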
So far, every print call has taken a single object as its argument. Meaning, we haven't yet printed two separate values inside one print function. However, it is possible.
In [70]:
x = 5
In [71]:
print("I have %d cats" % x)
This is called the "str %" construct. It allows us to substitute the %letter with any variable we want. It's %d for integers, %f for floats, and %s for strings.
In [73]:
my_str = "hello"
print("%s, it's me" % my_str)
In [76]:
pi_val = 3.140000
print("I think pi is around %f" % pi_val)
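By default %f shows six digits after the decimal point. A precision can be placed between the % and the f to control this. A short sketch, where the value of pi_val is chosen just for illustration:

```python
pi_val = 3.14159
# Default: six digits after the decimal point
default_text = "I think pi is around %f" % pi_val
# %.2f keeps only two digits after the decimal point
rounded_text = "I think pi is around %.2f" % pi_val
print(default_text)
print(rounded_text)
```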
In [85]:
# You can add multiple arguments
x = 5
y = "hello"
print("%s it's me, I was wondering how you've been these past %d months" % (y, x))
You should be familiar with this syntax because many programmers still use it. However, in the modern age, many people have started using the format() method. It does the same thing, but in a more organized fashion. The syntax is the following:
my_str.format(var1, var2)
Below is an example where you want to use a pre-defined variable.
In [1]:
x = 5
# Format an integer into a string
print("Hey there you have {} apples".format(x))
Below is an example where you define the variables as keyword arguments in the format() call itself.
The format is similar:
my_str.format(var1=somevalue, var2=....)
In [2]:
print("I've got {x} problems, and programming in python ain't {my_str}".format(x=99, my_str="one"))
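The braces can also hold a position number or a format specification. A brief sketch of both, reusing the sentence from the "str %" example above; the price text is made up for illustration:

```python
# {0} and {1} refer to the arguments by position
greeting = "{0} it's me, I was wondering how you've been these past {1} months".format("hello", 5)
# The same position can be used more than once
repeated = "{0} and {0} again".format("hello")
# {:.2f} formats a float with two digits after the decimal point
price = "That costs {:.2f} dollars".format(3.5)
print(greeting)
print(repeated)
print(price)
```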