A string is a sequence

  • A string is a sequence of characters
  • You can access individual characters in the sequence using the bracket operator

In [1]:
fruit = 'banana'
letter = fruit[1]
print( letter )


a
  • Why was an a returned instead of b?
  • Computer scientists (and much of Europe) count starting with 0
  • These sequences are referred to as zero-based
  • The bracket notation only accepts integers (or expressions that result in an integer)

The len function

  • len is a built-in function that returns the number of characters in a string

In [2]:
len( fruit )


Out[2]:
6
  • Note that the length is one-based, just like normal counting
  • What happens when you try to access a value that is out of range?

In [3]:
length = len( fruit )
# Uncomment to see what happens
#last = fruit[ length ]
  • Remember that indices are zero-based, while counting is one-based
  • There are multiple ways to get the last letter

In [4]:
last = fruit[ length - 1 ]
last = fruit[ -1 ]
  • You can also use other negative integers to index characters
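  • As a minimal sketch of the indexing rules above: negative indices count backward from the end of the string, and an index outside the valid range raises an IndexError
  • The helper safe_char is a made-up name for this illustration, not something built into Python

```python
fruit = 'banana'

# Negative indices count backward from the end of the string
print( fruit[ -1 ] )             # last character
print( fruit[ -2 ] )             # second-to-last character
print( fruit[ -len( fruit ) ] )  # first character

# safe_char is a hypothetical helper used only for this example
def safe_char( text, index ):
    # Return the character at index, or None if index is out of range
    try:
        return text[ index ]
    except IndexError:
        return None

print( safe_char( fruit, len( fruit ) ) )
```

  • The valid indices for a string of length 6 run from -6 to 5; anything else is out of range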

Traversal with a for loop

  • There is frequently a need to process each item in a sequence one at a time
  • This is done by traversing the sequence and processing each item in turn
  • One way to do this is using a while loop

In [5]:
index = 0
while( index < len( fruit ) ):
    letter = fruit[ index ]
    print( letter )
    index = index + 1


b
a
n
a
n
a
  • A for loop can also be used
  • We can either use the for loop to keep track of the index or have Python automatically access each value in the sequence

In [6]:
# Use the for loop to keep track of the index
for index in range( len( fruit ) ):
    print( fruit[ index ] )

# Have Python keep track of things for us
for char in fruit:
    print( char )


b
a
n
a
n
a
b
a
n
a
n
a
  • The second approach can be simpler, but you don't know the index of the value
  • Loops can also be used to build strings

In [7]:
prefixes = 'JKLMNOPQ'
suffix = 'ack'

for letter in prefixes:
    print( letter + suffix )


Jack
Kack
Lack
Mack
Nack
Oack
Pack
Qack
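  • If you want both the index and the value while traversing, one option is the built-in enumerate function, sketched below
  • The second loop shows one way to build a new string inside a loop; the variable name no_a is chosen only for this example

```python
fruit = 'banana'

# enumerate yields ( index, character ) pairs during traversal
for index, char in enumerate( fruit ):
    print( index, char )

# Build a new string one character at a time, skipping every 'a'
no_a = ''
for char in fruit:
    if char != 'a':
        no_a = no_a + char

print( no_a )
```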

String slices

  • A segment of a string is called a slice
  • The method of selecting a slice is similar to accessing a single character

In [8]:
s = 'Monty Python'
print( s[0:5] )
print( s[6:12] )


Monty
Python
  • The operator [n:m] returns a portion of the string
  • The n-th item is included, but the m-th item is not
  • If you omit an index from the slice, it either uses the beginning or the end of the string depending on which index is omitted

In [9]:
fruit = 'banana'
print( fruit[:3] )
print( fruit[3:] )


ban
ana
  • If the first index is greater than or equal to the second, an empty string is returned
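  • The slice rules above can be checked with a short sketch; the particular indices are arbitrary choices for illustration

```python
s = 'Monty Python'

# The start index is included, the end index is excluded
print( s[ 0:5 ] )          # characters at indices 0 through 4
print( len( s[ 2:7 ] ) )   # a slice [n:m] has m - n characters here

# Splitting at the same index gives two pieces that rejoin into the original
print( s[ :5 ] + s[ 5: ] == s )

# A first index at or past the second index gives an empty string
print( repr( s[ 3:3 ] ) )
print( repr( s[ 7:2 ] ) )
```

  • Because the end index is excluded, s[:k] and s[k:] never overlap and never drop a character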

Strings are immutable

  • Strings are immutable
  • This means you can't change a string once it has been built

In [10]:
greeting = 'Hello world!'
# Uncomment to see the error generated
# greeting[0] = 'J'
  • The error is TypeError: 'str' object does not support item assignment; the object is the string and the item is the character you tried to assign
  • Don't worry about the idea of an object right now
  • We will discuss it later
  • For now, think of it as a variable's value
  • Since strings are immutable, the only way to "change" one is to create a new one with the changes you want

In [11]:
greeting = 'Hello world!'
new_greeting = 'J' + greeting[1:]
print( new_greeting )
print( greeting )


Jello world!
Hello world!
  • Note that the original string is unchanged
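  • A minimal sketch of both points: the assignment is attempted inside try/except so the cell keeps running, and a new string is built instead
  • replace_char is a hypothetical helper written for this example, and it assumes index is non-negative

```python
greeting = 'Hello world!'

# Item assignment on a string raises a TypeError
try:
    greeting[ 0 ] = 'J'
except TypeError as error:
    print( 'TypeError:', error )

# replace_char is a hypothetical helper; it assumes index >= 0
def replace_char( text, index, new_char ):
    # Join the part before index, the new character, and the part after index
    return text[ :index ] + new_char + text[ index + 1: ]

print( replace_char( greeting, 0, 'J' ) )
print( greeting )
```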

Searching

  • In many situations, you need to search for a particular value in a sequence of values
  • The easiest way to do it is using a sequential search

In [12]:
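def find( word, letter ):
    # Returns the index of the first occurrence of letter in word,
    # or -1 if letter is not found
    letter_index = -1
    current_index = 0
    # Stop at the end of the word or as soon as the letter is found
    while( current_index < len( word ) and letter_index == -1 ):
        if( word[current_index] == letter ):
            letter_index = current_index
        current_index = current_index + 1
    return letter_index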
  • Notice that if the letter isn't found in the word, the function returns -1
  • It is common to return a special value indicating failure of some kind
  • However, it needs to be noted in the comments so it is actually useful
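  • As a sketch of how the failure value might be documented and checked, here is a variant written with a for loop
  • The name find_first is made up for this example; the docstring states the -1 convention so callers know to test for it

```python
def find_first( word, letter ):
    """Return the index of the first occurrence of letter in word.

    Returns -1 if letter does not occur in word.
    """
    for index in range( len( word ) ):
        if word[ index ] == letter:
            return index
    return -1

position = find_first( 'banana', 'z' )

# Check for the failure value before using the result as an index
if position == -1:
    print( 'not found' )
else:
    print( 'found at index', position )
```

  • The check matters because -1 is itself a valid index in Python (the last character), so using the result without testing it would silently give the wrong answer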

String methods

  • A method is similar to a function
  • It takes arguments and can return a value, but the syntax to use it is different
  • It operates on a variable that is an object (like a string)
  • All the TurtleWorld code used methods on a turtle

In [13]:
word = 'banana'
new_word = word.upper()
print( new_word )


BANANA
  • In this case, the method upper is called on the string word
  • It returns a new string with all the letters now in uppercase
  • Calling a method like this is called an invocation
  • Strings in Python have a number of methods already built in
  • One of them is find
  • It finds the index of a specified substring (not just a character)
  • Python's documentation lists them all
    https://docs.python.org/3/library/stdtypes.html#string-methods
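  • A short sketch of the built-in find method; the optional second argument is the index at which the search starts

```python
word = 'banana'

print( word.find( 'a' ) )       # index of the first 'a'
print( word.find( 'na' ) )      # substrings work too
print( word.find( 'na', 3 ) )   # start searching at index 3
print( word.find( 'seed' ) )    # -1 means the substring was not found
```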

The in operator

  • Strings have an in operator that returns True if the first string is a substring of the second

In [14]:
print( 'an' in 'banana' )
print( 'seed' in 'banana' )


True
False

String comparison

  • The relational equality operator == works on strings
  • The other relational operators work as well

In [17]:
word = 'apple'

if( word == 'banana' ):
    print( 'All right, bananas.' )
elif( word < 'banana' ):
    print( 'Your word, ' + word + ', comes before banana.' )
else:
    print( 'Your word, ' + word + ', comes after banana.' )


Your word, apple, comes before banana.
  • As we have previously discussed, uppercase letters come before lowercase letters with respect to encoding
  • This means that uppercase letters are considered less than lowercase letters
  • If you need to compare strings, it is common to convert them to all lowercase (or uppercase) before comparing unless you specifically want to compare using case
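  • A minimal sketch of the effect of case on comparison; the two example words are arbitrary

```python
first = 'Banana'
second = 'apple'

# Uppercase 'B' is encoded before lowercase 'a', so this is True
print( first < second )

# Comparing lowercase copies ignores case, so this is False
print( first.lower() < second.lower() )

# The original strings are unchanged because lower returns new strings
print( first, second )
```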

Debugging

  • When you traverse a sequence of items, it is sometimes challenging to get the beginning and ending indices correct
  • These are frequently referred to as off-by-one errors since you stop one early or one late
  • To aid in debugging, it is helpful to display not only the index, but the value at the index as well

In [16]:
word = 'banana'
index = 2
print( 'index=[', index, '] value=[', word[ index ], ']' )


index=[ 2 ] value=[ n ]

Exercises

  • Write a function is_palindrome that takes a string phrase as an argument and returns a boolean indicating whether or not the string is a palindrome.

In [ ]:
def is_palindrome( phrase ):
    # YOUR CODE GOES HERE
    return False
  • As discussed above, Python has an operator called in that determines if a character or a substring is contained in a larger string. It returns True if the character or substring is in the larger string, otherwise, it returns False. Write a function called in_string that implements the same functionality.

In [ ]:
def in_string( substring, larger_string ):
    # YOUR CODE GOES HERE
    return False
  • Write a function that prompts a user to enter their first and last names. The function should then print a computer username consisting of the first letter of their first name and the first seven letters of their last name.

In [ ]:
def create_username():
    username = ''
    # YOUR CODE HERE
    print( username )