T.N.Olsthoorn, Feb 27, 2017
Last week we focused on the essential tools in Python, i.e. tuples, lists, dicts and sets. Strings are another such tool, but strings become really effective when combined with reading files and with lists, dicts and sets. That's why we postponed this subject to this week, giving us more room to work with them.
We've already mentioned that strings are immutable sequences of characters. So we can't replace individual characters or change them from uppercase to lowercase in place, nor can we append characters. What we can do is replace an old string with a new one and apply the required changes while doing so. Immutability is therefore rarely a problem; on the contrary, it allows strings to be used as keys in dicts, which is a major advantage.
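A minimal sketch of what immutability means in practice. The names `word`, `capitalized` and `counts` are made up for this illustration and do not appear elsewhere in the notebook.

```python
# Strings are immutable: item assignment fails, so we build a new string instead.
word = "python"
try:
    word[0] = "P"              # not allowed for str, raises TypeError
except TypeError as err:
    message = str(err)

capitalized = "P" + word[1:]   # a new string; `word` itself is unchanged

# Because strings are immutable (and therefore hashable) they can be dict keys.
counts = {}                    # hypothetical word counter
for w in "to be or not to be".split():
    counts[w] = counts.get(w, 0) + 1

print(message)
print(word, capitalized)
print(counts)
```

The original string survives untouched, and the dict uses the words themselves as keys.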
Let's see some strings:
In [172]:
from pprint import pprint
In [82]:
s1 = 'This is a string'
s2 = "This too is a string; the `quotes` don't matter as long as you are consistent, and you can use quotes inside quotes"
s3 = """This is a multiline
string, mostly used for doc
strings in functions and classes
"""
print(s1)
print(s2)
print()
print(s3)
The escape character \ (backslash) is used to write special characters that otherwise cannot be typed, like
the newline \n and the tab \t. There are many more, but these two are the most important.
To prevent the backslash from being interpreted as the start of a special character, use a double backslash.
This is often necessary when typing or using strings that represent directories in Windows.
For example C:\users\system\python\IHEcourse
In [190]:
print("This prints with tabs \t\t and newlines\n\n")
print("A windows directory: C:\\users\\system\\python\\IHEcourse")
If you don't want the \ to be interpreted by Python you should use "raw" strings, which you get by putting the lowercase letter r in front of the string.
In [191]:
print(r"This prints with tabs \t\t and newlines\n\n")
print(r"A windows directory: C:\\users\\system\\python\\IHEcourse")  # raw: the doubled backslashes are printed exactly as typed
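One way to see what the r prefix does is to compare string lengths. This is a small sketch; the variable names `normal`, `raw` and `escaped` are chosen just for this example.

```python
normal = "a\tb\n"       # \t and \n are each ONE character (tab, newline)
raw = r"a\tb\n"         # in a raw string the backslash and the letter stay two characters
escaped = "a\\tb\\n"    # doubled backslashes give the same text as the raw string

print(len(normal), len(raw))   # the raw string is two characters longer
print(raw == escaped)          # both spellings describe the same characters
```

So a raw string is only a different way of typing a string; the resulting object is an ordinary str.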
String addition with + is concatenation and multiplication with * means multiple concatenation:
In [19]:
"This is " + 'a `so-called` concatenated' + ' ' + "string" + (', ha') *3 + '!'
Out[19]:
Or bind the string to a variable, like s, and print it:
In [20]:
s = "This is " + 'a `so-called` concatenated' + ' ' + "string" + (', ha') *3 + '!'
print(s)
Strings can contain replacement fields { } and then be formatted with the method format(...).
The values in myList are placed in the replacement fields { }. But before doing this, a string is built like so
" {}," * len(myList)
to get a string with the required number of such fields.
In [13]:
myList = [3.0, 2.279345, 1.9823, -3.4, 1e3]
(" {}," * len(myList)).format(*myList)
Out[13]:
Notice that format() is a method of the string class. It's used intensively in print statements to put values into the strings to be printed. There are many options to manage the way values are shown after they have been placed in these fields; format() is said to have its own mini-language. You'll get acquainted with it over time, but it is very useful to read the documentation, or at least to be aware of it.
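The repeated-field construction used above leaves a leading space and a trailing comma in the result. One possible alternative, shown here only as a sketch, is to format each value separately and glue the pieces together with join(); the names `repeated`, `joined` and `joined_fixed` are made up for this example.

```python
myList = [3.0, 2.279345, 1.9823, -3.4, 1e3]

# The approach from the text: repeat the field, then fill all fields at once.
repeated = (" {}," * len(myList)).format(*myList)

# Alternative sketch: format every value on its own and join with ", ".
joined = ", ".join("{}".format(v) for v in myList)

# The same idea with a format specifier (two decimals) for each value.
joined_fixed = ", ".join("{:.2f}".format(v) for v in myList)

print(repr(repeated))
print(repr(joined))
print(repr(joined_fixed))
```

Which form is preferable depends on whether the separator after the last value matters for your output.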
Some examples for using the replacement fields:
In [72]:
from math import pi, e
print("Just using the replacement fields:\n{0}, {1}, {1}, {1}, {2}, {2}, {2}\n".format(2, pi, e))
print("The number in each field is the index of the argument in the format() call.\n")
print("Just using the replacement fields, with a different order of the printed variables:\n\
{2}, {0}, {2}, {1}, {0}, {2}, {1}\n".format(2, pi, e))
print("You don't need the variable number specifier if you use the order of the variables in the format")
print("{}, {}, {}, {}\n".format(pi, e, 314, pi/e))
print("Using d, f, e, and g format specifiers:\n{0:d}, {1:.2f}, {1:.4f}, {1:.2e}, {2:.5e}, {2:.3g}, {2:.5g}\n".format(2, pi, e))
print("Using d, f, e, and g format specifiers with field width:\n\
{0:5d}, {1:8.2f}, {1:8.4f}, {1:10.2e}, {2:10.5e}, {2:10.3g}, {2:10.5g}\n".format(2, pi, e))
print('d format is integer (whole number), with field width specified\n\
{0:d}, {0:4d}, {0:10d}\n'.format(314))
print('f format is floating point with field width and decimals specified\n\
{0:10.0f}, {0:10.2f}, {0:10.6f}\n'.format(pi))
print('e format is floating scientific form with field width and decimals specified\n\
{0:10.0e}, {0:10.2e}, {0:10.6e}\n'.format(pi))
print('g format is floating general form with field width and significant digits specified\n\
{0:10.0g}, {0:10.2g}, {0:10.6g}\n'.format(pi))
print('You can combine alignment with the specified field width\n\
{0:>10.0g}, {0:<10.2g}, {0:<10.6g}\n'.format(pi))
print('Pad integers with leading zeros:\n\
{0:4d}, {0:04d}, {0:10d}, {0:010d}\n'.format(314))
print('You don\'t even need the `d` when printing integers:\n\
{0:4}, {0:04}, {0:10}, {0:010}\n'.format(314))
print('The most general replacement is with strings, using s-format:\n\
{0:s}, {0:10s}, {0:<10s}, {0:>10s}\n\
you may also here drop the letter s of the format:\n\
{0}, {0:10}, {0:<10}, {0:>10}'.format('Hello!'))
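Field widths and alignment are mostly useful for lining up columns. The following is a small sketch of a text table built with the specifiers shown above; the data in `rows` and the column layout are assumptions made for this illustration.

```python
from math import pi, e

# Hypothetical data: a name and a value per row.
rows = [("pi", pi), ("e", e), ("thousand", 1000.0)]

# Left-align the name in 10 characters, right-align the numbers.
header = "{:<10s}{:>12s}{:>14s}".format("name", "fixed", "scientific")
lines = [header, "-" * len(header)]
for name, value in rows:
    lines.append("{:<10s}{:>12.4f}{:>14.3e}".format(name, value, value))

print("\n".join(lines))
```

Because every field has a fixed width, all lines have the same length and the columns line up.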
Just one more compound example of using +, * and replacement.
First construct the string:
In [34]:
s1 = "ok, according to {}, this is '" + s[8:20] + s[34:41] + "'?"
print(s1.format('you'))
Use slicing with a negative step size to get a reversed copy of the string:
In [75]:
print(s[::-1])
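Reversing with a negative step can be combined with other string tools. As a sketch, here is a simple palindrome check; the example sentence and the variable names are chosen just for this illustration.

```python
text = "A man a plan a canal Panama"

# Remove the spaces and ignore case before comparing.
letters = "".join(text.lower().split())

# A palindrome reads the same forwards and backwards.
is_palindrome = letters == letters[::-1]

# Slices also take a positive step or negative start indices.
every_other = text[::2]     # every second character
last_six = text[-6:]        # the last six characters

print(letters, is_palindrome)
print(every_other)
print(last_six)
```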
The string class has a number of useful and important methods associated with it, which can be inspected in the notebook by typing a dot immediately after the string and pressing the Tab key.
In [84]:
s1.upper?
In [86]:
# dir(s1) to see all the attributes of s1 (in fact of the class str)
[k for k in dir(s1) if not k.startswith('_')] # use this comprehension to see only the public ones
Out[86]:
Some examples:
In [188]:
s = "This is a string with Upper and lower case characters"
print("=" * 80)
print("Below are the results of applying almost all methods of str:\n")
print(s.capitalize()) # make first character uppercase and the rest lower case
print(s.casefold()) # returns s suitable for caseless comparisons
print(s.center(70, '#')) # return s centered in a string of width 70, padded with '#'
print('How often does the letter `a` occur in s? ', s.count('a')) # number of times the given substring is found in s
print('Does s end with `ters` ? ', s.endswith('ters'))
print("Does s start with `This is a` ?", s.startswith('This is a'))
print("\tThis\tis\ta\tstring\twith\ttabs\tinstead\tof\ta\tspace.\t\t".expandtabs()) # return a copy where all tab characters are expanded using spaces
print('The word `Upper` is found at position {}'.format(s.find('Upper'))) # Return the lowest index in S where substring sub is found
print('The word `Upper` is found at position {}'.format(s.index('Upper', 10, -1))) # like s.find() but raises ValueError when the substring is not found
print(s.split(' ')) # splits s at the specified character, a space in this case; yields a list of strings
print('^'.join(s.split(' '))) # joins a list of strings, putting the specified character between the words
print(s.lower()) # lower case copy of s
print(s.upper()) # upper case copy of s
print(s.replace('characters', 'alphanumeric symbols'))
print(s.title(), ' (All words now capitalized)') # returns string with all words capitalized
print(s.title().swapcase(), ' (Case first titled and then swapped)') # title-cased string with the case of every character swapped
s2 = " \t string with\twhitespace\t "
print("'" + s2 + "'", "(String with whitespace, (with tabs and spaces))")
print("'" + s2.strip() + "'", "(Left and right whitespace removed)") # whitespace removed
print("'" + s2.lstrip() + "'", '(Left whitespace removed)') # left whitespace removed
print("'" + s2.rstrip() + "'", "(Right whitespace removed)") # right whitespace removed
print("'" + s2.ljust(40, '=') + "'", "(str is left justified and padded with '=')")
print("'" + s2.rjust(40, '+') + "'", "(str is right justified and padded with '+')")
s2 = " 'a a a a a"
print('First index of `{}` in `{}` is {}'.format('a', s2, s2.find('a')))
print('Last index of `{}` in `{}` is {}'.format('a', s2, s2.rfind('a')))
s3 = 'This/is/the/day and That\\was\\yesterday'
print(s3.partition('/')) # split at the first occurrence of the separator; returns (part before, separator, part after)
print('First part is `{}`, separator is `{}` and last part is `{}`'.format(*s3.partition('/')))
print(s3.rpartition('\\')) # like partition, but splits at the last occurrence of the separator
print('First part is `{}`, separator is `{}` and last part is `{}`'.format(*s3.rpartition('\\')))
# format_map is like format but takes its values from a dict whose keys are used in the replacement fields
print()
print("Using format_map, replace keys in {} by values from dict:\n")
horse={'name' : 'duke',
'age' : 2,
'color': 'brown',
'likes' : 'hay'}
print()
pprint(horse, width=40)
print()
print('My {color} horse named {name} is {age} years old and especially likes {likes} on Sundays'.format_map(horse))
print()
print("This is about all on the methods of str.")
print("=" * 80)
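The methods above become most useful in combination, for instance when cleaning up lines of text read from a file. The following is a sketch only: the lines in `raw_lines` stand in for a file's contents, and the "key = value" layout is an assumption made for this example.

```python
# Hypothetical lines, as they might come from a small settings file.
raw_lines = [
    "  name = Duke  ",
    "age=2",
    "# a comment line",
    "",
    "likes = hay = good",
]

settings = {}
for line in raw_lines:
    line = line.strip()                      # drop surrounding whitespace
    if not line or line.startswith("#"):     # skip blank lines and comments
        continue
    key, sep, value = line.partition("=")    # split at the FIRST '=' only
    if not sep:                              # no '=' found: not a key/value line
        continue
    settings[key.strip().lower()] = value.strip()

print(settings)

# The resulting dict can be fed straight into format_map.
template = "{name} is {age} years old and likes {likes}"
sentence = template.format_map(settings)
print(sentence)
```

Using partition() instead of split() keeps any further '=' characters inside the value, as the last line shows.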