A "string" is a sequence of characters of arbitrary length. Strings are immutable - they cannot be changed once created. Any operation that appears to modify a string actually returns a new string and leaves the original untouched.
In [2]:
s1 = 'Godzilla'
print s1, s1.upper(), s1
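To make the immutability point concrete, here is a minimal sketch. The names assignment_failed and shouted are just illustrative; trying to change a character in place fails, while upper() hands back a new string.

```python
s1 = 'Godzilla'

# Item assignment is not allowed on a string; Python raises TypeError.
try:
    s1[0] = 'M'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# upper() returns a new string object; s1 itself is unchanged.
shouted = s1.upper()

print(assignment_failed)
print(shouted)
print(s1)
```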
A "literal" is essentially a string constant, already spelled out for you in the code. Python may display a string with either single or double quotes on output, but that's just a display convention.
In [3]:
"Godzilla"
Out[3]:
Generally, a string literal can be in single ('), double ("), or triple (''') quotes. Single and double quotes are equivalent - use whichever you prefer (but be consistent). If you need to have a single or double quote in your literal, surround your literal with the other type, or use the backslash to escape the quote.
In [4]:
"Godzilla's a kaiju."
Out[4]:
In [5]:
'Godzilla\'s a kaiju.'
Out[5]:
In [6]:
'We call him... "Godzilla".'
Out[6]:
Triple quotes are a special form of quoting that can span several lines. They are most often used for documenting your Python files (docstrings). We won't discuss that type here.
Raw strings (written with an r prefix) turn off escape character interpretation. Use them when you have a complicated string that you don't want to clutter with lots of backslashes. When Python displays such a string, it puts the extra backslashes in for you.
In [7]:
print('This is a\ncomplicated string with newline escapes in it.')
In [8]:
print(r'This is a\ncomplicated string with newline escapes in it.')
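A small sketch of what the r prefix changes, using two short illustrative strings (plain and raw are made-up names): the same characters typed in the source produce strings of different lengths.

```python
plain = 'a\nb'   # backslash-n is interpreted as one newline character
raw = r'a\nb'    # the backslash and the n are kept as two characters

print(len(plain))      # 3
print(len(raw))        # 4
print(repr(raw))       # the repr shows the backslash doubled
print(raw == 'a\\nb')  # True: same string, written without the r prefix
```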
In [12]:
x=int('122', 3)
x+1
Out[12]:
String objects are just the string variables you create in Python.
In [13]:
kaiju = 'Godzilla'
print(kaiju)
In [14]:
kaiju
Out[14]:
Note that the print() call shows no quotes, while entering the bare variable name does. That is a Python output convention. Just entering the name displays the result of the repr() function, which shows the value the way Python would need to read it in as source code, not the way the user wants to see it.
In [15]:
repr(kaiju)
Out[15]:
In [16]:
print(repr(kaiju))
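The difference between the two forms is easier to see with a string that contains an escape. This is only a sketch with a made-up variable, two_kaiju. Note that eval() is used here purely to show that the repr() text reads back in as the same string.

```python
two_kaiju = 'Godzilla\nMothra'

print(two_kaiju)        # printed as two lines, no quotes
print(repr(two_kaiju))  # one line, with quotes and a visible \n

# The repr() text is valid Python source for the same string.
round_trip = eval(repr(two_kaiju))
print(round_trip == two_kaiju)
```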
When you read text from a file, it's just that - text. No matter what the data represents, it's still text. To use it as a number, you have to explicitly convert it to a number.
In [17]:
one = 1
two = '2'
print one, two, one + two
In [18]:
one = 1
two = int('2')
print one, two, one + two
In [19]:
num1 = 1.1
num2 = float('2.2')
print num1, num2, num1 + num2
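A typical case is a line of numbers read from a data file. The sketch below assumes a hypothetical line of whitespace-separated values and uses split(), which is covered further on, to break it into pieces before converting each piece.

```python
line = '1.5 2.5 3.0'                  # hypothetical line of text from a file
fields = line.split()                 # still strings, not numbers
values = [float(f) for f in fields]   # explicit conversion, one field at a time
total = sum(values)

print(fields)
print(values)
print(total)
```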
You can also do this with hexadecimal and octal numbers, or any other base, for that matter.
In [20]:
print int('FF', 16)
print int('0xff', 16)
print int('777', 8)
print int('0777', 8)
print int('222', 7)
print int('110111001', 2)
If the conversion cannot be done, an exception (ValueError) is raised.
In [21]:
print int('0xGG', 16)
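If you would rather handle a bad value than have the program stop, you can catch the exception. A minimal sketch, where to_int is a hypothetical helper and not something built into Python:

```python
def to_int(text, base=10):
    """Hypothetical helper: return int(text, base), or None if it fails."""
    try:
        return int(text, base)
    except ValueError:
        # int() raises ValueError when the text is not valid in that base
        return None

good = to_int('ff', 16)
bad = to_int('0xGG', 16)

print(good)
print(bad)
```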
In [22]:
kaiju1 = 'Godzilla'
kaiju2 = 'Mothra'
kaiju1 + ' versus ' + kaiju2
Out[22]:
In [23]:
'Run away! ' * 3
Out[23]:
NOTE: This particular statement is false regardless of how the statement is evaluated! :^)
In [24]:
'Godzilla' in 'Godzilla vs Gamera'
Out[24]:
In [25]:
len(kaiju)
Out[25]:
Remember - methods are functions attached to objects, accessed via the 'dot' notation.
In [26]:
kaiju.capitalize()
Out[26]:
In [27]:
kaiju.lower()
Out[27]:
In [28]:
kaiju.upper()
Out[28]:
In [29]:
kaiju.swapcase()
Out[29]:
In [30]:
'godzilla, king of the monsters'.title()
Out[30]:
In [31]:
kaiju.center(20, '*')
Out[31]:
In [32]:
kaiju.ljust(20, '*')
Out[32]:
In [33]:
kaiju.rjust(20, '*')
Out[33]:
In [34]:
tabbed_kaiju = '\tGodzilla'
print('[' + tabbed_kaiju + ']')
In [35]:
print('[' + tabbed_kaiju.expandtabs(16) + ']')
In [36]:
' vs '.join(['Godzilla', 'Hedorah'])
Out[36]:
In [37]:
','.join(['Godzilla', 'Mothra', 'King Ghidorah'])
Out[37]:
In [38]:
' Godzilla '.strip()
Out[38]:
In [39]:
'xxxGodzillayyy'.strip('xy')
Out[39]:
In [40]:
' Godzilla '.lstrip()
Out[40]:
In [41]:
' Godzilla '.rstrip()
Out[41]:
In [42]:
battle = 'Godzilla x Gigan'
battle.partition(' x ')
Out[42]:
In [43]:
battle = 'Godzilla and Jet Jaguar vs. Gigan and Megalon'
battle.partition(' vs. ')
Out[43]:
In [44]:
battle = 'Godzilla vs Megalon vs Jet Jaguar'
battle.partition('vs')
Out[44]:
In [45]:
battle = 'Godzilla vs Megalon vs Jet Jaguar'
battle.rpartition('vs')
Out[45]:
In [46]:
battle = 'Godzilla vs Mothra'
battle.replace('Mothra', 'Anguiras')
Out[46]:
In [47]:
battle = 'Godzilla vs a monster and another monster'
battle.replace('monster', 'kaiju', 2)
Out[47]:
In [48]:
battle = 'Godzilla vs a monster and another monster and yet another monster'
battle.replace('monster', 'kaiju', 2)
Out[48]:
In [49]:
battle = 'Godzilla vs King Ghidorah vs Mothra'
battle.split(' vs ')
Out[49]:
In [51]:
kaijus = 'Godzilla,Mothra,King Ghidorah'
kaijus.split(',')
Out[51]:
In [52]:
kaijus = 'Godzilla Mothra King Ghidorah'
kaijus.split()
Out[52]:
In [53]:
kaijus = 'Godzilla,Mothra,King Ghidorah,Megalon'
kaijus.rsplit(',', 2)
Out[53]:
In [54]:
kaijus_in_lines = 'Godzilla\nMothra\nKing Ghidorah\nEbirah'
print(kaijus_in_lines)
In [55]:
kaijus_in_lines.splitlines()
Out[55]:
In [56]:
kaijus_in_lines.splitlines(True)
Out[56]:
In [57]:
age_of_Godzilla = 60
age_string = str(age_of_Godzilla)
print(age_string, age_string.zfill(5))
In [58]:
print('Godzilla'.isalnum())
print('*Godzilla*'.isalnum())
print('Godzilla123'.isalnum())
In [59]:
print('Godzilla'.isalpha())
print('Godzilla123'.isalpha())
In [60]:
print('Godzilla'.isdigit())
print('60'.isdigit())
In [61]:
print('SpaceGodzilla'.isspace())
print(' '.isspace())
In [62]:
print('Godzilla'.islower())
print('godzilla'.islower())
In [63]:
print('Godzilla'.isupper())
print('GODZILLA'.isupper())
In [64]:
print('Godzilla vs Mothra'.istitle())
print('Godzilla X Mothra'.istitle())
In [65]:
monsters = 'Godzilla and Space Godzilla and MechaGodzilla'
print 'There are ', monsters.count('Godzilla'), ' Godzillas.'
print 'There are ', monsters.count('Godzilla', len('Godzilla')), ' pseudo-Godzillas.'
In [66]:
king_kaiju = 'Godzilla'
print king_kaiju.startswith('God')
print king_kaiju.endswith('lla')
print king_kaiju.startswith('G')
print king_kaiju.endswith('amera')
In [67]:
kaiju_string = 'Godzilla,Gamera,Gorgo,Space Godzilla'
print 'The first Godz is at position', kaiju_string.find('Godz')
print 'The second Godz is at position', kaiju_string.find('Godz', len('Godz'))
In [42]:
kaiju_string.index('Minilla')
In [44]:
kaiju_string.rindex('Godzilla')
Out[44]:
The encode() and decode() methods are used to convert strings to/from Unicode and other systems. They are rarely used in science code.
String formatting with the % operator is similar to formatting in C, FORTRAN, etc. There is a lot more to this than I am showing here.
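For the curious, a conversion of this kind might look like the sketch below. It assumes the methods in question are encode() and decode() and that UTF-8 is the target encoding; the sample text is "Gojira" written in three katakana characters.

```python
text = u'\u30b4\u30b8\u30e9'        # three Unicode characters
encoded = text.encode('utf-8')      # byte form, e.g. for writing to a file
decoded = encoded.decode('utf-8')   # back to a Unicode string

print(len(text))        # number of characters
print(len(encoded))     # number of bytes; each of these characters takes 3
print(decoded == text)
```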
In [111]:
kaiju = 'Godzilla'
age = 60
print '%s is %d years old.' % (kaiju, age)
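A few more of the C-style format codes, as a rough sketch. The variable names are illustrative. The last line uses the format() method, which is a different mechanism from the % operator and is shown only for comparison.

```python
kaiju = 'Godzilla'
age = 60

padded = '[%-10s]' % kaiju       # left-justified in a field 10 wide
fixed = '[%8.2f]' % 3.14159      # field 8 wide, 2 digits after the point
zeros = '%05d' % age             # zero-padded to 5 digits
both = '%s is %d years old.' % (kaiju, age)

# The same sentence built with the format() method instead of %.
alt = '{0} is {1} years old.'.format(kaiju, age)

print(padded)
print(fixed)
print(zeros)
print(both == alt)
```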
The string module is the Python equivalent of "junk DNA" in living organisms. It's been around since the beginning, but many of its functions have been superseded by evolution. But some ancient code still relies on it, so they leave the old parts in....
For modern code, the string module does have some useful constants and functions.
In [68]:
import string
In [69]:
print string.ascii_letters
print string.ascii_lowercase
print string.ascii_uppercase
In [70]:
print string.digits
print string.hexdigits
print string.octdigits
In [71]:
print string.letters
print string.lowercase
print string.uppercase
In [72]:
print string.printable
print string.punctuation
print string.whitespace
The string module also provides the Formatter class, which can be useful for sophisticated text formatting.
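As a small taste, here is a sketch that uses Formatter directly. In everyday code the format() method on strings does the same job; the Formatter class matters mostly when you want to subclass it and customize the behavior.

```python
import string

formatter = string.Formatter()

# Positional fields, numbered from zero.
sentence = formatter.format('{0} is {1} years old.', 'Godzilla', 60)

# A format spec: center the value in 20 columns, padded with '*'.
banner = formatter.format('{0:*^20}', 'Godzilla')

print(sentence)
print(banner)
```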
Regular expressions ('regexps') are essentially a mini-language for describing string operations. Everything shown above with string methods and operators can be done with regular expressions. Most of the time, the regular expression version is more concise. But not always more readable....
To use regular expressions, you have to import the 're' module.
In [73]:
import re
In [3]:
kaiju_truth = 'Godzilla is the King of the Monsters. Ebirah is also a monster, but looks like a giant lobster.'
re.findall('Godz', kaiju_truth)
Out[3]:
In [18]:
print re.findall('(^.+) is the King', kaiju_truth)
For simple searches like this, using the in operator is typically easier. Regexps are by default case-sensitive.
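A quick sketch comparing a plain membership test with re.search(), including the re.IGNORECASE flag to switch off case sensitivity. The string is the same kaiju_truth used in the cells here.

```python
import re

kaiju_truth = ('Godzilla is the King of the Monsters. '
               'Ebirah is also a monster, but looks like a giant lobster.')

found_plain = 'Ebirah' in kaiju_truth      # simple substring test
found_lower = 'ebirah' in kaiju_truth      # fails: the case differs

# re.search() returns a match object, or None when nothing matches.
strict = re.search('ebirah', kaiju_truth)
relaxed = re.search('ebirah', kaiju_truth, re.IGNORECASE)

print(found_plain)
print(found_lower)
print(strict is None)
print(relaxed.group(0))
```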
In [21]:
print re.findall(r'\. (.+) is also', kaiju_truth)
In [39]:
print re.findall('(.+) is also a (.+)', kaiju_truth)[0]
print re.findall(r'\. (.+) is also a (.+),', kaiju_truth)[0]
In [10]:
some_kaiju = 'Godzilla, Space Godzilla, Mechagodzilla'
print re.sub('Godzilla', 'Gamera', some_kaiju)
print re.sub('(?i)Godzilla', 'Gamera', some_kaiju)
You could spend a whole day (or more) just learning about regular expressions. But they are incredibly useful and powerful, especially in the all-too-frequent drudgery of munging files from one format to another.
Regular expressions can be compiled once with re.compile() and then reused, which saves time when the same pattern is applied many times.
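A minimal sketch of compiling a pattern with re.compile() and reusing it. The name godzilla_re is just an illustration; the compiled object offers the same findall() and sub() operations as the re module functions.

```python
import re

# Compile once; the flag makes the pattern case-insensitive.
godzilla_re = re.compile('godzilla', re.IGNORECASE)

some_kaiju = 'Godzilla, Space Godzilla, Mechagodzilla'

matches = godzilla_re.findall(some_kaiju)
replaced = godzilla_re.sub('Gamera', some_kaiju)

print(matches)
print(replaced)
```

The speed gain only matters when a pattern is used many times, for example once per line of a large file.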