Python Strings


In [1]:
# this is an empty string
empty_str = ''

In [2]:
# create a string
str1 = '   the quick brown fox jumps over the lazy dog. '
str1


Out[2]:
'   the quick brown fox jumps over the lazy dog. '

In [3]:
# strip whitespace from the beginning and end of the string (str1 itself is unchanged)
str2=str1.strip()
print str2
print str1


the quick brown fox jumps over the lazy dog.
   the quick brown fox jumps over the lazy dog. 
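
Note: strip() trims both ends. The related lstrip() and rstrip() trim only one side, and all three accept an optional set of characters to remove. A minimal sketch, with illustrative variable names that do not appear in the cells above:

```python
# illustrative names; only the sample sentence comes from the cells above
padded = '   the quick brown fox jumps over the lazy dog. '

left_trimmed = padded.lstrip()    # drop leading whitespace only
right_trimmed = padded.rstrip()   # drop trailing whitespace only
no_period = padded.strip(' .')    # drop leading/trailing spaces and periods

print(repr(left_trimmed))
print(repr(right_trimmed))
print(repr(no_period))
```

Each call returns a new string; padded keeps its original value.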

In [4]:
# this capitalizes the 1st letter of the string
str2.capitalize()


Out[4]:
'The quick brown fox jumps over the lazy dog.'

In [6]:
# count the number of occurrences of the character 'o'
str2.count('o')


Out[6]:
4

In [7]:
# check if a string ends with a certain character
str2.endswith('.')


Out[7]:
True

In [10]:
# check if a substring exists in the string
'jum' in str2


Out[10]:
True

In [9]:
# find the index of the first occurrence of a substring
str2.find('fox')


Out[9]:
16

In [11]:
# let's see what character is at index 19
str2[19]


Out[11]:
' '
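
Note: find() and index() both locate a substring but differ when it is missing. The sketch below is a small illustration; the variable names are made up for this example:

```python
sentence = 'the quick brown fox jumps over the lazy dog.'

missing = sentence.find('cat')      # find() signals "not found" with -1

try:
    sentence.index('cat')           # index() raises ValueError instead
    raised = False
except ValueError:
    raised = True

start = sentence.find('fox')
word = sentence[start:start + len('fox')]   # slice the match back out

print(missing)
print(raised)
print(word)
```

Because -1 is also a valid index (the last character), it is usually worth checking the result of find() before using it to index or slice.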

Note: strings are immutable, while lists are mutable. In other words, an immutable object cannot be modified in place.

"Also notice in the prior examples that we were not changing the original string with any of the operations we ran on it. Every string operation is defined to produce a new string as its result, because strings are immutable in Python—they cannot be changed in place after they are created. In other words, you can never overwrite the values of immutable objects. For example, you can’t change a string by assigning to one of its positions, but you can always build a new one and assign it to the same name. Because Python cleans up old objects as you go (as you’ll see later), this isn’t as inefficient as it may sound:"

In [12]:
S = 'shrubbery'
S[1]='c'


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-7d433c7a58ff> in <module>()
      1 S = 'shrubbery'
----> 2 S[1]='c'

TypeError: 'str' object does not support item assignment


In [13]:
S = 'shrubbery' 
print "length of string is : ", len(S)
L = list(S)
print "L = ", L


length of string is :  9
L =  ['s', 'h', 'r', 'u', 'b', 'b', 'e', 'r', 'y']

In [14]:
S[0]


Out[14]:
's'

In [15]:
L[1] = 'c'
print L
S1 = ''.join(L)
print 'this is S1:', S1
print S


['s', 'c', 'r', 'u', 'b', 'b', 'e', 'r', 'y']
this is S1: scrubbery
shrubbery

In [16]:
for x in S:
    print x


s
h
r
u
b
b
e
r
y

In [17]:
# another way of changing the string 
S = S[0] + 'c' + S[2:] # string concatenation
S


Out[17]:
'scrubbery'
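
Note: the slice-and-concatenate approach can be wrapped in a small helper. replace_at is a hypothetical function written for this note, not a built-in string method:

```python
def replace_at(text, index, new_char):
    """Return a copy of text with the character at index replaced."""
    if not 0 <= index < len(text):
        raise IndexError('index out of range')
    # everything before index + replacement + everything after index
    return text[:index] + new_char + text[index + 1:]

original = 'shrubbery'
changed = replace_at(original, 1, 'c')

print(changed)
print(original)   # the original string is untouched
```

The helper builds a new string each time, which is consistent with strings being immutable.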

In [18]:
# split a string on a delimiter into a list of substrings
line = 'aaa,bbb,cccc c,dd'
line1 = line.split(',') 
print line
print line1


aaa,bbb,cccc c,dd
['aaa', 'bbb', 'cccc c', 'dd']

In [21]:
# list the methods and attributes for string operations
dir(S)


Out[21]:
['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__getslice__',
 '__gt__',
 '__hash__',
 '__init__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '_formatter_field_name_split',
 '_formatter_parser',
 'capitalize',
 'center',
 'count',
 'decode',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'index',
 'isalnum',
 'isalpha',
 'isdigit',
 'islower',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']

In [20]:
help(S.split)


Help on built-in function split:

split(...)
    S.split([sep [,maxsplit]]) -> list of strings
    
    Return a list of the words in the string S, using sep as the
    delimiter string.  If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are removed
    from the result.
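
Note: the help text above describes three behaviors that are easy to mix up: the optional maxsplit argument, the whitespace default, and how an explicit separator treats empty fields. A short sketch with made-up sample strings:

```python
line = 'aaa,bbb,cccc c,dd'

first_two = line.split(',', 1)            # at most one split -> two pieces
words = '  so   much depends  '.split()   # no sep: runs of whitespace collapse
fields = 'a,,b'.split(',')                # explicit sep keeps empty strings
rejoined = ','.join(line.split(','))      # join() reverses split()

print(first_two)
print(words)
print(fields)
print(rejoined)
```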


In [ ]:
# integer character code of the first character
ord(S[0])
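
Note: ord() maps a one-character string to its integer code, and chr() goes the other way. A minimal sketch; the variable names are illustrative:

```python
word = 'shrubbery'

code = ord(word[0])           # integer code of 's'
back = chr(code)              # chr() is the inverse of ord()
next_letter = chr(code + 1)   # the character one code later

print(code)
print(back)
print(next_letter)
```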

Formatting


In [22]:
# define variables
x = 3.1415926
y = 1

In [23]:
# 2 decimal places 
print "{:.2f}".format(x)


3.14

In [24]:
# 2 decimal places with sign
print "{:+.2f}".format(x)


+3.14

In [25]:
# 2 decimal places; a negative number shows its minus sign by default
print "{:.2f}".format(-y)


-1.00

In [26]:
# print with no decimal places (the value is rounded)
print "{:.0f}".format(3.51)


4

In [27]:
# left padded with 0's - width 4
print "{:0>4d}".format(11)


0011

In [28]:
for i in range(20):
    print "{:0>4d}".format(i)


0000
0001
0002
0003
0004
0005
0006
0007
0008
0009
0010
0011
0012
0013
0014
0015
0016
0017
0018
0019
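
Note: zero padding can also be done with the string methods zfill() and rjust(), and the width inside a format spec can itself be a replacement field. A small comparison, assuming a hypothetical width variable:

```python
number = 11

with_format = "{:0>4d}".format(number)    # format-spec padding, as above
with_zfill = str(number).zfill(4)         # str method: pad with zeros on the left
with_rjust = str(number).rjust(4, '0')    # right-justify with a fill character

width = 6                                 # hypothetical width chosen at run time
dynamic = "{:0>{w}d}".format(number, w=width)

print(with_format)
print(with_zfill)
print(with_rjust)
print(dynamic)
```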

In [29]:
# right pad with x's - total width 4
print "{:x<4d}".format(33)


33xx

In [30]:
# right pad with x's - total width 4
print y
print "{:x<4d}".format(10*y)


1
10xx

In [31]:
# insert a comma separator
print "{:,}".format(10000000000000)


10,000,000,000,000

In [32]:
# % format
print "{:.4%}".format(0.1235676)


12.3568%

In [33]:
# exponent notation
print "{:.3e}".format(10000000000000)


1.000e+13

In [34]:
# right justified, width 10
print '1234567890' # for place holders
print "{:10d}".format(10000000)


1234567890
  10000000

In [35]:
# left justified, width 10 (two fields side by side)
print '12345678901234567890' # place holder
print "{:<10d}".format(100), "{:<10d}".format(100)


12345678901234567890
100        100       

In [36]:
# center justified, width 10
print '1234567890'
print "{:^10d}".format(100)


1234567890
   100    
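
Note: the alignment specifiers combine naturally when printing columns. The sketch below lines up a few made-up rows; the data and the column widths are assumptions chosen for illustration:

```python
# hypothetical rows: (name, count, price)
rows = [('apples', 3, 0.5), ('kiwis', 12, 0.25), ('plums', 100, 1.25)]

# left-aligned name (width 10), right-aligned count (5), right-aligned price (8)
row_format = "{:<10}{:>5d}{:>8.2f}"

lines = [row_format.format(name, count, price) for name, count, price in rows]
for text in lines:
    print(text)
```

Every formatted row has the same total width, so the columns stay aligned regardless of the values.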

In [37]:
# string substitution
s1 = 'so much depends upon {}'.format('a red wheel barrow')
s2 = 'glazed with {} water beside the {} chickens'.format('rain', 'white')
print s1
print s2


so much depends upon a red wheel barrow
glazed with rain water beside the white chickens

In [38]:
# another substitution
s1 = " {0} is better than {1} ".format("emacs", "vim")
s2 = " {1} is better than {0} ".format("emacs", "vim")
print s1
print s2


 emacs is better than vim 
 vim is better than emacs 

In [39]:
# define a reusable format: email_f is the bound format method of the template string
email_f = "Your email address was {email}".format
print email_f
# use it elsewhere
var1 = "bob@example.com"
var2 = 'a@cox.net'
var3 = 'b@cox.net'
print(email_f(email=var1))
print(email_f(email=var2))
print(email_f(email=var3))


<built-in method format of str object at 0x1061caa08>
Your email address was bob@example.com
Your email address was a@cox.net
Your email address was b@cox.net
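
Note: named fields can also be filled from a dictionary by unpacking it with **. A minimal sketch; the template and the record below are invented for this example:

```python
template = "Your email address was {email} (user: {name})"

# hypothetical record; the keys must match the field names in the template
record = {'email': 'bob@example.com', 'name': 'bob'}

message = template.format(**record)
print(message)
```

A missing key raises KeyError, so the dictionary needs an entry for every named field.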