Introduction to Python and Natural Language Technologies

Type system and built-in types

Lecture 02

13 September 2017

PEP8, the Python style guide

widely accepted style guide for Python
PEP8 by Guido himself, 2001

Specifies:

indentation
line length
module imports
class names, function names etc.

We shall use PEP8 throughout this course.

Type system

Dynamic type system

no need to declare variables
the = operator binds a reference to any arbitrary object



In [1]:

    
i = 2
type(i), id(i)









    Out[1]:





(int, 139967066462624)



In [2]:

    
i = "foo"
type(i), id(i)









    Out[2]:





(str, 139967000318952)

Strongly typed

most implicit conversions are disallowed.
conversions between numeric types are OK:



In [3]:

    
i = 2
f = 1.2
s = i + f
print(type(i), type(f))
print(type(i + f))
print(s)









    



<class 'int'> <class 'float'>
<class 'float'>
3.2

conversions between numeric types and string are not allowed



In [4]:

    
# print("I am " + 20 + " years old")

we must explicitely cast it:



In [5]:

    
print("I am " + str(20) + " years old")









    



I am 20 years old

Note that many other languages like Javascript allow implicit casting:



In [6]:

    
%%javascript

element.text("I am " + 20 + " years old")



In [7]:

    
%%javascript

element.text("1" + 1)



In [8]:

    
%%javascript

element.text(1 + "1")



In [9]:

    
%%perl

print "1" + 1

Built-in types and operators

boolean operators

three boolean operators: and, or and not



In [10]:

    
x = 2

x < 2 or x >= 2









    Out[10]:





True



In [11]:

    
x > 0 and x < 10









    Out[11]:





True



In [12]:

    
not x < 0









    Out[12]:





True

boolean type

two boolean values: True and False (must be capitalized)



In [13]:

    
x = True
type(x)









    Out[13]:





bool



In [14]:

    
True and False









    Out[14]:





False



In [15]:

    
True or False









    Out[15]:





True



In [16]:

    
not True









    Out[16]:





False

Numeric types

three numeric types: int, float and complex
an object's type is derived from its initial value



In [17]:

    
i = 2
f = 1.2
c = 1+2j

type(i), type(f), type(c)









    Out[17]:





(int, float, complex)

implicit conversion between numeric types is supported in arithmetic operations
the resulting type is the one with less data loss



In [18]:

    
c2 = i + c
print(c2, type(c2))









    



(3+2j) <class 'complex'>

Precision and range

integers have unlimited precision
different from Python2, where integers are usually implemented using C's long type



In [19]:

    
%%python2

import sys
import math

max_i = sys.maxint
print(max_i, math.log(max_i, 2))









    



(9223372036854775807, 63.0)

in Python2 numbers are automatically converted to Python's long type when they exceed maxint



In [20]:

    
%%python2
import sys

max_i = sys.maxint
print(type(max_i), type(max_i+1))









    



(<type 'int'>, <type 'long'>)

Python3



In [21]:

    
type(2**63 + 1)









    Out[21]:





int

float

floats are usually implemented using C's double.
complex numbers use two floats for their real and imaginary parts
check sys.float_info for more information



In [22]:

    
import sys
sys.float_info









    Out[22]:





sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)



In [23]:

    
sys.int_info









    Out[23]:





sys.int_info(bits_per_digit=30, sizeof_digit=4)

Arithmetic operators

addition, subtraction and product work the same as in C/C++



In [24]:

    
i = 2
f = 4.2
c = 4.1-3j

s1 = i + f
s2 = f - c
s3 = i * c
print(s1, type(s1))
print(s2, type(s2))
print(s3, type(s3))









    



6.2 <class 'float'>
(0.10000000000000053+3j) <class 'complex'>
(8.2-6j) <class 'complex'>

quotient operator

Python 2 vs. 3 difference

in Python3 operator/ computes the float quotient even if the operands are both integers



In [25]:

    
-3 / 2









    Out[25]:





-1.5

in Python2 the quotient is integer if both operands are integers
the result is the floor of the quotient



In [26]:

    
%%python2

q = -3 / 2
print(q, type(q))









    



(-2, <type 'int'>)

this is not the case if one or both of the operands are float



In [27]:

    
%%python2

q = -3.0 / 2
print(q, type(q))









    



(-1.5, <type 'float'>)

Explicit floor quotient operator



In [28]:

    
-3.0 // 2









    Out[28]:





-2.0

Comparison operators



In [29]:

    
x = 23
x < 24









    Out[29]:





True



In [30]:

    
x >= 22









    Out[30]:





True

operators can be chained



In [31]:

    
23 < x < 100









    Out[31]:





False



In [32]:

    
23 <= x < 100









    Out[32]:





True

Other operators for numeric types

remainder



In [33]:

    
5 % 3









    Out[33]:





2

power



In [34]:

    
2 ** 3









    Out[34]:





8



In [35]:

    
2 ** 0.5









    Out[35]:





1.4142135623730951

absolute value



In [36]:

    
abs(-2 - 1j)









    Out[36]:





2.23606797749979

round



In [37]:

    
round(2.3456), round(2.3456, 2)









    Out[37]:





(2, 2.35)

Explicit conversions between numeric types



In [38]:

    
float(2)









    Out[38]:





2.0



In [39]:

    
int(2.5)









    Out[39]:





2

`math` and `cmath`

additional operations
cmath is for complex numbers



In [40]:

    
import math

math.log(16), math.log(16, 2), math.exp(2), \
math.exp(math.log(10))









    Out[40]:





(2.772588722239781, 4.0, 7.38905609893065, 10.000000000000002)

Mutable vs. immutable types

instances of mutable types can be modified in place
immutable objects have the same value during their lifetime

are numeric types mutable or immutable?



In [41]:

    
x = 2
old_id = id(x)
x += 1
print(id(x) == old_id)









    



False

How about booleans?



In [42]:

    
x = True
y = False
print(id(x) == id(y))
x = False
print(id(x) == id(y))









    



False
True

There is only one True and one False object.

How about lists?



In [43]:

    
l1 = [0, 1]
old_id = id(l1)
l1.append(2)
old_id == id(l1)









    Out[43]:





True

Lists are mutable.

Sequence types

all sequences support the following basic operations

operation	behaviour
`x in s`	True if an item of s is equal to x, else False
`x not in s`	False if an item of s is equal to x, else True
`s + t`	the concatenation of s and t
`s * n or n * s`	equivalent to adding s to itself n times
`s[i]`	ith item of s, origin 0
`s[i:j]`	slice of s from i to j
`s[i:j:k]`	slice of s from i to j with step k
`len(s)`	length of s
`min(s)`	smallest item of s
`max(s)`	largest item of s
`s.index(x[, i[, j]])`	index of the first occurrence of x in s (at or after index i and before index j)
`s.count(x)`	total number of occurrences of x in s

Table source

list

mutable sequence type



In [44]:

    
l = [1, 2, 2, 3]
l









    Out[44]:





[1, 2, 2, 3]



In [45]:

    
l[1]









    Out[45]:





2



In [46]:

    
# l[4]  # raises IndexError



In [47]:

    
l[-1]









    Out[47]:





3

Advanced indexing, ranges



In [48]:

    
l = []
for i in range(20):
    l.append(2*i + 1)
l[:10]









    Out[48]:





[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]



In [49]:

    
l[-4:]









    Out[49]:





[33, 35, 37, 39]



In [50]:

    
i = 2
l[i:i+3]









    Out[50]:





[5, 7, 9]



In [51]:

    
l[2:10:3]









    Out[51]:





[5, 11, 17]

Q. What is the output?



In [52]:

    
l[::-1]









    Out[52]:





[39, 37, 35, 33, 31, 29, 27, 25, 23, 21, 19, 17, 15, 13, 11, 9, 7, 5, 3, 1]

Lists are mutable, elements can be added or changed:



In [53]:

    
l = []
l.append(1)
l.append(2)
l.append(2)
l









    Out[53]:





[1, 2, 2]



In [54]:

    
l[1] = 12



In [55]:

    
l.extend([3, 4, 5])
len(l), l









    Out[55]:





(6, [1, 12, 2, 3, 4, 5])

elements don't need to be of the same type



In [56]:

    
l = [1, -1, "foo", 2, "bar"]
l









    Out[56]:





[1, -1, 'foo', 2, 'bar']

lists can be traversed with for loops



In [57]:

    
for element in l:
    print(element)









    



1
-1
foo
2
bar

enumerate

if we need the indices too, the built-in enumerate function iterates over index-element pairs



In [58]:

    
for i, element in enumerate(l):
    print(i, element)









    



0 1
1 -1
2 foo
3 2
4 bar

List sorting

lists can be sorted using the built-in sorted function



In [59]:

    
l = [3, -1, 2, 11]

for e in sorted(l):
    print(e)

the sorting key can be specified using the key argument



In [60]:

    
shopping_list = [
    ["apple", 5],
    ["pear", 2],
    ["milk", 1],
    ["bread", 3],
]

for product in sorted(shopping_list, key=lambda x: -x[1]):
    print(product)









    



['apple', 5]
['bread', 3]
['pear', 2]
['milk', 1]

tuple

tuple is an immutable sequence
it may be empty



In [61]:

    
t = ()
print(type(t), len(t))

t = ([1, 2, 3], "foo")
type(t), len(t)









    



<class 'tuple'> 0






    Out[61]:





(tuple, 2)



In [62]:

    
t









    Out[62]:





([1, 2, 3], 'foo')

tuples can be indexed the same way as lists



In [63]:

    
t[1], t[-1]









    Out[63]:





('foo', 'foo')

tuples contain immutable references, however, the objects may be mutable



In [64]:

    
t = ([1, 2, 3], "foo")
# t[0]= "bar"  # this raises a TypeError



In [65]:

    
for e in t:
    print(id(e))
    
print("\nChanging an element of t[0]\n")
t[0][1] = 11

for e in t:
    print(id(e))

print("\n", t)









    



139966880293064
139967000318952

Changing an element of t[0]

139966880293064
139967000318952

 ([1, 11, 3], 'foo')

dictionary

basic and only built-in map type
maps keys to values



In [66]:

    
d = {}  # empty dictionary  same as d = dict()
d["apple"] = 12
d["plum"] = 2
d









    Out[66]:





{'apple': 12, 'plum': 2}

equivalent to



In [67]:

    
d = {"apple": 12, "plum": 2}
d









    Out[67]:





{'apple': 12, 'plum': 2}

removing keys



In [68]:

    
del d["apple"]
d









    Out[68]:





{'plum': 2}

iterating dictionaries

keys and values can be iterated separately or together
iterating keys



In [69]:

    
d = {"apple": 12, "plum": 2}
for key in d.keys():
    print(key, d[key])









    



apple 12
plum 2

iterating values



In [70]:

    
for value in d.values():
    print(value)

iterating both



In [71]:

    
for key, value in d.items():
    print(key, value)









    



apple 12
plum 2

Under the hood

uses hash table (same as C++'s std::unordered_map)
- constraints on key values: they must be hashable i.e. they cannot be or contain mutable objects
- keys can be mixed type



In [72]:

    
d = {}
d[1] = "a"  # numeric types are immutable
d[3+2j] = "b"
d["c"] = 1.0
d









    Out[72]:





{1: 'a', (3+2j): 'b', 'c': 1.0}

tuples are immutable too



In [73]:

    
d[("apple", 1)] = -2
d









    Out[73]:





{1: 'a', (3+2j): 'b', 'c': 1.0, ('apple', 1): -2}

however lists are not



In [74]:

    
# d[["apple", 1]] = 12  # raises TypeError

Q. Can these be dictionary keys?



In [75]:

    
key1 = (2, (3, 4))
key2 = (2, [], (3, 4))

# d = {}
# d[key1] = 1
# d[key2] = 2

set

collection of unique, hashable elements
implements basic set operations (intersection, union, difference)



In [76]:

    
s = set()
s.add(2)
s.add(3)
s.add(2)
s









    Out[76]:





{2, 3}



In [77]:

    
s = {2, 3, 2}
type(s), s









    Out[77]:





(set, {2, 3})

deleting elements



In [78]:

    
s.add(2)
s.remove(2)
# s.remove(2)  # raises KeyError, since we already removed this element
s.discard(2)  # removes if present, does not raise exception

frozenset

immutable counterpart of set



In [79]:

    
fs = frozenset([1, 2])
# fs.add(2)  # raises AttributeError



In [80]:

    
fs = frozenset([1, 2])
s = {1, 2}

d = dict()
d[fs] = 1
# d[s] = 2  # raises TypeError
d









    Out[80]:





{frozenset({1, 2}): 1}

set operations

implemented as
1. methods
2. overloaded operators



In [81]:

    
s1 = {1, 2, 3, 4, 5}
s2 = {2, 5, 6, 7}

s1 & s2  # s1.intersection(s2) or s2.intersection(s1)









    Out[81]:





{2, 5}



In [82]:

    
s1 | s2  # s1.union(s2) OR s2.union(s1)









    Out[82]:





{1, 2, 3, 4, 5, 6, 7}



In [83]:

    
s1 - s2, s2 - s1  # s1.difference(s2), s2.difference(s1)









    Out[83]:





({1, 3, 4}, {6, 7})

these operations return new sets



In [84]:

    
s3 = s1 & s2
type(s3), id(s3) == id(s1), id(s3) == id(s2)









    Out[84]:





(set, False, False)

`issubset` and `issuperset`



In [85]:

    
s1 < s2, s1.issubset(s2)









    Out[85]:





(False, False)



In [86]:

    
{1, 2} < s1









    Out[86]:





True



In [87]:

    
s1.issuperset({1, 6})









    Out[87]:





False

useful set properties

creating a set is a convenient way of getting the unique elements of a sequence



In [88]:

    
l = [1, 2, 3, -1, 1, 2, 1, 0]
uniq = set(l)
uniq









    Out[88]:





{-1, 0, 1, 2, 3}

sets and dictionaries provide O(1) lookup
in contrast lists provide O(n) lookup



In [143]:

    
import random

n = 10000

l = list(range(n))
random.shuffle(l)
s = set(l)
len(s), len(l)









    Out[143]:





(10000, 10000)



In [150]:

    
{2, -1, 12, "aa", "bb"}









    Out[150]:





{2, 12, 'bb', 'aa', -1}



In [152]:

    
{"aa", "cc", 2, "bb"}









    Out[152]:





{'aa', 2, 'cc', 'bb'}



In [90]:

    
%%timeit

2 in l









    



144 µs ± 4.23 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)



In [91]:

    
%%timeit

2 in s









    



50.8 ns ± 6.08 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Functions with variable number of arguments

recall that functions may take positional and keyword arguments



In [92]:

    
def foo(arg1, arg2, arg3):
    print(arg1, arg2, arg3)
    
foo(1, 2, 3)
foo(1, arg3=2, arg2=3)

arguments may have default values starting from the rightmost argument



In [93]:

    
def foo(arg1, arg2, arg3=12):
    print(arg1, arg2, arg3)
    
foo(-1, -4)

`args` and `kwargs`

both positional and keyword arguments can be captured in arbitrary numbers using the * and ** operators
positional arguments are captured in a tuple



In [94]:

    
def arbitrary_positional_f(*args):
    print(type(args))
    for arg in args:
        print(arg)
        
arbitrary_positional_f(1, 2, -1)
# arbitrary_positional_f(1, 2, arg=-1)  # raises TypeError









    



<class 'tuple'>
1
2
-1

keyword arguments are captured in a dictionary



In [95]:

    
def arbitrary_keyword_f(**kwargs):
    print(type(kwargs))
    for argname, value in kwargs.items():
        print(argname, value)
        
arbitrary_keyword_f(arg1=1, arg2=12)
# arbitrary_keyword_f(12, arg=12)  # TypeError









    



<class 'dict'>
arg1 1
arg2 12

we usually capture both



In [96]:

    
def arbitrary_arg_f(*args, **kwargs):
    if args:
        print("Positional arguments")
        for arg in args:
            print(arg)
    else:
        print("No positional arguments")
    if kwargs:
        print("Keyword arguments")
        for argname, value in kwargs.items():
            print(argname, value)
    else:
        print("No keyword arguments")
        
arbitrary_arg_f()
arbitrary_arg_f(12, -2, param1="foo")









    



No positional arguments
No keyword arguments
Positional arguments
12
-2
Keyword arguments
param1 foo

Mutable default arguments

be careful with mutable default arguments



In [97]:

    
def insert_value(value, l=[]):
    l.append(value)
    print(l)
    
l1 = []
insert_value(12, l1)
l2 = []
insert_value(14, l2)









    



[12]
[14]



In [98]:

    
insert_value(-1)









    



[-1]



In [99]:

    
insert_value(-3)









    



[-1, -3]

best not to use mutable defaults

One solution is to create a new list inside a function:



In [100]:

    
def insert_value(value, l=None):
    if l is None:
        l = []
    l.append(value)
    return l

l = insert_value(2)
l









    Out[100]:





[2]



In [101]:

    
insert_value(12)









    Out[101]:





[12]

Lambda expressions

unnamed functions
may take parameters
can access local scope



In [102]:

    
l = [-1, 0, -10, 2, 3]

Let's sort this list by absolute value. The built-in sorted's default behavior is not enough right now:



In [103]:

    
for e in sorted(l):
    print(e)

but we can specify the key to use



In [104]:

    
for e in sorted(l, key=lambda x : abs(x)):
    print(e)

we can use any callable as the key



In [105]:

    
for e in sorted(l, key=abs):
    print(e)

Strings

strings are immutable sequences of Unicode code points
strings can be constructed in various ways



In [106]:

    
single = 'abc'
double = "abc"
single == double









    Out[106]:





True

immutability makes it impossible to change a string (unlike C-style string or C++'s std::string)



In [107]:

    
s = "abc"

# s[1] = "c"  # TypeError

all string operations create new objects



In [108]:

    
print(id(s))
s += "def"
id(s)









    



139967073665128






    Out[108]:





139966880316808

strings support most operations available for lists
- advanced indexing



In [109]:

    
s = "abcdefghijkl"
s[::2]









    Out[109]:





'acegik'

Character encodings - Unicode

Unicode provides a mapping from letters to code points or numbers

character	Unicode code point
a	U+0061
ő	U+0151
ش	U+0634
گ	U+06AF
¿	U+00BF
ư	U+01B0
Ң	U+04A2
⛵	U+26F5

actual text needs to be stored as a byte array/sequence (byte strings)
character encoding: code point - byte array correspondence

encoding: Unicode code point $\rightarrow$ byte sequence
decoding: byte sequence $\rightarrow$ Unicode code point
most popular encoding: UTF-8

character	Unicode code point	UTF-8 byte sequence
a	U+0061	61
ő	U+0151	C5 91
ش	U+0634	D8 B4
گ	U+06AF	DA AF
¿	U+00BF	C2 BF
ư	U+01B0	C6 B0
Ң	U+04A2	D2 A2
⛵	U+26F5	E2 9B B5

Python3 automatically encodes Unicode strings when:

writing to file
printing
any kind of operation that requires byte string conversion



In [110]:

    
s = "ábc"
print(type(s))

with open("file.txt", "w") as f:
    f.write(s)
    f.write("\n")









    



<class 'str'>

and automatically decodes byte sequnces when reading from file:



In [111]:

    
with open("file.txt") as f:
    text = f.read().strip()
    
print(text)
type(text)









    



ábc






    Out[111]:





str

Python 2 does not do this automatically



In [112]:

    
%%python2

with open("file.txt") as f:
    text = f.read().strip()
    
print(text)
print(text[0], len(text), type(text))









    



ábc
('\xc3', 4, <type 'str'>)

Python 2's str type is a byte string, different from Python 3's str which is a Unicode string
Python 2 has a separate unicode type for Unicode strings
we need to manually decode the byte string

Manual string decoding in Python 2



In [113]:

    
%%python2

with open("file.txt") as f:
    text = f.read().decode('utf8').strip()
    
print(type(text), len(text))
print(text[0].encode('utf8'))









    



(<type 'unicode'>, 3)
á

we also need to manually encode the text



In [114]:

    
%%python2

with open("file.txt") as f:
    text = f.read().decode('utf8').strip()
    
print(text[0].encode('utf8'))
# print(text)  # raise UnicodeEncodeError - can you interpret the error message?
print(text.encode('utf8'))









    



á
ábc

`bytes` in Python 3

Python 3 has a different type called bytes for byte strings
one way to get a byte string is to encode a Unicode string



In [115]:

    
unicode_string = "ábc"
utf8_string = unicode_string.encode("utf8")
latin2_string = unicode_string.encode("latin2")

type(unicode_string), type(utf8_string), type(latin2_string)









    Out[115]:





(str, bytes, bytes)



In [116]:

    
len(unicode_string), len(utf8_string), len(latin2_string)









    Out[116]:





(3, 4, 3)

String operations

most sequence operations are supported
large variety of basic string manipulation: lower, upper, title



In [117]:

    
"abC".upper(), "ABC".lower(), "abc".title()









    Out[117]:





('ABC', 'abc', 'Abc')



In [118]:

    
s = "\tabc  \n"
print("<START>" + s + "<STOP>")









    



<START>	abc  
<STOP>



In [119]:

    
s.strip()









    Out[119]:





'abc'



In [120]:

    
s.rstrip()









    Out[120]:





'\tabc'



In [121]:

    
s.lstrip()









    Out[121]:





'abc  \n'



In [122]:

    
"abca".strip("a")









    Out[122]:





'bc'

since each function returns a new string, they can be chained after another



In [123]:

    
" abcd abc".strip().rstrip("c").lstrip("ab")









    Out[123]:





'cd ab'

Binary predicates (i.e. yes-no questions)



In [124]:

    
"abc".startswith("ab"), "abc".endswith("cd")









    Out[124]:





(True, False)



In [125]:

    
"abc".istitle(), "Abc".istitle()









    Out[125]:





(False, True)



In [126]:

    
"  \t\n".isspace()









    Out[126]:





True



In [127]:

    
"989".isdigit()









    Out[127]:





True

split and join



In [128]:

    
s = "the quick brown fox jumps over the lazy dog"
words = s.split()
words









    Out[128]:





['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']



In [129]:

    
s = "R.E.M."
s.split(".")









    Out[129]:





['R', 'E', 'M', '']



In [130]:

    
"-".join(words)









    Out[130]:





'the-quick-brown-fox-jumps-over-the-lazy-dog'

use explicit token separators



In [131]:

    
" <W> ".join(words)









    Out[131]:





'the <W> quick <W> brown <W> fox <W> jumps <W> over <W> the <W> lazy <W> dog'

Q. Compute the word frequencies in a Wikipedia article.



In [132]:

    
import urllib.request
wp_url = "https://en.wikipedia.org/wiki/Budapest"
text = urllib.request.urlopen(wp_url).read()
text = text.decode('utf8')



In [133]:

    
words = text.split()
len(words), len(set(words))









    Out[133]:





(45827, 16575)



In [134]:

    
word_freq = {}
for word in words:
    if word not in word_freq:
        word_freq[word] = 1
    else:
        word_freq[word] += 1



In [ ]:



In [135]:

    
for word, freq in sorted(word_freq.items(), key=lambda x: -x[1])[:20]:
    print(word, freq)









    



<a 1356
the 1209
of 1063
and 708
in 661
<span 594
<div 459
<li 421
class="external 336
text" 326
rel="nofollow" 323
class="reference"><a 314
<li><a 313
</tr> 288
</div> 287
– 262
<td 245
class="Z3988"><span 243
is 240
class="citation 240

various other ways for counting word frequencies here

String formatting

Python features several string formatting options.

`str.format`

non-str objects are automatically cast to str
- under the hood: the object's __format__ method is called if it exists, otherwise its __str__ is called



In [2]:

    
name = "John"
age = 25

print("My name is {0} and I'm {1} years old. I turned {1} last December".format(name, age))
print("My name is {} and I'm {} years old.".format(name, age))
# print("My name is {} and I'm {} years old. I turned {} last December".format(name, age))  # raises IndexError
print("My name is {name} and I'm {age} years old. I turned {age} last December".format(
    name=name, age=age))









    



My name is John and I'm 25 years old. I turned 25 last December
My name is John and I'm 25 years old.
My name is John and I'm 25 years old. I turned 25 last December

Format specification mini language

% operator

note that the arguments need to be parenthesized (make it a tuple)



In [3]:

    
print("My name is %s and I'm %d years old" % (name, age))









    



My name is John and I'm 25 years old

string interpolation

f-strings were added in Python 3.6 in PEP498



In [7]:

    
import sys

age2 = 12

if sys.version_info >= (3, 6):
    print(f"My name is {name} and I'm {age} years old {age2}")









    



My name is John and I'm 25 years old 12

Mandatory reading

A4 page printable PEP8 cheat sheet
introduction to character encodings by Joel Spolsky, co-founder of Stack Overflow

Reference

Official documentation of built-in types



In [ ]:

Introduction to Python and Natural Language Technologies

Type system and built-in types

Lecture 02

PEP8, the Python style guide

Type system

Dynamic type system

Strongly typed

Built-in types and operators

boolean operators

boolean type

Numeric types

Precision and range

Python3

float

Arithmetic operators

quotient operator

Python 2 vs. 3 difference

Explicit floor quotient operator

Comparison operators

operators can be chained

Other operators for numeric types

remainder

power

absolute value

round

Explicit conversions between numeric types

math and cmath

Mutable vs. immutable types

Sequence types

list

list

Advanced indexing, ranges

Q. What is the output?

elements don't need to be of the same type

lists can be traversed with for loops

enumerate

List sorting

tuple

tuple

dictionary

dictionary

removing keys

iterating dictionaries

Under the hood

Q. Can these be dictionary keys?

set

set

deleting elements

frozenset

set operations

issubset and issuperset

useful set properties

Functions with variable number of arguments

args and kwargs

Mutable default arguments

Lambda expressions

Lambda expressions

Strings

Strings

Character encodings - Unicode

Python 2 does not do this automatically

Manual string decoding in Python 2

bytes in Python 3

String operations

Binary predicates (i.e. yes-no questions)

split and join

Q. Compute the word frequencies in a Wikipedia article.

String formatting

str.format

% operator

string interpolation

Mandatory reading

Suggested reading

Reference

`math` and `cmath`

`issubset` and `issuperset`

`args` and `kwargs`

`bytes` in Python 3

`str.format`