Ch4 Writing Structured Programs

Assignments



In [33]:

    
a = list('hello')  # a refers to a list object
b = a              # b refers to the same list object as a
b[3] = 'x'         # change the element at index 3; there is only one object, so a and b both see the change
a, b









    Out[33]:





(['h', 'e', 'l', 'x', 'o'], ['h', 'e', 'l', 'x', 'o'])



In [36]:

    
a = ['maybe']
b = [a, a, a]
b









    Out[36]:





[['maybe'], ['maybe'], ['maybe']]



In [37]:

    
a[0] = 'will'
b









    Out[37]:





[['will'], ['will'], ['will']]

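Note: to copy a list, use a slice copy such as a[:]; plain assignment copies only the reference. The slice copy is shallow, so any nested lists are still shared.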



In [45]:

    
a = ['play']
b = a[:]
a[0] = 'zero'
a, b









    Out[45]:





(['zero'], ['play'])
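
The slice copy above creates a new outer list only. As a minimal sketch (the variable names nested, shallow and deep are illustrative, and the code is written so it should run on both Python 2 and Python 3), the standard copy module can be used when the inner lists must be independent as well:

```python
import copy

nested = [['play'], ['rest']]
shallow = nested[:]            # new outer list, same inner lists
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0][0] = 'zero'          # mutate an inner list through the original

print(shallow[0])  # ['zero']: the inner list is shared
print(deep[0])     # ['play']: the deep copy is independent
```

Whether a shallow copy is enough depends on whether the elements themselves are mutable.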



In [48]:

    
a = ['play']
b = [a, a]
a[0] = 'what'
a, b, id(a), id(b[0])









    Out[48]:





(['what'], [['what'], ['what']], 65567816L, 65567816L)

Equality

== tests whether two values are equal.
is tests whether two names refer to the same object.



In [52]:

    
a is b[0], a is b[1]









    Out[52]:





(True, True)



In [54]:

    
b = a[:]
# b is a copy, so the values are equal but the objects are different
a is b, a == b









    Out[54]:





(False, True)

Conditions

Using a list as the condition of an if statement tests whether the list is non-empty; it is equivalent to if len(list) > 0:.



In [55]:

    
e = []
if e: print e, " is not empty"



In [62]:

    
e = []
if not e: print e, " is empty"









    



[]  is empty

any() tests whether at least one element of a list is true, all() tests whether every element is true, and in tests whether a value occurs in the list.



In [65]:

    
a = [0, 1, 2, 3, 4, 5]
any(a), all(a), 3 in a, 8 in a









    Out[65]:





(True, False, True, False)
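
any() and all() are often combined with a generator expression so that the test is applied to each element. The following is a small sketch with a made-up word list; it also shows the results for an empty sequence, which are easy to get wrong:

```python
words = ['cat', 'horse', 'ox']

has_long = any(len(w) > 4 for w in words)    # True: 'horse' has 5 letters
all_short = all(len(w) < 4 for w in words)   # False: 'horse' is too long

# On an empty sequence: any() finds no true element, all() finds no false one.
empty_any = any([])   # False
empty_all = all([])   # True

print([has_long, all_short, empty_any, empty_all])
```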

Sequences

The most common operation on a sequence is visiting each element with for.



In [75]:

    
a = [3, 3, 2, 4, 1]
[item for item in a]  # original order









    Out[75]:





[3, 3, 2, 4, 1]



In [76]:

    
[item for item in sorted(a)]  # sorted order









    Out[76]:





[1, 2, 3, 3, 4]



In [77]:

    
[item for item in set(a)]  # unique elements only (a set has no guaranteed order)









    Out[77]:





[1, 2, 3, 4]



In [78]:

    
[item for item in reversed(a)]  # reverse order









    Out[78]:





[1, 4, 2, 3, 3]



In [79]:

    
[item for item in set(a).difference([3,4])]  # exclude some elements









    Out[79]:





[1, 2]



In [88]:

    
import random
random.shuffle(a)  # shuffle reorders a in place
[item for item in a]









    Out[88]:





[4, 2, 3, 1, 3]



In [90]:

    
''.join(['hello', 'world'])  # join concatenates the strings in a list









    Out[90]:





'helloworld'

Tuple assignment can replace several elements at once.



In [93]:

    
a = [1, 2, 3, 4, 5]
(a[2], a[3], a[4]) = (5, 6, 7)
a









    Out[93]:





[1, 2, 5, 6, 7]

zip combines several lists into a list of tuples, pairing up the elements at each position.



In [97]:

    
a = range(5)
b = range(5, 10)
zip(a, b, a, b)









    Out[97]:





[(0, 5, 0, 5), (1, 6, 1, 6), (2, 7, 2, 7), (3, 8, 3, 8), (4, 9, 4, 9)]



In [101]:

    
list(enumerate(b))  # enumerate yields (index, b[index]) pairs









    Out[101]:





[(0, 5), (1, 6), (2, 7), (3, 8), (4, 9)]
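
Two details of zip and enumerate are worth a short sketch: zip stops at the shortest input, and zip(*pairs) reverses the pairing. The word and tag lists below are invented for illustration, and list() is wrapped around zip so the code should behave the same on Python 2 and Python 3:

```python
words = ['the', 'cat', 'sat']
tags = ['DET', 'NOUN', 'VERB', 'EXTRA']

pairs = list(zip(words, tags))   # 'EXTRA' is dropped: zip stops at the shorter list
unzipped = list(zip(*pairs))     # back to one tuple of words and one tuple of tags

# enumerate is usually unpacked directly in a loop or comprehension
numbered = ['%d:%s' % (i, w) for i, w in enumerate(words)]

print(pairs)      # [('the', 'DET'), ('cat', 'NOUN'), ('sat', 'VERB')]
print(unzipped)   # [('the', 'cat', 'sat'), ('DET', 'NOUN', 'VERB')]
print(numbered)   # ['0:the', '1:cat', '2:sat']
```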



In [105]:

    
a = [5, 3, 2, 4, 1]
a.sort()  # .sort() modifies the original list in place
a









    Out[105]:





[1, 2, 3, 4, 5]



In [108]:

    
a = [5, 3, 2, 4, 1]
sorted(a), a  # sorted() leaves the original list unchanged









    Out[108]:





([1, 2, 3, 4, 5], [5, 3, 2, 4, 1])

Ways to repeat elements



In [109]:

    
'hello' * 3









    Out[109]:





'hellohellohello'



In [110]:

    
['hello'] * 3









    Out[110]:





['hello', 'hello', 'hello']



In [111]:

    
[['a'] * 3] * 2









    Out[111]:





[['a', 'a', 'a'], ['a', 'a', 'a']]
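
The nested form in the last cell repeats a reference to one inner list rather than copying it, which ties back to the Assignments section. A minimal sketch (grid and safe are illustrative names) of the pitfall and of a list comprehension that avoids it:

```python
grid = [['a'] * 3] * 2        # the outer list holds the same inner list twice
grid[0][0] = 'x'              # so the change shows up in both rows
print(grid)                   # [['x', 'a', 'a'], ['x', 'a', 'a']]

safe = [['a'] * 3 for _ in range(2)]   # builds a fresh inner list for each row
safe[0][0] = 'x'
print(safe)                   # [['x', 'a', 'a'], ['a', 'a', 'a']]
```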

Function Inputs and Outputs

When designing a function, note that if it modifies its input argument, it is best not to also return a value; doing both confuses callers.

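def sort1(a):   # OK: modifies the input, returns nothing
    a.sort()
def sort2(a):   # OK: leaves the input alone, returns a new list
    return sorted(a)
def sort3(a):   # BAD: modifies the input and also returns it; someone will get this wrong
    a.sort()
    return a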

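All function arguments are passed by value. Note, however, that when the argument is a list, the value passed is a reference to the list object (its id), so the function receives the same mutable list and can modify it.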



In [116]:

    
def func1(a):
    a[0] = 'modified'
s = ['hello', 'world']
func1(s)
s









    Out[116]:





['modified', 'world']
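
The cell above mutates the list it receives. Rebinding the parameter name inside the function is different: it only changes what the local name refers to. A small sketch, with the hypothetical function names mutate and rebind:

```python
def mutate(items):
    items[0] = 'modified'      # changes the caller's list object

def rebind(items):
    items = ['brand', 'new']   # rebinds the local name only; the caller is unaffected

s = ['hello', 'world']
mutate(s)
after_mutate = list(s)         # snapshot: ['modified', 'world']
rebind(s)
print(s)                       # ['modified', 'world']
```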

Variable Scope

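Python follows the LGB rule: a name is looked up first in the local scope, then in the global scope, then among the built-ins. (When functions are nested, the enclosing function's scope is searched between local and global.)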

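A function can create or rebind a global variable with the global keyword, but this should be used as little as possible because it makes the function harder to reuse.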
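
As a rough illustration (counter, bump_global and bump are made-up names), compare a function that relies on global with one that takes and returns a value:

```python
counter = 0

def bump_global():
    global counter      # without this line, counter += 1 would raise UnboundLocalError
    counter += 1

def bump(value):
    return value + 1    # no hidden state, so it is easier to test and reuse

bump_global()
print(counter)          # 1
print(bump(41))         # 42
```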

Check Variable Type

Type checks are usually written with assert(cond) combined with isinstance. When its condition is False, assert raises an AssertionError.



In [122]:

    
a = 'hello'
assert(isinstance(a, basestring))  # passes



In [123]:

    
a = 3
assert(isinstance(a, basestring))  # fails









    



---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-123-e9fc08b19c2d> in <module>()
      1 a = 3
----> 2 assert(isinstance(a, basestring))  # fails

AssertionError:
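
Assertions are skipped when Python runs with the -O option, so a check that must always run is often written as an explicit test instead. The following is a sketch only: shout is a hypothetical function, and it tests against str rather than basestring because basestring does not exist on Python 3.

```python
def shout(text):
    # Explicit check: still enforced when assertions are disabled with -O.
    if not isinstance(text, str):
        raise TypeError('text must be a string, got %s' % type(text).__name__)
    return text.upper() + '!'

print(shout('hello'))    # HELLO!
try:
    shout(3)
except TypeError as err:
    print(err)           # text must be a string, got int
```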

Documenting Functions



In [124]:

    
def hello(a):
    """
    This is a hello function.
    
    The only function is print hello world.
    
    @param a: a string to be printed
    @type a: C{basestring}
    @rtype: C{float}
    """
    print 'hello world', a
    return(3.14)

hello('my dear')









    



hello world my dear






    Out[124]:





3.14



In [126]:

    
print hello.__doc__









    



    This is a hello function.
    
    The only function is print hello world.
    
    @param a: a string to be printed
    @type a: C{basestring}
    @rtype: C{float}

Lambda Expression

lambda is a way to create a small, throwaway function.



In [129]:

    
z = lambda w: w**2
z(5)









    Out[129]:





25
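
A lambda is most useful when passed directly to another function instead of being given a name. A minimal sketch using the key argument of sorted(), with an invented word list:

```python
words = ['banana', 'fig', 'cherry']

by_length = sorted(words, key=lambda w: len(w))   # ties keep their original order
by_last = sorted(words, key=lambda w: w[-1])      # sort on the final letter

print(by_length)   # ['fig', 'banana', 'cherry']
print(by_last)     # ['banana', 'fig', 'cherry']
```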

Named Arguments



In [131]:

    
def generic(*a, **b):
    print a   # collects all unnamed (positional) arguments
    print b   # collects all named (keyword) arguments

generic(1, 3.5, 'money', zzz='maybe', ggg='good')









    



(1, 3.5, 'money')
{'ggg': 'good', 'zzz': 'maybe'}



In [134]:

    
def func(*a, z):
    print a, z   # *a already collects every unnamed argument, so Python 2 rejects the z that follows it
func('hi', 'this')









    



  File "<ipython-input-134-6cdcb28d14ce>", line 1
    def func(*a, z):
                 ^
SyntaxError: invalid syntax
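
One way around the error above is to declare ordinary parameters before *a. The sketch below reuses the name func and returns its arguments instead of printing them; it should run on both Python 2 and Python 3. (Python 3 also accepts the original def func(*a, z) form and treats z as a keyword-only argument.)

```python
def func(z, *a):
    # z takes the first positional argument; *a collects the rest as a tuple
    return z, a

print(func('hi', 'this'))        # ('hi', ('this',))
print(func('hi'))                # ('hi', ())
```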

Structure of a Python Module



In [137]:

    
nltk.corpus.__file__









    Out[137]:





'C:\\Users\\banyh_000\\Anaconda2\\lib\\site-packages\\nltk\\corpus\\__init__.pyc'



In [143]:

    
help(nltk.bigrams)









    



Help on function bigrams in module nltk.util:

bigrams(sequence, **kwargs)
    Return the bigrams generated from a sequence of items, as an iterator.
    For example:
    
        >>> from nltk.util import bigrams
        >>> list(bigrams([1,2,3,4,5]))
        [(1, 2), (2, 3), (3, 4), (4, 5)]
    
    Use bigrams for a list version of this function.
    
    :param sequence: the source data to be converted into bigrams
    :type sequence: sequence or iter
    :rtype: iter(tuple)

Letter Trie



In [146]:

    
def insert(trie, key, value):
    if key:
        first, rest = key[0], key[1:]
        if first not in trie:
            trie[first] = {}  # empty dict
        insert(trie[first], rest, value)  # key[1:] is new key
    else:
        trie['value'] = value
        
trie = nltk.defaultdict(dict)



In [151]:

    
insert(trie, 'chat', 100)
insert(trie, 'chair', 2000)
insert(trie, 'chien', 150)
trie









    Out[151]:





defaultdict(dict,
            {'c': {'h': {'a': {'i': {'r': {'value': 2000}},
                't': {'value': 100}},
               'i': {'e': {'n': {'value': 150}}}}}})



In [153]:

    
trie['c']['h']['a']['t']['value']









    Out[153]:





100
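
Chained indexing as above raises KeyError for a missing key. A lookup helper can walk the trie one character at a time instead. This is a sketch only: lookup and its default parameter are hypothetical additions, insert is repeated so the block is self-contained, and a plain dict replaces the defaultdict.

```python
def insert(trie, key, value):
    """Store value under key, one nested dict per character (same logic as above)."""
    if key:
        first, rest = key[0], key[1:]
        if first not in trie:
            trie[first] = {}
        insert(trie[first], rest, value)
    else:
        trie['value'] = value

def lookup(trie, key, default=None):
    """Hypothetical helper: follow key character by character, return default if absent."""
    node = trie
    for ch in key:
        if ch not in node:
            return default
        node = node[ch]
    return node.get('value', default)

trie = {}
insert(trie, 'chat', 100)
insert(trie, 'chair', 2000)

print(lookup(trie, 'chat'))    # 100
print(lookup(trie, 'cha'))     # None: a prefix of stored keys, not a key itself
print(lookup(trie, 'chien'))   # None: not inserted in this sketch
```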

Matplotlib



In [154]:

    
%matplotlib inline
import matplotlib



In [157]:

    
from matplotlib import pylab



In [1]:

    
import nltk



In [ ]:

    
nltk.ngrams()