Python async programming

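Asynchronous programming has become increasingly popular in Python. Python has many libraries for asynchronous work; one of them is asyncio, which is also a major reason async programming has caught on in Python. Before getting to the main topic, let's go over some background.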

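In most programs, statements execute one line at a time, and each line waits for the previous one to finish before it runs. This is commonly called sequential programming. What problems can this style run into? The biggest one shows up when a line takes a long time: do I really have to wait for it to finish before moving on? The most common case is an API request. The program waits for the response before continuing, even though the next thing it does may not need that response at all, so the waiting time is simply wasted. Threads are one way to solve this.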

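A process can spawn multiple threads, which lets your program do many things at once. Think of it as the Shadow Clone Technique: there is only one original, but your clones can do other work for you at the same time.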

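What does the picture above show? It looks cool! Haha. What it is really meant to show is that Naruto himself (the process) creates many clones (threads), and each clone does something different.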
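The clone analogy can be sketched with the standard threading module. This is a minimal illustration rather than production code: fake_api_request is a hypothetical helper, and time.sleep stands in for waiting on a real API response.

```python
import threading
import time

def fake_api_request(name, results):
    """Hypothetical stand-in for a slow API call; sleep simulates the network wait."""
    time.sleep(0.2)
    results[name] = "response for {}".format(name)

results = {}
start = time.perf_counter()

# Each thread plays the role of one clone working on its own task.
threads = [
    threading.Thread(target=fake_api_request, args=(name, results))
    for name in ("clone-1", "clone-2", "clone-3")
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every clone to finish

elapsed = time.perf_counter() - start
print(sorted(results))
print("elapsed: {:.2f}s".format(elapsed))
```

Because the three waits overlap, the elapsed time should be close to one 0.2 second wait rather than the 0.6 seconds a sequential version would need.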

Convenient, right? But threads come with their own problems, such as:

race condition
deadlock
resource starvation
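To make the first item concrete, here is a small sketch of a shared counter updated by several threads. The names counter and add_many are made up for this illustration. The lock is what keeps the result correct; without it, updates from different threads could overwrite each other.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        # "counter += 1" is a read-modify-write; without the lock two threads
        # could read the same old value and one update would be lost.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

Locks fix the race, but they are also where deadlocks come from: two threads that take the same two locks in opposite order can end up waiting on each other forever.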

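Setting those problems aside, threads carry another cost: CPU context switches. A single CPU core can run only one thread at a time, so behind the scenes it swaps threads in and out very quickly; this is the context switch. Is there a technique that achieves multitasking while avoiding most of the problems above, such as race conditions? There is, and it is today's topic: Python's asyncio. asyncio is built on the concept of coroutines. According to Wikipedia, a coroutine is a function whose execution can be suspended and resumed. Let's go straight to the example below!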



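In [4]:

# A regular function: runs from start to finish without interruption.
def n_hello():
    for i in range(6):
        print(i)


# A generator function: each yield suspends it and hands control to the caller.
def c_hello():
    for i in range(4):
        print('in function {}'.format(i))
        yield i


# Not used below; a generator that never finishes on its own.
def infinite_loop():
    num = 0
    while True:
        num += 1
        print(num)
        yield


n_hello()
print("=====")
c = c_hello()
next(c)  # runs c_hello up to its first yield
print("come back to main")
next(c)
next(c)  # the notebook shows this last yielded value as Out[4]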

0
1
2
3
4
5
=====
in function 0
come back to main
in function 1
in function 2

Out[4]:

2

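The above is the most basic way Python supports coroutines. The first function, n_hello, is an ordinary for loop that prints numbers; the other, c_hello, uses yield. Comparing the two makes the difference clear: yield hands execution from the subroutine back to main, and calling next again jumps back into the subroutine where it left off.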

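Doesn't this look a lot like multithreading? The behavior is broadly similar, but coroutines achieve multitasking by suspending one function and continuing another. Control changes hands only at explicit yield points, unlike multithreading, where two threads can run the same code at the same time and cause the race conditions, deadlocks and other problems mentioned earlier. I used Naruto's shadow clones as a metaphor for multithreading; for coroutines, I would pick the picture below.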

Shadow Imitation Technique: Shikamaru sends out several shadows (coroutines) and uses his own brain to control everyone's movements.
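The idea of one controller driving several coroutines can be sketched with plain generators. This is a toy round-robin scheduler written for illustration only; worker and run_round_robin are hypothetical names and this is not how asyncio itself is implemented.

```python
from collections import deque

def worker(name, steps, log):
    """A generator-based coroutine: does one step of work, then yields control."""
    for i in range(steps):
        log.append("{} step {}".format(name, i))
        yield  # hand control back to the scheduler

def run_round_robin(coroutines):
    """Toy scheduler: resume each coroutine in turn until all are finished."""
    queue = deque(coroutines)
    while queue:
        coro = queue.popleft()
        try:
            next(coro)          # resume until the next yield
            queue.append(coro)  # not finished, put it at the back of the line
        except StopIteration:
            pass                # finished, drop it

log = []
run_round_robin([worker("A", 2, log), worker("B", 3, log)])
print(log)
# ['A step 0', 'B step 0', 'A step 1', 'B step 1', 'B step 2']
```

Everything runs in a single thread, and a worker can only be interrupted at its own yield, which is why the interleaving is fully predictable.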

Next, let's look a little deeper at yield. So far, yield has passed values out of a function. Can we also pass a value from outside into the function? Yes, we can! See the example below.



In [5]:

def g(x):
    for i in range(x):
        yield i


def will_cause_exception():
    x = yield  # receives the value passed to send()
    print("wow {}".format(x))
    return x   # a generator's return value travels inside StopIteration


def infinite_send():
    while True:
        x = yield
        print("send {}".format(x))


w = will_cause_exception()
next(w)  # advance to the first yield so the generator can receive a value
try:
    w.send(5)
except StopIteration as e:
    # the function's return value is stored in the exception's value attribute
    return_value = e.value
    print(return_value)
    # re-raise so the exception shows up in the output
    raise e

wow 5
5

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-5-777b6cd91ff2> in <module>()
     26     print(return_value)
     27     # let you see exception
---> 28     raise e
     29 

<ipython-input-5-777b6cd91ff2> in <module>()
     19 next(w)
     20 try:
---> 21     w.send(5)
     22 except StopIteration as e:
     23 

StopIteration: 5

Given the usage above, you probably feel there ought to be a more convenient way. Python does provide one: PEP 380 introduced the yield from syntax.



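In [6]:

def test_yield_from():
    w = will_cause_exception()
    # yield from forwards send() to w and evaluates to w's return value,
    # so no StopIteration handling is needed here
    value = yield from w
    print("no exception {}".format(value))
    yield


t = test_yield_from()
next(t)
t.send(10)


def amazing_yield_from(x):
    yield from range(x)            # count up
    yield from range(x-1, -1, -1)  # count back down


print(list(amazing_yield_from(5)))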

wow 10
no exception 10
[0, 1, 2, 3, 4, 4, 3, 2, 1, 0]



In [8]:

%%time
import asyncio
import requests

# Legacy generator-based coroutines: @asyncio.coroutine was deprecated in
# Python 3.8 and removed in 3.11, so this cell needs an older interpreter.
@asyncio.coroutine
def aio_requests(url):
    # requests.get() is a blocking call: it does not yield to the event loop
    r = requests.get(url)
    return r

@asyncio.coroutine
def aio_response(response):
    data = response.text
    return data

urls = ['http://www.google.com', 'http://www.yandex.ru',
        'http://www.python.org', 'http://www.python.org', 'http://www.python.org']

@asyncio.coroutine
def call_url(url):
    response = yield from aio_requests(url)
    data = yield from aio_response(response)
    print('{}: {} bytes'.format(url, len(data)))
    return data

# one coroutine object per URL
futures = [call_url(url) for url in urls]

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait(futures))

http://www.python.org: 48866 bytes
http://www.python.org: 48866 bytes
http://www.python.org: 48866 bytes
http://www.yandex.ru: 83153 bytes
http://www.google.com: 11184 bytes
CPU times: user 111 ms, sys: 15.9 ms, total: 127 ms
Wall time: 3.98 s



In [9]:

%%time
# Synchronous version for comparison: one request after another.
def syn_call_url(url):
    r = requests.get(url)
    data = r.text
    print('{}: {} bytes'.format(url, len(data)))

for url in urls:
    syn_call_url(url)

http://www.google.com: 11144 bytes
http://www.yandex.ru: 82882 bytes
http://www.python.org: 48866 bytes
http://www.python.org: 48866 bytes
http://www.python.org: 48866 bytes
CPU times: user 117 ms, sys: 17.2 ms, total: 134 ms
Wall time: 4.02 s
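The two wall times above (3.98 s and 4.02 s) are almost identical because requests.get is a blocking call that never hands control back to the event loop, so the requests still run one after another. One possible way around this, sketched here under the assumption of Python 3.9 or later, is to push the blocking call onto a worker thread with asyncio.to_thread. blocking_fetch is a hypothetical stand-in that sleeps instead of making a real HTTP request, and the example URLs are placeholders.

```python
import asyncio
import time

def blocking_fetch(url):
    """Hypothetical stand-in for requests.get(url): blocks the calling thread."""
    time.sleep(0.2)
    return "body of {}".format(url)

async def fetch(url):
    # Run the blocking function in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_fetch, url)

async def main(urls):
    # gather returns the results in the same order as the inputs
    return await asyncio.gather(*(fetch(url) for url in urls))

demo_urls = ["http://a.example", "http://b.example", "http://c.example"]
start = time.perf_counter()
bodies = asyncio.run(main(demo_urls))
elapsed = time.perf_counter() - start
print(bodies)
print("elapsed: {:.2f}s".format(elapsed))
```

With the waits overlapping, the elapsed time should be close to a single 0.2 second call. An HTTP client that is itself asynchronous would avoid the threads entirely, but that is outside what this notebook uses.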



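In [ ]:

%%time
async def async_requests(url):
    # requests.get() still blocks here; only the syntax differs from In [8]
    r = requests.get(url)
    return r

async def async_response(response):
    data = response.text
    return data

async def call_url(url):
    response = await async_requests(url)
    data = await async_response(response)
    print('{}: {} bytes'.format(url, len(data)))
    return data

async def main():
    # asyncio.wait() no longer accepts bare coroutines (Python 3.11+),
    # so gather is used to run every call_url coroutine
    return await asyncio.gather(*(call_url(url) for url in urls))

# asyncio.run() cannot be called from an already running event loop;
# in that situation use "await main()" instead.
asyncio.run(main())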

Syntactic sugar

@asyncio.coroutine => async def
yield from => await
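With the native syntax, a coroutine gives up control only when it awaits something that is itself non-blocking. The following is a minimal sketch of that behaviour; fake_request is a hypothetical helper and asyncio.sleep stands in for a non-blocking network wait.

```python
import asyncio
import time

async def fake_request(name, delay):
    """Hypothetical stand-in for a non-blocking HTTP call."""
    await asyncio.sleep(delay)  # suspends this coroutine and lets others run
    return "{} done".format(name)

async def main():
    # gather schedules all three coroutines and waits for every result
    return await asyncio.gather(
        fake_request("a", 0.2),
        fake_request("b", 0.2),
        fake_request("c", 0.2),
    )

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results)  # ['a done', 'b done', 'c done']
print("elapsed: {:.2f}s".format(elapsed))
```

The three 0.2 second waits overlap, so the total should be close to 0.2 seconds instead of 0.6.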




asyncio vs thread

What makes asyncio so magical? Let's take a look.

https://www.reddit.com/r/learnpython/comments/5qwm5h/asyncio_for_dummies/dd432ke/

Go (golang) has no reentrant lock.



In [5]:

    
%%time

import time

time.sleep(1)

CPU times: user 713 µs, sys: 1.22 ms, total: 1.94 ms
Wall time: 1 s

Wall time is the real elapsed time the process took; CPU times only count the time the program actually spent running on the CPU.
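The same distinction can be measured without the %%time magic, using two standard-library clocks. This is a small sketch: time.perf_counter tracks wall-clock time, while time.process_time tracks CPU time used by the current process and does not include time spent sleeping.

```python
import time

wall_start = time.perf_counter()  # wall-clock timer
cpu_start = time.process_time()   # CPU time used by this process

time.sleep(0.3)  # sleeping waits without using the CPU

wall = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start
print("wall: {:.3f}s, cpu: {:.3f}s".format(wall, cpu))
```

The wall value should be a little over 0.3 seconds while the cpu value stays near zero. That gap is exactly the waiting time that threads or coroutines can fill with other work.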

Python's asyncio switches between coroutines by means of an event loop.
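The switching can be observed directly. In this sketch two coroutines record their steps in a shared list, and each one hands control back to the event loop with asyncio.sleep(0) after every step. The worker and main names are made up for this illustration.

```python
import asyncio

async def worker(name, steps, log):
    for i in range(steps):
        log.append("{}{}".format(name, i))
        await asyncio.sleep(0)  # yield to the event loop once

async def main():
    log = []
    # both workers run on the same event loop, in the same thread
    await asyncio.gather(worker("a", 3, log), worker("b", 3, log))
    return log

order = asyncio.run(main())
print(order)  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

Each await is a point where the event loop may pick another coroutine to resume, which produces the alternating order.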
