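1.17 Extracting a Subset of a Dictionary

You want to build a dictionary that is a subset of another dictionary.

The simplest way is a dictionary comprehension.
The same result can also be produced by building a sequence of tuples and passing it to dict().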



In [1]:

    
prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75
}
# Make a dictionary of all prices over 200
p1 = {key: value for key, value in prices.items() if value > 200}
# Make a dictionary of tech stocks
tech_names = {'AAPL', 'IBM', 'HPQ', 'MSFT'}
p2 = {key: value for key, value in prices.items() if key in tech_names}



In [2]:

    
print(p1,'\n',p2)









    



{'AAPL': 612.78, 'IBM': 205.55} 
 {'HPQ': 37.2, 'AAPL': 612.78, 'IBM': 205.55}

In most cases, whatever a dictionary comprehension can do can also be done by creating a sequence of tuples and passing it to dict():



In [4]:

    
p3 = dict((key, value) for key, value in prices.items() if value > 200)
print(p3)









    



{'AAPL': 612.78, 'IBM': 205.55}

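However, the dictionary comprehension states the intent more clearly and also runs faster (roughly twice as fast).
The second example can also be rewritten as follows: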



In [8]:

    
# Make a dictionary of tech stocks
tech_names = {'AAPL', 'IBM', 'HPQ', 'MSFT'}
p4 = {key: prices[key] for key in prices.keys() & tech_names}
print(p4)









    



{'HPQ': 37.2, 'AAPL': 612.78, 'IBM': 205.55}
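The speed difference claimed earlier between a dictionary comprehension and dict() can be checked with the standard timeit module. The following is only a minimal sketch: the helper names with_comprehension and with_dict_call are made up for illustration, and the measured ratio depends on the Python version and the machine, so no particular figure is asserted.

```python
import timeit

prices = {'ACME': 45.23, 'AAPL': 612.78, 'IBM': 205.55, 'HPQ': 37.20, 'FB': 10.75}

def with_comprehension():
    # Dictionary comprehension
    return {key: value for key, value in prices.items() if value > 200}

def with_dict_call():
    # Generator of (key, value) tuples passed to dict()
    return dict((key, value) for key, value in prices.items() if value > 200)

# Both approaches build the same dictionary
same_result = with_comprehension() == with_dict_call()

# Time each approach; absolute numbers depend on the environment
t_comp = timeit.timeit(with_comprehension, number=100000)
t_dict = timeit.timeit(with_dict_call, number=100000)
print(same_result, t_comp, t_dict)
```

Comparing t_comp with t_dict on your own interpreter is the most reliable way to see how large the gap really is.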



In [10]:

    
# p4 = {key: prices[key] for key in prices.keys() and tech_names}

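The p4 comprehension and its commented-out variant above differ only in '&' versus 'and':

'&' is the bitwise AND operator: for integers, num1 & num2 combines the binary representations of the two numbers bit by bit. For sets and dictionary key views, the same operator computes the intersection.
'and' is the logical operator: num1 and num2 evaluates to num1 if num1 is falsy, otherwise to num2 (the result is truthy only when both operands are truthy).

Here '&' selects the keys that appear in both prices.keys() and tech_names, as the cells below show, even though prices.keys() and tech_names are of different types. With 'and', the non-empty prices.keys() is truthy, so the expression simply evaluates to tech_names, and the lookup prices['MSFT'] would raise a KeyError.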



In [11]:

    
prices.keys()









    Out[11]:





dict_keys(['ACME', 'HPQ', 'AAPL', 'IBM', 'FB'])



In [12]:

    
tech_names









    Out[12]:





{'AAPL', 'HPQ', 'IBM', 'MSFT'}



In [16]:

    
type(prices.keys()) == type(tech_names)









    Out[16]:





False



In [17]:

    
prices.keys() & tech_names









    Out[17]:





{'AAPL', 'HPQ', 'IBM'}



In [18]:

    
type(prices.keys() & tech_names)









    Out[18]:





set
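The difference between '&' and 'and' discussed in this section can be reproduced in a few lines. This is a small sketch reusing the prices and tech_names data from above; the variable names common, picked, bits and missing are introduced here only for illustration.

```python
prices = {'ACME': 45.23, 'AAPL': 612.78, 'IBM': 205.55, 'HPQ': 37.20, 'FB': 10.75}
tech_names = {'AAPL', 'IBM', 'HPQ', 'MSFT'}

# & on a keys view and a set yields the set of common keys
common = prices.keys() & tech_names

# 'and' returns its second operand when the first is truthy,
# so a non-empty keys view simply gives back tech_names
picked = prices.keys() and tech_names

# On integers, & works on the binary digits: 0b1100 & 0b1010 == 0b1000
bits = 12 & 10

# Iterating over 'picked' eventually looks up 'MSFT', which is not in prices
missing = None
try:
    broken = {key: prices[key] for key in picked}
except KeyError as exc:
    missing = exc.args[0]

print(common, picked is tech_names, bits, missing)
```

This is why the commented-out variant with 'and' cannot replace the version with '&'.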

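1.18 Mapping Names to Sequence Elements

You have code that accesses list or tuple elements by position,
and you want to access them by name instead.

collections.namedtuple() solves this on top of an ordinary tuple.
It is actually a factory method that returns a subclass of the standard Python tuple type.
You pass it a type name and the fields it should have, and it returns a class that you can instantiate, passing in values for the fields you defined.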



In [1]:

    
from collections import namedtuple
Subscriber = namedtuple('Subscriber', ['addr', 'joined'])
sub = Subscriber('jonesy@exit.com', '2012-10-19')
sub









    Out[1]:





Subscriber(addr='jonesy@exit.com', joined='2012-10-19')



In [2]:

    
sub.addr









    Out[2]:





'jonesy@exit.com'



In [6]:

    
sub.joined









    Out[6]:





'2012-10-19'

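A namedtuple instance looks like an ordinary class instance, but it is interchangeable with a tuple and supports all the usual tuple operations, such as indexing and unpacking.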



In [7]:

    
len(sub)









    Out[7]:





2



In [9]:

    
addr, joined = sub
print(addr,'\n',joined)









    



jonesy@exit.com 
 2012-10-19
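The tuple behaviour described above can be verified directly. The sketch below reuses the Subscriber example; the variable names on the left-hand side are introduced only for this illustration.

```python
from collections import namedtuple

Subscriber = namedtuple('Subscriber', ['addr', 'joined'])
sub = Subscriber('jonesy@exit.com', '2012-10-19')

is_tuple = isinstance(sub, tuple)      # namedtuple classes subclass tuple
first = sub[0]                         # positional indexing still works
fields = Subscriber._fields            # field names, in declaration order
as_plain = tuple(sub)                  # convert back to a plain tuple
equal_to_plain = sub == ('jonesy@exit.com', '2012-10-19')
print(is_tuple, first, fields, as_plain, equal_to_plain)
```

Because the instance really is a tuple, it can be passed to any code that expects one.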

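The main use of namedtuple is to free your code from positional indexing. If a database call returns a large list of tuples and you access their elements by position, your code breaks as soon as a new column is added to the table. That does not happen if the returned tuples are first converted to namedtuples.

Code that uses ordinary tuples: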



In [11]:

    
def compute_cost(records):
    total = 0.0
    for rec in records:
        total += rec[1] * rec[2]
    return total

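Positional indexing usually makes the code harder to read and ties it closely to the structure of the records.
A version that uses a namedtuple removes that ambiguity: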



In [12]:

    
Stock = namedtuple('Stock',['name','shares','price'])
def compute_cost2(records):
    total = 0.0
    for rec in records:
        s = Stock(*rec)
        total += s.shares * s.price
    return total
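As a quick check, the index-based and name-based versions can be run side by side. This sketch assumes a hypothetical list of (name, shares, price) records that does not appear in the original notes, and it accumulates the total inside the loop.

```python
from collections import namedtuple

Stock = namedtuple('Stock', ['name', 'shares', 'price'])

def compute_cost(records):
    # Index-based version: relies on shares at position 1, price at position 2
    total = 0.0
    for rec in records:
        total += rec[1] * rec[2]
    return total

def compute_cost2(records):
    # Name-based version: each record is converted to a Stock first
    total = 0.0
    for rec in records:
        s = Stock(*rec)
        total += s.shares * s.price
    return total

# Hypothetical sample data, not from the original notes
records = [('ACME', 100, 2.5), ('IBM', 10, 20.0)]
print(compute_cost(records), compute_cost2(records))
```

Both calls should return the same total; only the second one documents which field is which.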

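Another use of namedtuple is as a replacement for a dictionary, which requires more memory. If you are building a very large data structure that contains dictionaries, namedtuples will be more efficient. Unlike a dictionary, however, a namedtuple is immutable: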



In [13]:

    
s = Stock('Ace',100,98.9)
s









    Out[13]:





Stock(name='Ace', shares=100, price=98.9)



In [14]:

    
s.shares









    Out[14]:





100



In [15]:

    
s.shares = 98









    



---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-15-876dc6160f87> in <module>()
----> 1 s.shares = 98

AttributeError: can't set attribute

As shown above, the assignment s.shares = 98 is not allowed.
To change an attribute, use the namedtuple instance's _replace() method, which creates an entirely new namedtuple with the given fields replaced by new values.



In [16]:

    
s2 = s._replace(shares=98)



In [17]:

    
print(s,'\n',s2)









    



Stock(name='Ace', shares=100, price=98.9) 
 Stock(name='Ace', shares=98, price=98.9)

Another handy use of _replace() is populating namedtuples that have optional or missing fields. First create a prototype tuple holding the default values, then use _replace() to create new instances with the updated values.



In [21]:

    
# Create a ST type
ST = namedtuple('ST',['name','share','price','date','time'])
# Create a prototype instance
ST_prototype = ST('', 0, 0.0, None, None)
# Function to convert a dictionary to a ST
def dict_to_ST(s):
    return ST_prototype._replace(**s)



In [22]:

    
a = ('hi',1,12,'2016-09-10','18:19:18')
dict_to_ST(a)









    



---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-4bf31e4e3306> in <module>()
      1 a = ('hi',1,12,'2016-09-10','18:19:18')
----> 2 dict_to_ST(a)

<ipython-input-21-f7d172f5ca01> in dict_to_ST(s)
      5 # Function to convert a dictionary to a ST
      6 def dict_to_ST(s):
----> 7     return ST_prototype._replace(**s)

TypeError: _replace() argument after ** must be a mapping, not tuple



In [23]:

    
a = {'name':'hi','share':1,'price':12,'date':'2016-09-10','time':'18:19:18'}
dict_to_ST(a)









    Out[23]:





ST(name='hi', share=1, price=12, date='2016-09-10', time='18:19:18')
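The prototype approach is most useful when the input dictionary leaves some fields out. The sketch below passes a partial dictionary to dict_to_ST; the sample values are hypothetical. It also shows, as an alternative that is not part of the original notes, the defaults argument that namedtuple() accepts on Python 3.7 and later (ST2 is a name made up for this illustration).

```python
from collections import namedtuple

ST = namedtuple('ST', ['name', 'share', 'price', 'date', 'time'])
ST_prototype = ST('', 0, 0.0, None, None)

def dict_to_ST(s):
    # Fields absent from s keep the prototype's default values
    return ST_prototype._replace(**s)

# Only three of the five fields are supplied
partial = dict_to_ST({'name': 'ACME', 'share': 100, 'price': 123.45})

# Alternative on Python 3.7+: let the namedtuple class hold the defaults itself
ST2 = namedtuple('ST2', ['name', 'share', 'price', 'date', 'time'],
                 defaults=('', 0, 0.0, None, None))
from_defaults = ST2(name='ACME', share=100)
print(partial, from_defaults)
```

In both cases the missing date and time fields fall back to None instead of raising an error.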

If your goal is an efficient data structure whose instance attributes are updated frequently, namedtuple is not the best choice. Consider a class that defines __slots__ instead.
Ref: chapter 8.4
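A class with __slots__ might look like the following. This is a minimal sketch and not the code from the referenced chapter; the class name MutableStock is made up for illustration.

```python
class MutableStock:
    # __slots__ lists the only attributes an instance may have,
    # which avoids a per-instance __dict__
    __slots__ = ('name', 'shares', 'price')

    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

s = MutableStock('Ace', 100, 98.9)
s.shares = 98                      # in-place update works, unlike a namedtuple
has_dict = hasattr(s, '__dict__')  # no per-instance dictionary is created

try:
    s.date = '2016-09-10'          # not listed in __slots__
except AttributeError:
    rejected = True
else:
    rejected = False
print(s.shares, has_dict, rejected)
```

The trade-off is that such a class gives up the tuple behaviour (indexing, unpacking) that a namedtuple provides.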

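1.19 Transforming and Reducing Data at the Same Time

You need to run a reduction function (sum(), min(), max()) over a data sequence, but first you need to transform or filter the data.

To combine the calculation with the transformation, pass a generator expression as the argument.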



In [1]:

    
# Compute the sum of squares
nums = [1,2,3,4,5,6]
s = sum(x * x for x in nums)



In [2]:

    
s









    Out[2]:





91



In [8]:

    
# Determine if any .py files exist in a directory
# any() returns True as soon as one .py file is found
import os
files = os.listdir(r'f:\Save\python')  # raw string keeps the backslashes literal
if any(name.endswith('.py') for name in files):
    print('There be python file!')
else:
    print('Sorry no python.')
# Output a tuple as CSV
s = ('ACME',50,123.34)
print(','.join(str(x) for x in s))
# Data reduction across fields of a data structure
portfolio = [
    {'name':'GOOG','share':50},
    {'name':'Yahoo','share':75},
    {'name':'ALO','share':20},
    {'name':'CSX','share':85}
]
min_share = min(s['share'] for s in portfolio)









    



There be python file!
ACME,50,123.34



In [7]:

    
min_share









    Out[7]:





20
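The examples so far transform the data before reducing it. Filtering works the same way, by adding an if clause to the generator expression. The sketch below reuses nums and portfolio from above; the threshold of 50 shares is an arbitrary value chosen for illustration.

```python
nums = [1, 2, 3, 4, 5, 6]

# Filter and transform in one pass: squares of the even numbers only
even_square_sum = sum(x * x for x in nums if x % 2 == 0)

portfolio = [
    {'name': 'GOOG', 'share': 50},
    {'name': 'Yahoo', 'share': 75},
    {'name': 'ALO', 'share': 20},
    {'name': 'CSX', 'share': 85},
]
# Largest holding among entries with at least 50 shares
max_large = max(s['share'] for s in portfolio if s['share'] >= 50)
# Total number of shares across the portfolio
total_shares = sum(s['share'] for s in portfolio)
print(even_square_sum, max_large, total_shares)
```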

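The examples above use the convenient syntax for passing a generator expression as the single argument to a function: the extra pair of parentheses is not needed. The two forms below are equivalent.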



In [10]:

    
s = sum((x * x for x in nums))  # pass the generator expression object explicitly
s = sum(x * x for x in nums)    # more elegant: the extra parentheses are omitted

Passing a generator expression as the argument is more efficient and more elegant than first creating a temporary list:



In [11]:

    
s = sum([x * x for x in nums])
s









    Out[11]:





91

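The version above works, but building the temporary list is slower: for a large input it creates a huge temporary data structure that is used once and then discarded.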
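The memory cost of the temporary list can be made visible with sys.getsizeof. This is a rough sketch: the input size n is arbitrary, the exact byte counts vary between Python versions, and getsizeof reports only the list object itself, not the integers it refers to.

```python
import sys

n = 100000
as_list = [x * x for x in range(n)]     # materializes every element up front
as_gen = (x * x for x in range(n))      # produces elements on demand

list_size = sys.getsizeof(as_list)      # grows with n
gen_size = sys.getsizeof(as_gen)        # small and independent of n

# Both forms feed the same values to sum()
same_sum = sum(as_list) == sum(as_gen)
print(list_size, gen_size, same_sum)
```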



In [12]:

    
# Original: returns 20
min_s1 = min(s['share'] for s in portfolio)
# Alternative: returns {'name': 'ALO', 'share': 20}
min_s2 = min(portfolio, key=lambda s:s['share'])
print(min_s1,'\n',min_s2)









    



20 
 {'name': 'ALO', 'share': 20}
