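IO and regular expressions do not really belong together. They are grouped here only because that is how the study plan arranged them.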
- open
- read/readline/readlines
- write
- close
- Closing a file with try...finally
- Closing a file with the with...as... syntax
- file-like object: any object that provides a read() method is called a file-like object. Examples include StringIO and connected sockets.
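- Binary files must be opened in "*b*" mode: open("/home/longshan/hello.txt", "rb")
- Python 3 source code is UTF-8 by default, but open() in text mode falls back to a platform-dependent encoding when none is given. To read or write a file in a specific encoding, pass the encoding argument to open() (write() itself takes no encoding): open("/home/longshan/hello.txt", encoding="gbk")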
- StringIO: read and write str in memory
- BytesIO: read and write binary data in memory, e.g. '中文'.encode('utf-8')
- Methods such as write()/getvalue()/readline()
- Mainly about using the functions in the os and os.path modules
- pickling/unpickling: dumps()/dump() to serialize, loads()/load() to deserialize
- Serializing with json
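- Regular expressions are fairly complex; study them in detail when the need arises
- The re module handles regular expression matching in Python
- compile -> match -> groups: compiling a pattern once with re.compile() avoids recompiling it on every use; match() returns None on failure; groups() returns a tuple of all captured groups. group(0) returns the entire matched text, group(1) returns what the first pair of parentheses matched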
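The file operations listed above can be sketched in one short run. This is a minimal sketch, not the notes' own code: it writes to a scratch file created with `tempfile` (an assumption, so that it runs anywhere) instead of `/home/longshan/hello.txt`, and the three lines of text are made up for illustration.

```python
import os
import shutil
import tempfile

# Hypothetical scratch location; any writable path would do.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'hello.txt')

# Text mode: the str is encoded as GBK when written to disk.
with open(path, 'w', encoding='gbk') as f:
    f.write('中文\n')
    f.write('second line\n')
    f.write('third line\n')

# read() returns the whole file as a single str.
with open(path, 'r', encoding='gbk') as f:
    whole = f.read()

# readline() returns one line; readlines() returns the remaining lines as a list.
with open(path, 'r', encoding='gbk') as f:
    first = f.readline()
    rest = f.readlines()

# Binary mode: nothing is decoded, read() returns raw bytes.
with open(path, 'rb') as f:
    raw = f.read()

print(repr(whole))
print(repr(first), rest)
print(raw)

# Remove the scratch directory again.
shutil.rmtree(tmp_dir)
```

Reading the same file back with a different encoding (for example `encoding='utf-8'`) would most likely raise `UnicodeDecodeError`, which is why the encoding passed when reading should match the one used when writing.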
In [12]:
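# Close the file explicitly with try...finally.
f = None
try:
    f = open('/home/longshan/hello.txt', 'r')
    print(f.read())
finally:
    # f is still None if open() itself failed.
    if f:
        f.close()

# The with statement closes the file automatically.
with open('/home/longshan/hello.txt', 'r') as f:
    print("Longshan")
    print(f.read())

# A minimal file-like object: anything that provides a read() method.
class FileLikeObject(object):
    def read(self):
        return 'Hello, This is a file-like object!'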
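Why file-like objects matter is easier to see from the caller's side: a function that only calls `read()` works with any of them. The following is a small sketch of that idea; the `Greeting` class and the `shout()` helper are hypothetical names invented for this example.

```python
from io import StringIO

class Greeting:
    """Hypothetical file-like object: all it offers is read()."""
    def read(self):
        return 'Hello from a file-like object!'

def shout(source):
    # Accepts anything whose read() method returns a str.
    return source.read().upper()

print(shout(StringIO('hello from StringIO')))
print(shout(Greeting()))
```

An open text file could be passed to `shout()` in exactly the same way, since it also has a `read()` method returning `str`.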
In [15]:
from io import StringIO
f = StringIO('Hello\nHi\nGoodbye')
# readline() returns '' once the end of the buffer is reached.
while True:
    s = f.readline()
    if s == '':
        break
    print(s.strip())
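The cell above only reads from a pre-filled StringIO. The write()/getvalue() side mentioned in the notes might look like the sketch below; the strings are arbitrary sample data.

```python
from io import StringIO

buf = StringIO()
buf.write('Hello')
buf.write(' ')
buf.write('world!')

# getvalue() returns everything written so far, regardless of position.
text = buf.getvalue()
print(text)

# After writing, the position is at the end, so read() finds nothing.
tail = buf.read()

# Rewind to the start to read the content back.
buf.seek(0)
again = buf.read()
print(repr(tail), repr(again))
```

The difference between `getvalue()` and `read()` is the stream position: `getvalue()` ignores it, while `read()` starts from wherever the last operation left off.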
In [17]:
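from io import BytesIO
# Alternative: start with an empty buffer and write bytes into it.
# f = BytesIO()
# f.write('中文'.encode('utf-8'))
# Here the buffer is initialized directly with the UTF-8 encoded bytes.
f = BytesIO('中文'.encode('utf-8'))
f.read()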
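The commented-out write path can be spelled out as a round trip from str to bytes and back. This is a minimal sketch using the same sample string as the cell above.

```python
from io import BytesIO

buf = BytesIO()
# BytesIO stores bytes, so the str has to be encoded first.
buf.write('中文'.encode('utf-8'))

data = buf.getvalue()
print(data)
print(len(data))             # 6: each of the two characters takes 3 bytes in UTF-8
print(data.decode('utf-8'))  # decoding restores the original str
```

Writing a plain `str` into a `BytesIO` raises `TypeError`, which is the main practical difference from `StringIO`.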
In [37]:
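import os
os.name                        # OS family, e.g. 'posix' or 'nt'
os.uname()                     # detailed system info; not available on Windows
os.environ                     # mapping of environment variables
os.path.abspath('.')           # absolute path of the current directory
os.path.join('/home', 'longshan')
os.path.split('/home/longshan/hello.txt')    # (directory, last component)
os.path.splitext('/home/longshan/hello')     # no extension here, so the second item is ''
# List the directories directly under '/'.
[x for x in os.listdir('/') if os.path.isdir(os.path.join('/', x))]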
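The calls above only inspect paths. A sketch that also creates, lists, and renames entries is shown below; it works inside a temporary directory from `tempfile` (an assumption, to avoid touching real files), and the names `subdir` and `notes.txt` are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create one directory and one file inside the scratch root.
    os.mkdir(os.path.join(root, 'subdir'))
    file_path = os.path.join(root, 'notes.txt')
    with open(file_path, 'w') as f:
        f.write('hi')

    # Separate directories from regular files.
    dirs = [x for x in os.listdir(root) if os.path.isdir(os.path.join(root, x))]
    files = [x for x in os.listdir(root) if os.path.isfile(os.path.join(root, x))]

    # splitext() splits off the extension, which makes renaming easy.
    stem, ext = os.path.splitext(file_path)
    os.rename(file_path, stem + '.md')
    renamed = sorted(os.listdir(root))

print(dirs, files, ext, renamed)
```

Building every path with `os.path.join()` rather than string concatenation keeps the code independent of the platform's path separator.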
In [42]:
import json

class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score

# Convert a Student into a plain dict that json can serialize.
def student2dict(std):
    return {
        'name': std.name,
        'age': std.age,
        'score': std.score
    }

# Rebuild a Student from the dict that json.loads() produces.
def dict2student(d):
    return Student(d['name'], d['age'], d['score'])

s = Student('Longshan', 29, 89)
print(json.dumps(s, default=student2dict))
json_str = '{"name": "Longshan DU", "age": 25, "score": 95}'
print(json.loads(json_str, object_hook=dict2student))
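The notes list pickling next to json but show no cell for it. A minimal sketch of dumps()/loads() and dump()/load() could look like this; the `record` dict is sample data, and a `BytesIO` stands in for a file opened in binary mode (an assumption, to keep the example free of real files).

```python
import pickle
from io import BytesIO

record = {'name': 'Longshan', 'age': 29, 'score': 89}

# dumps()/loads(): serialize to a bytes object and back.
blob = pickle.dumps(record)
restored = pickle.loads(blob)

# dump()/load(): write to and read from a binary file-like object.
buf = BytesIO()
pickle.dump(record, buf)
buf.seek(0)  # rewind before reading
from_file = pickle.load(buf)

print(type(blob).__name__, restored == record, from_file == record)
```

Unlike json, the pickle format is specific to Python, and unpickling data from an untrusted source can execute arbitrary code, so it is best kept to data you produced yourself.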
In [46]:
import re
# The lazy \d+? leaves the trailing zeros for the second group.
m = re.match(r'^(\d+?)(0*)$', '102300')
print(m.groups())   # prints ('1023', '00')
print(m.group(0))   # prints 102300, the entire match
print(m.group(1))   # prints 1023
print(m.group(2))   # prints 00
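The compile -> match -> groups flow from the notes can be sketched with a precompiled pattern that is reused for several inputs. The phone-number-like pattern and the sample strings below are hypothetical and chosen only for illustration.

```python
import re

# Compile once, reuse many times (hypothetical pattern: 3 digits, a dash, 3 to 8 digits).
PHONE = re.compile(r'^(\d{3})-(\d{3,8})$')

m = PHONE.match('010-12345')
if m:
    print(m.group(0))   # entire matched text
    print(m.groups())   # tuple of all captured groups

# match() returns None when the input does not fit the pattern.
miss = PHONE.match('010 12345')
print(miss)
```

Checking the result for `None` before calling `group()` or `groups()` avoids an `AttributeError` on inputs that do not match.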