In Python, object serialization is called pickling. It converts an object hierarchy into a binary format that can be stored or transmitted over a network, so you can save Python objects to a binary file and restore them when required.
To pickle an object, import the pickle module and call its dumps() function with the object to be pickled as the argument. dumps() returns the serialized data as a bytes object; to write straight to an open binary file, use dump() instead.
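As a minimal sketch of the call described above, the following round-trips a small dictionary through dumps() and loads(). The `record` data is made up for illustration and is not part of the original notebook.

```python
import pickle

# Hypothetical sample data, used only for illustration.
record = {"name": "Vijay", "age": 53, "keywords": ["test", "testing"]}

blob = pickle.dumps(record)    # serialize the object to a bytes object
restored = pickle.loads(blob)  # rebuild an equal but independent object

print(type(blob))
print(restored == record, restored is record)
```

The restored object compares equal to the original but is a separate copy, so later changes to one do not affect the other.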
In [1]:
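############################### DO NOT RUN IT ON LINUX MACHINE ############################
# Unpickling either payload below calls os.system() with the embedded shell command.
# This is why data from an untrusted source must never be unpickled.
import pickle
# Linux/macOS payload: would run `rm -ri ~`. Defined for illustration only; never load it.
data_inx = b"""cos
system
(S'rm -ri ~'
tR.
"""
# Windows payload: runs `dir /s C:/Windows`.
data_win = b"""cos
system
(S'dir /s C:/Windows'
tR.
"""
print(data_win)
# loads() executes the command; d is the exit status returned by os.system().
d = pickle.loads(data_win)
print(d)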
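One way to reduce the risk shown in the cell above is to restrict which globals an unpickler may resolve, by overriding `Unpickler.find_class`. The sketch below follows that general pattern; the `SAFE_BUILTINS` allow-list, the `RestrictedUnpickler` and `restricted_loads` names, and the harmless `echo` payload are assumptions chosen for illustration. Treat it as damage limitation rather than as a guarantee that untrusted pickles are safe.

```python
import builtins
import io
import pickle

# Assumption: only these harmless builtins may be looked up while unpickling.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global such as os.system.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Same structure as the payloads above, with a harmless command.
malicious = b"cos\nsystem\n(S'echo unsafe'\ntR."
try:
    restricted_loads(malicious)
    blocked = False
except pickle.UnpicklingError as exc:
    blocked = True
    print("refused:", exc)

# Plain containers and allow-listed builtins still load normally.
safe_value = restricted_loads(pickle.dumps([1, 2, {"a": range(3)}]))
print(safe_value)
```

The lookup of `os.system` is rejected before the command can run, while ordinary data passes through unchanged.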
The pickle module implements binary protocols for serializing and de-serializing a Python object structure.
“Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the reverse operation, whereby a byte stream (from a binary file or bytes-like object from network) is converted back into an object hierarchy.
Pickling (and unpickling) is also known as “serialization”, “marshalling”, or “flattening”.
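The byte stream can be produced with different protocol versions. The short sketch below pickles the same made-up dictionary with protocol 0 and with the highest protocol the running interpreter supports, then checks that both streams unpickle to the same value. The exact protocol numbers and byte counts depend on the Python version, so none are claimed here.

```python
import pickle

# Hypothetical sample data, used only for illustration.
data = {"id": "0001", "type": "donut", "ppu": 0.55}

oldest = pickle.dumps(data, protocol=0)                        # original ASCII-based protocol
newest = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)  # newest binary protocol available

print("default:", pickle.DEFAULT_PROTOCOL, "highest:", pickle.HIGHEST_PROTOCOL)
print("bytes with protocol 0:", len(oldest), "bytes with highest:", len(newest))

# loads() detects the protocol on its own, so no version argument is needed.
from_oldest = pickle.loads(oldest)
from_newest = pickle.loads(newest)
print(from_oldest == from_newest == data)
```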
In [ ]:
# Step-(1) Construct Pickle Data.
In [ ]:
# Step-(2) Saving Data As A Pickle File.
In [ ]:
# Step-(3) Loading Data From Pickle File.
In [2]:
%%time
import pickle
users = [{
    "name": "Vijay",
    "age": 53,
    "section": "R&D",
    "keywords": ["test", "testing", "tested"]
}, {
    "name": "Deenanath",
    "age": 29,
    "section": "HR",
    "keywords": ["test", "testing", "tested"]
}]
# Serialize the list to a binary file.
with open('users.pickle', 'wb') as f:
    pickle.dump(users, f)
# Read it back; only unpickle files you trust.
with open('users.pickle', 'rb') as f:
    data = pickle.load(f)
print(data)
print(type(data))
In [3]:
%%time
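# _pickle is CPython's C accelerator (cPickle in Python 2).
# A plain `import pickle` already uses it when it is available.
import _pickle as pickle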
website = {
    "id": "0001",
    "type": "donut",
    "name": "Cake",
    "ppu": 0.55,
    "batters": {
        "batter": [
            {"id": "1001", "type": "Regular"},
            {"id": "1002", "type": "Chocolate"},
            {"id": "1003", "type": "Blueberry"},
            {"id": "1004", "type": "Devil's Food"}
        ]
    },
    "topping": [
        {"id": "5001", "type": "None"},
        {"id": "5002", "type": "Glazed"},
        {"id": "5005", "type": "Sugar"},
        {"id": "5007", "type": "Powdered Sugar"},
        {"id": "5006", "type": "Chocolate with Sprinkles"},
        {"id": "5003", "type": "Chocolate"},
        {"id": "5004", "type": "Maple"}
    ]
}
# Nested dictionaries and lists are pickled as a single object hierarchy.
with open('website.pickle', 'wb') as f:
    pickle.dump(website, f)
with open('website.pickle', 'rb') as f:
    data = pickle.load(f)
print(data)
print(type(data))
In [4]:
import pickle
class User:
    def __init__(self, name, passwd, email):
        self.name = name
        self.passwd = passwd
        self.email = email

userlist = []
userlist.append(User("mayank", "maya@nk", "funmay@yahoo.co.in"))
userlist.append(User("Aalok", "A@10k", "allok@yahoo.co.in"))
userlist.append(User("Roshan Musheer", "R0sh@n", "Roshan@yahoo.co.in"))
# Instances of user-defined classes can be pickled like built-in types.
with open('userlist.pickle', 'wb') as f:
    pickle.dump(userlist, f)
# The User class must be defined (or importable) when the file is loaded.
users = []
with open('userlist.pickle', 'rb') as f:
    users = pickle.load(f)
print(users)
for user in users:
    print("Name: " + user.name)
In [17]:
help(pickle.dumps)
In [18]:
help(pickle.loads)
NOTE
In Python 3, the pickle module transparently uses the C implementation _pickle (formerly cPickle) when it is available and falls back to the pure-Python implementation otherwise.
For most common tasks, just use JSON for serializing your data. It's fast enough, human readable, doesn't carry pickle's code-execution risk, and can be parsed in every programming language worth knowing. MessagePack is also a good alternative; I was surprised by how well it performed in the benchmark I put together.
Pickle, on the other hand, is slow, insecure, and can only be parsed in Python. The only real advantage of pickle is that it can serialize arbitrary Python objects, whereas both JSON and MessagePack limit the types of data they can write out. Given the downsides, though, it's worth writing the little bit of code necessary to convert your objects to a JSON-able form if your code is ever going to be used by people other than yourself.
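The "little bit of code" needed to make an object JSON-able might look like the sketch below. It uses a cut-down version of the User class from earlier; the `to_dict` and `from_dict` helpers are hypothetical names introduced here, and the password field is deliberately left out of the illustration.

```python
import json

class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email

    def to_dict(self):
        # Hypothetical helper: expose only plain, JSON-compatible types.
        return {"name": self.name, "email": self.email}

    @classmethod
    def from_dict(cls, item):
        # Hypothetical helper: rebuild a User from the parsed JSON object.
        return cls(item["name"], item["email"])

users = [
    User("mayank", "funmay@yahoo.co.in"),
    User("Aalok", "allok@yahoo.co.in"),
]

text = json.dumps([u.to_dict() for u in users], indent=2)
restored = [User.from_dict(item) for item in json.loads(text)]

print(text)
print([u.name for u in restored])
```

Unlike a pickle, the resulting text can be read by people and by non-Python programs, and parsing it does not run any code.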