This Notebook is admittedly a little weird in terms of the topics it mixes. We bring in a large number of dinosaur names, in the sense of species, as discovered from the fossil record and perhaps from other records. However, this set of strings only serves to fuel the purely mathematical process of performing hashlib.sha256 magic on each one.
Think of dino names as passwords. We may consider these insecure, but let's not assume the game is that serious. For the purposes of today's exercise, they're secure enough.
However, just because you've picked a password does not mean a DBA needs to keep it in her database, where it might get stolen. Rather, a hash of your password serves as a deterministic fingerprint. Just save the fingerprint. No one with a wrong password will get through the gate.
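Here is a minimal sketch of the fingerprint idea, using only the standard library. The helper names `fingerprint` and `check` and the sample password are made up for illustration, and the UTF-8 encoding is an assumption. A production system would typically also add a salt and a deliberately slow hash, which this sketch leaves out.

```python
import hashlib


def fingerprint(password: str) -> str:
    """Return the SHA-256 hex digest of a password (UTF-8 encoding assumed)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def check(attempt: str, stored_hash: str) -> bool:
    """True only when the attempt hashes to the stored fingerprint."""
    return fingerprint(attempt) == stored_hash


# The database keeps only the fingerprint, never the password itself.
stored = fingerprint("Tyrannosaurus")

print(len(stored))                     # 64 hex digits encode 256 bits
print(check("Tyrannosaurus", stored))  # right password
print(check("Triceratops", stored))    # wrong password
```

Because the hash is deterministic, the same password always produces the same fingerprint, so comparing fingerprints is enough to open the gate.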
In [ ]:import pandas as pd
import matplotlib.pyplot as plt
dinos = pd.read_json("dino_hash.json")
You'll notice the hashing algorithm has already been applied by the time we import this JSON data. For that part of the pipeline, I'll be showing you the Python source code of standalone scripts rather than Notebook cells.
In [ ]:dinos.head()
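Remember that the .loc attribute uses enhanced slice notation ("enhanced" in the sense that core Python does not support it): you can slice by label, such as a range of strings, and both endpoints are included when they are present in the index.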
In [ ]:dinos.loc["Mo":"N"]
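To see what a label slice does without the real data, here is a small sketch on a toy Series. The names and the placeholder values `h1` to `h4` are invented for illustration, and the index is assumed to be sorted, which is what lets pandas slice on labels such as "Mo" that are not actually in the index.

```python
import pandas as pd

# Toy stand-in for the real data: a sorted index of names with placeholder values.
toy = pd.Series(
    ["h1", "h2", "h3", "h4"],
    index=["Minmi", "Mononykus", "Mussaurus", "Nanosaurus"],
)

# Neither "Mo" nor "N" is in the index, so the slice keeps every label that
# sorts between them.
subset = toy.loc["Mo":"N"]
print(list(subset.index))

# When the endpoints are actual labels, both of them are included.
inclusive = toy.loc["Minmi":"Mussaurus"]
print(len(inclusive))
```

Contrast this with a plain Python list, where slices take integer positions only and exclude the stop position.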
In [ ]:dinos.dtypes
In [ ]:dinos.index.is_unique
In [ ]:code = dinos.loc['Mtapaiasaurus']
In [ ]:code
In [ ]:len(code)
In [ ]:dinos.info()
In [ ]:int(code, base=16)
In [ ]:0xafe4c2b017ed3996bf5f4f3b937f0ae22e649df2f620787e136ed6bd3ea32e2d
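The two cells above treat a hex digest as a number. As a rough sketch of the round trip, the following hashes a sample name (the name and the UTF-8 encoding are assumptions, not necessarily how dino_hash.json was built), converts the digest to an integer, and formats it back to 64 hex digits.

```python
import hashlib

# Sample input for illustration only.
digest = hashlib.sha256("Stegosaurus".encode("utf-8")).hexdigest()

# A hex digest is just a base-16 spelling of a 256-bit integer.
as_int = int(digest, base=16)
print(as_int < 2**256)

# Format back to hex, zero-padded to 64 digits so leading zeros survive.
back = format(as_int, "064x")
print(back == digest)

# A 0x-prefixed literal is read by Python as the same kind of integer.
print(0xFF == int("ff", base=16))
```

The zero padding matters because a digest may begin with zeros, which a bare `hex()` call would drop.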
In [ ]:
In [ ]:
How about in descending order?
In [ ]: