In [1]:
import bson
from bson.json_util import loads, dumps
from bson.binary import Binary
import pysodium
from bsonsearch import bsoncompare
bc = bsoncompare()

Suppose you want to allow access to some, but not all, of the data in a particular document.

Start by generating a public/private key pair on the secure side (see the appendix), and expose only the public key to untrusted users.


In [2]:
PUBLIC_KEY = Binary('\x8eB\x11\xd5ht\x93\x05\xee\xed\x10\xad\xb4\x90\xb7]\x92\x04\xac\x82\xb5\xa2"v\xf9[\xd6^\x14\x8b\x12\x1d', 0)

In [3]:
sensitive_subdocument = {"sensitive":"triple pinky swear"}
document = {
            "_id":1,
            "name":"bsonsearch",
            "super_secret_data":sensitive_subdocument
           }
print document


{'_id': 1, 'name': 'bsonsearch', 'super_secret_data': {'sensitive': 'triple pinky swear'}}

In [4]:
plaintext_spec = {"super_secret_data.sensitive":"triple pinky swear"}
plaintext_matc = bc.generate_matcher(plaintext_spec)
print "Matches", bc.match(plaintext_matc, document)


Matches True

But maybe you don't want that "triple pinky swear" value available for just anyone to search on.

So we're going to store the sensitive subdocument as an encrypted blob, using asymmetric encryption with the public key.


In [5]:
secure_sensitive_subdocument = Binary(pysodium.crypto_box_seal(bson.BSON.encode(sensitive_subdocument), 
                                                        PUBLIC_KEY))
secure_document = {
                    "_id":1,
                    "name":"bsonsearch",
                    "super_secret_data": secure_sensitive_subdocument
                  }
print secure_document


{'_id': 1, 'name': 'bsonsearch', 'super_secret_data': Binary('\xb2\x9e ?\xc6S\xb1\x94\x1a\xb3CPO=\xab\xda\x8e\xc8\xe8\xbb-\nT\xe1\x0c1\xd61z\xd5\xc2p:4?S\x10\xd0\xb8\x0b\xb8b\x9e,\nuwCyQ\x00M;\xa0\x16\xfb\xbf\x8b\x8b\x7f%\xa0\xf9\xae\x15!\x9f]\xa9\x88\xc6\x1a{\x1c\xbd\xf1y\x8c\xf7=\xd2\x1f\xf79\x1a\xe2\x1a', 0)}

This document no longer matches the original plaintext spec.


In [6]:
print "Matches", bc.match(plaintext_matc, secure_document)


Matches False
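
As a rough illustration of why the plaintext spec stops matching, the sketch below uses a naive dotted-path lookup. The `lookup` helper is hypothetical and is not how bsonsearch is implemented; it only shows that a dotted key cannot descend into a value that is now an opaque binary blob.

```python
# Illustrative only: a naive dotted-path lookup, not bsonsearch's matcher.
def lookup(document, dotted_path):
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # cannot descend into a non-document value
        current = current[part]
    return current

plain = {"_id": 1, "super_secret_data": {"sensitive": "triple pinky swear"}}
# Truncated stand-in for the sealed ciphertext (assumption, not real output).
sealed = {"_id": 1, "super_secret_data": b"\xb2\x9e ?\xc6S"}

found_plain = lookup(plain, "super_secret_data.sensitive")
found_sealed = lookup(sealed, "super_secret_data.sensitive")
```

Here `found_plain` is the sensitive string, while `found_sealed` is `None` because the encrypted field has no inner keys to walk.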

Now we need the secret key.

The person generating the document didn't need the secret key, but the person reading or querying the document does.

The query uses the $sealOpen command, which takes the keys and a query to run against the inner document.

Notice the change in the query: the namespace is split between the entry key and the $query.

The namespace "super_secret_data" is accounted for by the $sealOpen command, and the $query acts on the value stored at that namespace.
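
The namespace split described above can be sketched with a small helper that turns a dotted plaintext key into the $sealOpen form. The name `seal_open_spec` is hypothetical and not part of bsonsearch; it only builds the dictionary, and the key values are placeholders.

```python
# Hypothetical helper, not part of bsonsearch: it only builds the spec dict.
def seal_open_spec(dotted_key, value, public_key, secret_key):
    """Split 'outer.inner' into the $sealOpen form used in this notebook."""
    outer, _, inner = dotted_key.partition(".")
    if not inner:
        raise ValueError("expected a dotted key such as 'outer.inner'")
    return {outer: {"$sealOpen": {"$keys": {"pk": public_key,
                                            "sk": secret_key},
                                  "$query": {inner: value}}}}

# Placeholder key material for illustration only; real keys are 32-byte values.
spec = seal_open_spec("super_secret_data.sensitive", "triple pinky swear",
                      b"<public key>", b"<secret key>")
```

The outer part of the key stays as the entry key, and everything after the first dot moves into `$query`.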


In [7]:
SECRET_KEY = Binary('\xe2\x16\x9a,\xb1\x9b\xb4\xf67\xe9\xf8\x83\x0f"_\xa8}t\xd2i:\xbb\xfd\xb5\x8a\x89X.\x1b\x13\x92Z', 0)



spec_decrypt = {"super_secret_data":{"$sealOpen":{"$keys":{"pk":PUBLIC_KEY,
                                                           "sk":SECRET_KEY},
                                                  "$query":{"sensitive":"triple pinky swear"}
                                                 }
                                    }
               }

In [8]:
matcher_decrypt = bc.generate_matcher(spec_decrypt)
print "Decrypt with key matches ---->", bc.match(matcher_decrypt, secure_document)
print "Non-decrypt should be false ->", bc.match(plaintext_matc, secure_document)


Decrypt with key matches ----> True
Non-decrypt should be false -> False

Appendix - Generating Keys


In [9]:
SECURE_KEYPAIR = pysodium.crypto_box_keypair()
SECURE_KEYPAIR_DICT = {"pk":Binary(SECURE_KEYPAIR[0]),
                       "sk":Binary(SECURE_KEYPAIR[1])}
SECURE_KEYPAIR_DICT


Out[9]:
{'pk': Binary('\xc8\xaf\x96\xaf1\xed$\xc0!\x8cO\xb7\xb8\x97\x0bS\x93J\xf6\x17\xc3\xc646k i\xcar\xb5\x9b.', 0),
 'sk': Binary('XW)\xe9\xb3P\x1c\xf9\xdb\x1aW\x81\x9cx\t\xa1\xc0iY<\x98Q\xc8\xcf\x84\x8f\x84\x9f\xf7\x9f\xd9\x89', 0)}
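
Since only the public key should reach untrusted users, one possible way to avoid leaking the secret half is to hand out a copy of the key pair dictionary that contains "pk" alone. The `writer_keys` helper below is a hypothetical sketch, and the key bytes are dummy 32-byte values rather than real keys.

```python
# Hypothetical helper: keep the secret key on the secure side.
def writer_keys(keypair):
    """Return only the public half, for users who encrypt but never decrypt."""
    return {"pk": keypair["pk"]}

# Dummy 32-byte values standing in for a real crypto_box key pair.
keypair = {"pk": b"\x01" * 32, "sk": b"\x02" * 32}
shared = writer_keys(keypair)
```

The full `keypair` dictionary stays with whoever needs to run $sealOpen queries.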
