Part 1 - Introduction to Grid

Grid is a platform to train, share and manage models and datasets in a distributed, collaborative and secure way.

The Grid platform aims to be a secure peer-to-peer platform. It was created to use PySyft's features to run federated learning processes without having to manage distributed workers directly. Today, to run a machine learning process with the PySyft library, the user has to manage every worker directly (start nodes, manage node connections, shut nodes down, etc.). The Grid platform handles this transparently: the user doesn't need to know how the nodes are connected or where a specific dataset is stored.

Authors:

Why should we use grid?

As mentioned before, Grid is essentially a platform that uses the PySyft library to manage distributed workers, while providing some special features on top.

We should use grid to:

  • Train models using datasets that we've never seen (without getting access to their real values).
  • Train a model with encrypted datasets.
  • Provide secure MLaaS by running encrypted model inferences across the grid network.
  • Serve an encrypted model without revealing its real weights to anyone.
  • Run encrypted inferences without sending our private data to anyone.
  • Mitigate risks and impacts using Federated Learning's "privacy by design" property.
  • Manage the privacy level of datasets stored on the grid network by allowing or disallowing access to them.
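To make "encrypted" a little more concrete, here is a minimal sketch of additive secret sharing, one common technique for computing on values that no single party can read. This notebook does not state which scheme Grid uses, so treat this as an illustrative assumption rather than Grid's actual implementation. The names `Q`, `share`, `reconstruct` and `add_shared` are hypothetical helpers written for this example and are not part of PySyft.

```python
import secrets

# Toy additive secret sharing over a prime field.
# Illustrative only: not PySyft's or Grid's implementation.
Q = 2**61 - 1  # modulus chosen for this sketch (a Mersenne prime)


def share(secret, n_parties):
    """Split an integer into n_parties random shares that sum to it mod Q."""
    shares = [secrets.randbelow(Q) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares


def reconstruct(shares):
    """Recombine shares; this only works when all of them are pooled."""
    return sum(shares) % Q


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally, never seeing the inputs."""
    return [(a + b) % Q for a, b in zip(a_shares, b_shares)]


x_shares = share(25, 3)  # e.g. one share per grid node
y_shares = share(17, 3)
total = reconstruct(add_shared(x_shares, y_shares))
print(total)  # 42
```

Each individual share is a uniformly random number, so a node holding one share learns nothing about the original value; only the sum of all shares reveals it.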

How does it work?

There are two kinds of grid: the Private Grid Platform and the Public Grid Platform.

Private Grid

A Private Grid is used to build your own private grid platform.

It gives you control over the entire platform: you can create, remove and manage all nodes connected to your grid network. However, that power and control come with responsibility, since you have to maintain the grid platform yourself.

  • To build it, you need to know in advance where each grid node that you want to use in your infrastructure is located.
  • You need to configure the scale-up/scale-down routines (number of nodes) yourself.
  • You can add or remove nodes.
  • You are connected directly to these nodes.


In [ ]:
import syft as sy
import torch as th
from syft.grid.clients.dynamic_fl_client import DynamicFLClient

hook = sy.TorchHook(th)

In [ ]:
# How to build / use a private grid network

# 1 - Start the grid nodes.
# 2 - Connect to them directly
# 3 - Create Private Grid using their instances.

# We need to know the address of every node.
node1 = DynamicFLClient(hook, "ws://localhost:3000")
node2 = DynamicFLClient(hook, "ws://localhost:3001")
node3 = DynamicFLClient(hook, "ws://localhost:3002")
node4 = DynamicFLClient(hook, "ws://localhost:3003")

my_grid = sy.PrivateGridNetwork(node1, node2, node3, node4)
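The responsibilities listed above (knowing every address, adding and removing nodes yourself, talking to each node directly) can be sketched with a small pure-Python model. `ToyNode` and `ToyPrivateGrid` are hypothetical stand-ins written for this illustration; they are not PySyft classes and involve no networking.

```python
class ToyNode:
    """Stand-in for a grid node: an address plus datasets keyed by tag."""

    def __init__(self, address, datasets=None):
        self.address = address
        self.datasets = dict(datasets or {})


class ToyPrivateGrid:
    """The owner holds a direct handle to every node and manages them all."""

    def __init__(self, *nodes):
        self.nodes = {node.address: node for node in nodes}

    def add_node(self, node):
        # Scaling up is the owner's job.
        self.nodes[node.address] = node

    def remove_node(self, address):
        # So is scaling down.
        self.nodes.pop(address, None)

    def search(self, tag):
        """Ask every known node directly and list the ones that hold the tag."""
        return sorted(
            address for address, node in self.nodes.items() if tag in node.datasets
        )


grid = ToyPrivateGrid(
    ToyNode("ws://localhost:3000", {"#mnist": [1, 2, 3]}),
    ToyNode("ws://localhost:3001", {"#cifar": [4, 5]}),
)
grid.add_node(ToyNode("ws://localhost:3002", {"#mnist": [6]}))
print(grid.search("#mnist"))  # ['ws://localhost:3000', 'ws://localhost:3002']

grid.remove_node("ws://localhost:3000")
print(grid.search("#mnist"))  # ['ws://localhost:3002']
```

The point of the sketch is that the grid object only knows about nodes you explicitly hand to it, which is the trade-off of the private setup.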

Public Grid

Public Grid offers the opportunity to work as a real collaborative platform.

Unlike the private grid, no one has the power to control all the nodes connected to the public grid; the platform is managed by the grid gateway. This component updates the network automatically and performs queries across the nodes. Note that the grid gateway can only run non-privileged commands on grid nodes, which avoids some vulnerabilities.

Therefore, anyone can register a new node and upload new datasets through their own nodes to share them with everyone in a secure way.

Public Grid should work as a secure data science platform (like Kaggle, but built on privacy-preserving concepts):

- We send pointers to datasets instead of real datasets.
- We can share our models across the network in an encrypted way.
- We can run inferences on our sensitive datasets without sending their real values to anyone.
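The idea of sending pointers instead of real datasets can be illustrated with a toy model. `ToyWorker` and `ToyPointer` below are hypothetical classes written for this sketch; they only mimic the concept in a single process and are not PySyft's pointer implementation.

```python
import itertools


class ToyWorker:
    """Stand-in for a remote node that owns the real data."""

    def __init__(self, name):
        self.name = name
        self._store = {}
        self._ids = itertools.count(1)

    def put(self, value):
        """Store a value on the worker and hand back only a pointer to it."""
        obj_id = next(self._ids)
        self._store[obj_id] = value
        return ToyPointer(self, obj_id)


class ToyPointer:
    """Holds a location and an id, never the data itself."""

    def __init__(self, worker, obj_id):
        self.worker = worker
        self.obj_id = obj_id

    def mean(self):
        """Run the computation where the data lives; return a new pointer."""
        data = self.worker._store[self.obj_id]
        return self.worker.put(sum(data) / len(data))

    def get(self):
        """Explicitly fetch the value (a real node could refuse this request)."""
        return self.worker._store[self.obj_id]


alice = ToyWorker("alice")
data_ptr = alice.put([2.0, 4.0, 6.0])  # the caller only ever holds a pointer
mean_ptr = data_ptr.mean()             # computed on alice's side
print(mean_ptr.get())  # 4.0
```

Only the aggregate result is fetched here; the raw list stays on the worker that owns it.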


In [ ]:
# How to build/use a public grid network

# 1 - Start the grid nodes.
# 2 - Register them with the grid gateway component.
# 3 - Use the grid gateway to perform queries.

# You only need to know the address of the grid gateway.
my_grid = sy.PublicGridNetwork(hook, "http://localhost:5000")
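The gateway's role can be sketched in plain Python as well: nodes register themselves, and a client that knows only the gateway can ask where a dataset lives. `ToyGridNode`, `ToyGateway` and the `#diabetes`, `#mnist` and `#cifar` tags are hypothetical names used for this illustration, not the real gateway API.

```python
class ToyGridNode:
    """Stand-in for a grid node that hosts datasets identified by tags."""

    def __init__(self, node_id, address, tags):
        self.node_id = node_id
        self.address = address
        self.tags = set(tags)

    def has_tag(self, tag):
        """A non-privileged query: reveals only whether a tag is hosted."""
        return tag in self.tags


class ToyGateway:
    """Keeps a registry of nodes and answers queries on behalf of clients."""

    def __init__(self):
        self._registry = {}

    def register(self, node):
        """Anyone can register a node; the gateway just records it."""
        self._registry[node.node_id] = node

    def search(self, tag):
        """Fan the query out to all registered nodes and collect addresses."""
        return {
            node_id: node.address
            for node_id, node in self._registry.items()
            if node.has_tag(tag)
        }


gateway = ToyGateway()
gateway.register(ToyGridNode("bob", "http://localhost:3000", ["#diabetes"]))
gateway.register(ToyGridNode("alice", "http://localhost:3001", ["#diabetes", "#mnist"]))
gateway.register(ToyGridNode("carol", "http://localhost:3002", ["#cifar"]))

print(gateway.search("#diabetes"))
# {'bob': 'http://localhost:3000', 'alice': 'http://localhost:3001'}
```

Compared with the private setup, the client never lists node addresses itself; it relies on whatever the gateway's registry currently contains.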

Congratulations!!! - Time to Join the Community!

Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the movement toward privacy-preserving, decentralized ownership of AI and the AI supply chain (data), you can do so in the following ways!

Star PySyft on GitHub

The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.

Join our Slack!

The best way to keep up to date on the latest advancements is to join our community! You can do so by filling out the form at http://slack.openmined.org

Join a Code Project!

The best way to contribute to our community is to become a code contributor! At any time you can go to the PySyft GitHub Issues page and filter for "Projects". This will show you all the top-level tickets, giving an overview of the projects you can join! If you don't want to join a project but would like to do a bit of coding, you can also look for more "one off" mini-projects by searching for GitHub issues marked "good first issue".

If you don't have time to contribute to our codebase, but would still like to lend support, you can also become a Backer on our Open Collective. All donations go toward our web hosting and other community expenses such as hackathons and meetups!

OpenMined's Open Collective Page