Part 4: Federated Learning with Model Averaging

Recap: In Part 2 of this tutorial, we trained a model using a very simple version of Federated Learning. That approach required each data owner to trust the model owner with the ability to see their gradients.

Description: In this tutorial, we'll show how to use the advanced aggregation tools from Part 3 to allow the weights to be aggregated by a trusted "secure worker" before the final resulting model is sent back to the model owner (us).

In this way, only the secure worker can see whose weights came from whom. We might be able to tell which parts of the model changed, but we do NOT know which worker (bob or alice) made which change, which creates a layer of privacy.

In [ ]:
import copy

import torch
from torch import nn, optim
import syft as sy

# hook PyTorch to add the functionality needed for remote / federated execution
hook = sy.TorchHook(torch)

Step 1: Create Data Owners

First, we're going to create two data owners (Bob and Alice) each with a small amount of data. We're also going to initialize a secure machine called "secure_worker". In practice this could be secure hardware (such as Intel's SGX) or simply a trusted intermediary.


In [ ]:
# create a couple workers

bob = sy.VirtualWorker(hook, id="bob")
alice = sy.VirtualWorker(hook, id="alice")
secure_worker = sy.VirtualWorker(hook, id="secure_worker")


# A Toy Dataset
data = torch.tensor([[0,0],[0,1],[1,0],[1,1.]], requires_grad=True)
target = torch.tensor([[0],[0],[1],[1.]], requires_grad=True)

# get pointers to training data on each worker by
# sending some training data to bob and alice
bobs_data = data[0:2].send(bob)
bobs_target = target[0:2].send(bob)

alices_data = data[2:].send(alice)
alices_target = target[2:].send(alice)
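
If you'd like to confirm that the training data now lives remotely rather than locally, printing one of the objects returned by .send() should show pointer metadata rather than raw values. The second check below peeks at Bob's internal object store; _objects is an implementation detail and may differ across PySyft versions, so treat it as an optional sanity check only.


In [ ]:
# The .send() calls above return pointers to the remote tensors, so printing
# them should show wrapper/pointer information rather than the data itself.
print(bobs_data)

# Optional: inspect Bob's internal object store (internal attribute; its exact
# form may vary between PySyft versions).
print(bob._objects)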

Step 2: Create Our Model

For this example, we're going to train with a simple Linear model. We can initialize it normally using PyTorch's nn.Linear constructor.


In [ ]:
# Initialize A Toy Model
model = nn.Linear(2,1)
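
As a quick sanity check (plain PyTorch, nothing PySyft-specific here), nn.Linear(2,1) computes y = xW^T + b, so the model should carry a 1x2 weight matrix and a single bias term:


In [ ]:
# Confirm the parameter shapes of the toy model
print(model.weight.shape)  # expected: torch.Size([1, 2])
print(model.bias.shape)    # expected: torch.Size([1])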

Step 3: Send a Copy of the Model to Alice and Bob

Next, we need to send a copy of the current model to Alice and Bob so that they can perform steps of learning on their own datasets.


In [ ]:
bobs_model = model.copy().send(bob)
alices_model = model.copy().send(alice)

bobs_opt = optim.SGD(params=bobs_model.parameters(),lr=0.1)
alices_opt = optim.SGD(params=alices_model.parameters(),lr=0.1)
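
As with the data, the two copies now live remotely, so their parameters are pointers rather than local tensors. A quick optional check, assuming the VirtualWorker setup above:


In [ ]:
# These parameters live on bob and alice respectively; printing them should
# display pointer information rather than the raw weights.
print(bobs_model.weight)
print(alices_model.weight)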

Step 4: Train Bob's and Alice's Models (in parallel)

As is conventional with Federated Learning via Secure Averaging, each data owner first trains their model for several iterations locally before the models are averaged together.


In [ ]:
for i in range(10):

    # Train Bob's Model
    bobs_opt.zero_grad()
    bobs_pred = bobs_model(bobs_data)
    bobs_loss = ((bobs_pred - bobs_target)**2).sum()
    bobs_loss.backward()

    bobs_opt.step()
    bobs_loss = bobs_loss.get().data

    # Train Alice's Model
    alices_opt.zero_grad()
    alices_pred = alices_model(alices_data)
    alices_loss = ((alices_pred - alices_target)**2).sum()
    alices_loss.backward()

    alices_opt.step()
    alices_loss = alices_loss.get().data
    
    print("Bob:" + str(bobs_loss) + " Alice:" + str(alices_loss))

Step 5: Send Both Updated Models to a Secure Worker

Now that each data owner has a partially trained model, it's time to average them together in a secure way. We achieve this by instructing Alice and Bob to send their models to the secure (trusted) worker.

Note that this use of our API means that each model is sent DIRECTLY to the secure_worker. We never see it.


In [ ]:
alices_model.move(secure_worker)

In [ ]:
bobs_model.move(secure_worker)
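
If you want to verify that both models actually landed on the secure worker, you can peek at its object store. Note that _objects is an internal, version-dependent attribute, so this is only a rough check and not part of the tutorial's API.


In [ ]:
# Rough check: the secure worker should now hold the tensors belonging to
# both models (_objects is internal and may vary across PySyft versions).
print(len(secure_worker._objects))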

Step 6: Average the Models

Finally, the last step for this training epoch is to average Bob's and Alice's trained models together and then use the result to set the values of our global "model".


In [ ]:
with torch.no_grad():
    model.weight.set_(((alices_model.weight.data + bobs_model.weight.data) / 2).get())
    model.bias.set_(((alices_model.bias.data + bobs_model.bias.data) / 2).get())
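
Because Bob and Alice hold the same number of examples, a plain mean is all we need. If the workers held different amounts of data, a size-weighted average (in the spirit of FedAvg) would be the natural generalization. Here is a minimal sketch, assuming we know each worker's sample count (hard-coded to the 2 and 2 examples used above):


In [ ]:
# Hypothetical sample counts; in this toy setup each worker holds 2 examples.
n_bob, n_alice = 2, 2
n_total = n_bob + n_alice

with torch.no_grad():
    model.weight.set_(((bobs_model.weight.data * n_bob +
                        alices_model.weight.data * n_alice) / n_total).get())
    model.bias.set_(((bobs_model.bias.data * n_bob +
                      alices_model.bias.data * n_alice) / n_total).get())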

Rinse and Repeat

And now we just need to iterate this multiple times!


In [ ]:
iterations = 10
worker_iters = 5

for a_iter in range(iterations):
    
    bobs_model = model.copy().send(bob)
    alices_model = model.copy().send(alice)

    bobs_opt = optim.SGD(params=bobs_model.parameters(),lr=0.1)
    alices_opt = optim.SGD(params=alices_model.parameters(),lr=0.1)

    for wi in range(worker_iters):

        # Train Bob's Model
        bobs_opt.zero_grad()
        bobs_pred = bobs_model(bobs_data)
        bobs_loss = ((bobs_pred - bobs_target)**2).sum()
        bobs_loss.backward()

        bobs_opt.step()
        bobs_loss = bobs_loss.get().data

        # Train Alice's Model
        alices_opt.zero_grad()
        alices_pred = alices_model(alices_data)
        alices_loss = ((alices_pred - alices_target)**2).sum()
        alices_loss.backward()

        alices_opt.step()
        alices_loss = alices_loss.get().data
    
    alices_model.move(secure_worker)
    bobs_model.move(secure_worker)
    with torch.no_grad():
        model.weight.set_(((alices_model.weight.data + bobs_model.weight.data) / 2).get())
        model.bias.set_(((alices_model.bias.data + bobs_model.bias.data) / 2).get())
    
    print("Bob:" + str(bobs_loss) + " Alice:" + str(alices_loss))

Lastly, we want to make sure that our resulting model learned correctly, so we'll evaluate it on a test dataset. In this toy problem, we'll use the original data, but in practice we'll want to use new data to understand how well the model generalizes to unseen examples.


In [ ]:
preds = model(data)
loss = ((preds - target) ** 2).sum()

In [ ]:
print(preds)
print(target)
print(loss.data)

In this toy example, the averaged model is underfitting relative to how a plaintext model trained locally would behave; however, we were able to train it without exposing each worker's training data. We were also able to aggregate the updated models from each worker on a trusted aggregator to prevent data leakage to the model owner.

In a future tutorial, we'll aim to do our trusted aggregation directly with the gradients, so that we can update the model with better gradient estimates and arrive at a stronger model.
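
As a preview of that idea, here is a minimal plain-PyTorch sketch of averaging gradients instead of weights. This is only an illustration of the concept, not PySyft code: the per-worker gradients below are random placeholders standing in for the values a trusted aggregator would actually receive.


In [ ]:
# Plain-PyTorch illustration only: average per-worker gradients, then take a
# single SGD step with the averaged gradient. The gradients are placeholders.
local_model = nn.Linear(2, 1)

bobs_grads = [torch.randn_like(p) for p in local_model.parameters()]    # placeholder
alices_grads = [torch.randn_like(p) for p in local_model.parameters()]  # placeholder

lr = 0.1
with torch.no_grad():
    for p, g_bob, g_alice in zip(local_model.parameters(), bobs_grads, alices_grads):
        avg_grad = (g_bob + g_alice) / 2   # what the trusted aggregator would compute
        p -= lr * avg_grad                 # one SGD step using the averaged gradient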


Congratulations!!! - Time to Join the Community!

Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the movement toward privacy-preserving, decentralized ownership of AI and the AI supply chain (data), you can do so in the following ways!

Star PySyft on GitHub

The easiest way to help our community is just by starring the Repos! This helps raise awareness of the cool tools we're building.

Join our Slack!

The best way to keep up to date on the latest advancements is to join our community! You can do so by filling out the form at http://slack.openmined.org

Join a Code Project!

The best way to contribute to our community is to become a code contributor! At any time you can go to the PySyft GitHub Issues page and filter for "Projects". This will show you all the top-level tickets, giving an overview of which projects you can join! If you don't want to join a project but would like to do a bit of coding, you can also look for more "one-off" mini-projects by searching for GitHub issues marked "good first issue".

If you don't have time to contribute to our codebase, but would still like to lend support, you can also become a Backer on our Open Collective. All donations go toward our web hosting and other community expenses such as hackathons and meetups!

OpenMined's Open Collective Page

