In [ ]:
epochs = 50  # number of training epochs used in the loop at the end of this notebook

Part 7 - Federated Learning with FederatedDataset

Here we introduce a new tool for using federated datasets. We have created a FederatedDataset class which is intended to be used like the PyTorch Dataset class, and is given to a federated data loader FederatedDataLoader which will iterate on it in a federated fashion.

Authors:

We use the sandbox that we discovered in the last lesson.


In [ ]:
import torch as th
import syft as sy
sy.create_sandbox(globals(), verbose=False)  # creates the virtual workers and the grid used below
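
The sandbox populates the notebook globals with several virtual workers and a grid object that we can query. As a quick check, we can print a couple of them (this sketch assumes the worker handles such as alice are injected by name, as in the previous part):


In [ ]:
# The sandbox injected a grid (queried below) and several virtual workers into globals()
print(grid)
print(alice)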

Then we search for a dataset.


In [ ]:
boston_data = grid.search("#boston", "#data")      # pointers to the remote feature tensors
boston_target = grid.search("#boston", "#target")  # pointers to the remote target tensors
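
The search returns a dictionary keyed by worker id; each value is a list of pointers to the matching tensors, which stay on the remote workers. Here is a minimal sketch for inspecting what came back (the exact ids and shapes depend on how the sandbox split the Boston Housing data):


In [ ]:
# Inspect the search result: worker id -> list of pointers to remote tensors
for worker_id, pointers in boston_data.items():
    print(worker_id, pointers[0].shape)  # shape metadata is available without fetching the data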

We define a model; one optimizer per worker will be created below, alongside the FederatedDataset.


In [ ]:
n_features = boston_data['alice'][0].shape[1]  # feature count read from the pointer's shape metadata
n_targets = 1

model = th.nn.Linear(n_features, n_targets)    # simple linear regression model

Here we cast the fetched data into a FederatedDataset and see which workers hold part of the data.


In [ ]:
# Cast the search results into BaseDatasets, one per worker
datasets = []
for worker in boston_data.keys():
    dataset = sy.BaseDataset(boston_data[worker][0], boston_target[worker][0])
    datasets.append(dataset)

# Build the FederatedDataset object from the per-worker BaseDatasets
dataset = sy.FederatedDataset(datasets)
print(dataset.workers)

# One optimizer per worker: each keeps its own Adam state for that worker's updates
optimizers = {}
for worker in dataset.workers:
    optimizers[worker] = th.optim.Adam(params=model.parameters(), lr=1e-2)

We put it in a FederatedDataLoader and specify options.


In [ ]:
train_loader = sy.FederatedDataLoader(dataset, batch_size=32, shuffle=False, drop_last=False)
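
Each batch the loader yields is a pair of pointers that still live on one of the remote workers; data.location tells us which one, and this is how the training loop below picks the matching optimizer. A quick sketch to peek at the first batch (the worker id you see depends on the dataset order):


In [ ]:
# The loader yields pointer tensors; the batch data itself never leaves the remote worker
data_ptr, target_ptr = next(iter(train_loader))
print(data_ptr.location.id, data_ptr.shape)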

And finally we iterate over epochs. You can see how similar this is to pure, local PyTorch training!
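
For reference, here is roughly what the purely local equivalent would look like with the standard torch.utils.data classes (a minimal sketch on synthetic tensors, not part of the federated example):


In [ ]:
from torch.utils.data import TensorDataset, DataLoader

# Purely local counterpart: data, loader and training step all live on this machine
local_data = th.randn(100, n_features)
local_target = th.randn(100)
local_loader = DataLoader(TensorDataset(local_data, local_target), batch_size=32)

local_model = th.nn.Linear(n_features, n_targets)
local_optim = th.optim.Adam(local_model.parameters(), lr=1e-2)
for data, target in local_loader:
    local_optim.zero_grad()
    loss = ((local_model(data).view(-1) - target) ** 2).mean()
    loss.backward()
    local_optim.step()

With that in mind, here is the federated training loop: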


In [ ]:
for epoch in range(1, epochs + 1):
    loss_accum = 0
    for batch_idx, (data, target) in enumerate(train_loader):
        # Send the model to the worker which holds the current batch
        model.send(data.location)

        # Use this worker's optimizer for the remote update
        optimizer = optimizers[data.location.id]
        optimizer.zero_grad()
        pred = model(data)
        loss = ((pred.view(-1) - target)**2).mean()
        loss.backward()
        optimizer.step()

        # Bring the updated model and the loss value back to the local worker
        model.get()
        loss = loss.get()

        loss_accum += float(loss)

        if batch_idx % 8 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tBatch loss: {:.6f}'.format(
                epoch, batch_idx, len(train_loader),
                100. * batch_idx / len(train_loader), loss.item()))

    print('Total loss', loss_accum)
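
Because every step ends with model.get(), the trained model is back on the local worker and behaves like any ordinary PyTorch module. A small sketch of inspecting it (the zero input is synthetic, only there to illustrate a local forward pass):


In [ ]:
# The model is local again after training, so regular PyTorch calls work
print(model.weight.data.shape)    # learned weights: [n_targets, n_features]
sample = th.zeros(1, n_features)  # synthetic input, only to demonstrate a local prediction
print(model(sample))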

Congratulations!!! - Time to Join the Community!

Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the movement toward privacy preserving, decentralized ownership of AI and the AI supply chain (data), you can do so in the following ways!

Star PySyft on GitHub

The easiest way to help our community is just by starring the Repos! This helps raise awareness of the cool tools we're building.

Join our Slack!

The best way to keep up to date on the latest advancements is to join our community! You can do so by filling out the form at http://slack.openmined.org

Join a Code Project!

The best way to contribute to our community is to become a code contributor! At any time you can go to the PySyft GitHub Issues page and filter for "Projects". This will show you all the top-level tickets, giving an overview of which projects you can join! If you don't want to join a project but would like to do a bit of coding, you can also look for more "one off" mini-projects by searching for GitHub issues marked "good first issue".

If you don't have time to contribute to our codebase, but would still like to lend support, you can also become a Backer on our Open Collective. All donations go toward our web hosting and other community expenses such as hackathons and meetups!

OpenMined's Open Collective Page


In [ ]: