Description of Datasets Available in pods

Open Data Science Initiative

4th June 2014 Neil D. Lawrence

The pods toolbox provides easy access to various data sets through calls to the pods.datasets module. This module contains functions that download and preprocess the data for you and return it as a dictionary in a standard format for use in GPy. On download you are also informed of any licensing restrictions and relevant citations for the data. The data is then cached on your local drive.
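The download-then-cache behaviour described above can be illustrated with a small sketch. This is not the toolbox's actual implementation: `cached_load`, `fake_fetch`, the pickle file layout and the dictionary keys are all hypothetical, chosen only to show the general pattern of fetching once and reading from local disk afterwards.

```python
import os
import pickle
import tempfile


def cached_load(name, fetch, cache_dir):
    """Return data set `name`, calling `fetch` only if no cached copy exists.

    Hypothetical helper for illustration; not part of any library.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, name + ".pkl")
    if os.path.exists(path):
        # Cache hit: read the previously stored dictionary from disk.
        with open(path, "rb") as f:
            return pickle.load(f)
    # Cache miss: fetch the data, then store it for next time.
    data = fetch()
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return data


calls = []


def fake_fetch():
    """Stand-in for a real download; records each call so we can count them."""
    calls.append(1)
    return {"X": [[1.0], [2.0]], "Y": [[0.5], [0.7]],
            "citation": "hypothetical citation text"}


cache_dir = tempfile.mkdtemp()
first = cached_load("toy", fake_fetch, cache_dir)   # triggers the "download"
second = cached_load("toy", fake_fetch, cache_dir)  # served from the cache
```

The second call returns the same dictionary without invoking `fake_fetch` again, which is the behaviour the cache is meant to provide.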

Here are some descriptions of the data sets in different notebooks.

Count Data

Regression Data

  • pods.datasets.airline_delay() loads the airline delay data set used in the paper "Gaussian Processes for Big Data" by Hensman, Fusi & Lawrence.
  • pods.datasets.mauna_loa() downloads the latest version of the Mauna Loa data set, which gives carbon dioxide concentrations measured at the Mauna Loa observatory in Hawaii.
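As a rough sketch of how one of these regression data sets might be inspected once loaded: the helper below assumes the returned dictionary stores inputs under 'X' and targets under 'Y' as two-dimensional arrays with one column each, which is an assumption about the standard format rather than something stated here. `summarize` and `load_mauna_loa` are hypothetical names; the latter needs pods installed and a network connection, so the sketch demonstrates the helper on a small made-up dictionary instead.

```python
def summarize(data):
    """Summarize a regression data set dictionary.

    Assumes data['X'] and data['Y'] are row-indexed with a single column;
    the key names are an assumption for illustration.
    """
    X, Y = data["X"], data["Y"]
    xs = [float(row[0]) for row in X]
    ys = [float(row[0]) for row in Y]
    return {
        "n": len(xs),                   # number of observations
        "x_range": (min(xs), max(xs)),  # span of the input variable
        "y_mean": sum(ys) / len(ys),    # mean of the target variable
    }


def load_mauna_loa():
    """Load the Mauna Loa data; requires pods and network access."""
    import pods  # imported lazily so the sketch runs without pods installed
    return pods.datasets.mauna_loa()


# Made-up values standing in for a downloaded data set.
toy = {"X": [[2000.0], [2001.0], [2002.0]],
       "Y": [[369.0], [371.0], [373.0]]}
summary = summarize(toy)
```

With pods available, `summarize(load_mauna_loa())` would apply the same summary to the real data, provided the key assumption above holds.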

Other Data

Adding a Data Set
