pods
The GPy toolbox provides easy access to various data sets. The data sets are accessed through calls to the GPy.util.datasets module. This module contains functions which download and preprocess the data for you, presenting it in a dictionary with a standard format for use in GPy. On download you will also be informed of any licensing restrictions and relevant citations for the data. The data is then cached on your local drive.
Here are some descriptions of the data sets in different notebooks.
pods.datasets.google_trends() download data from Google Trends. Based on an original notebook by sahuguet.
pods.datasets.airline_delay() airline delay data set used for the Gaussian Processes for Big Data paper by Hensman, Fusi & Lawrence.
pods.datasets.mauna_loa() download the latest version of the Mauna Loa data set. This is data giving carbon dioxide concentrations from the Mauna Loa observatory in Hawaii.
pods.datasets.spellman_yeast() download the yeast cell cycle gene expression data set from Spellman et al. (1998).
pods.datasets.lee_yeast() download the yeast connectivity data from Lee et al. (2002).
pods.datasets.ceres() celestial observations of the dwarf planet Ceres by Giuseppe Piazzi.
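Since each loader returns a dictionary, a quick way to see what a data set contains is to summarise its keys. The following is a minimal sketch, not part of pods itself: the helper names `describe` and `load_and_describe`, the stand-in loader, and the key names `X`, `Y` and `citation` are assumptions made for illustration, so check the dictionary a real loader returns before relying on any particular key.

```python
def describe(data):
    """Map each key of a data-set dictionary to a short description of its value."""
    summary = {}
    for key, value in data.items():
        # Array-like values (e.g. NumPy arrays) expose a .shape attribute.
        shape = getattr(value, "shape", None)
        if shape is not None:
            summary[key] = f"array{tuple(shape)}"
        else:
            summary[key] = type(value).__name__
    return summary


def load_and_describe(loader):
    """Call a zero-argument loader and describe the dictionary it returns."""
    return describe(loader())


class FakeArray:
    """Hypothetical stand-in for an array, so the sketch runs without downloads."""

    def __init__(self, shape):
        self.shape = shape


def fake_loader():
    # Hypothetical dictionary; the key names are assumptions, not a documented format.
    return {"X": FakeArray((5, 1)), "Y": FakeArray((5, 1)), "citation": "placeholder"}


print(load_and_describe(fake_loader))
```

With pods installed, a real loader such as `pods.datasets.mauna_loa` could be passed in place of `fake_loader`; the first call may trigger a download and a licence notice, as described above.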