Copyright 2015 Enthought, Inc. All Rights Reserved

Solar power and cloudy skies

In this exercise, we compare the distribution of solar power under clear skies with the distribution under cloudy skies.

Concepts: missing data, groupby

Part 1 - missing data

First of all, we need to deal with missing data in the dataset.

  1. Execute the code below to read the data from disk. The table contains the date and time of each measurement, the solar power in W, and a flag indicating whether the sky is cloudy.
  2. Drop all rows where the cloudy flag is missing, since we cannot use them in the analysis.
  3. Interpolate the missing values in the power column, since power varies relatively smoothly. (Hint: look at the options for the method keyword of the interpolate method to find an appropriate interpolation function.)

In [1]:
%matplotlib inline

import pandas as pd
# Load the data and plot its columns.
solar = pd.read_csv('solar.csv', index_col=0, parse_dates=True)
solar.plot(subplots=True, figsize=(15, 5));



In [ ]:
# Your code goes here!
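
As a rough guide to the steps above, here is a minimal sketch on a small made-up table rather than on solar.csv. The column names 'power' and 'cloudy' and all the values are assumptions chosen for illustration, and method='time' is only one of several interpolation options that could be appropriate here.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real table. The column names and values are
# hypothetical; the columns in solar.csv may be named differently.
index = pd.to_datetime([
    '2015-01-01 10:00', '2015-01-01 11:00', '2015-01-01 12:00',
    '2015-01-01 13:00', '2015-01-01 14:00', '2015-01-01 15:00',
])
toy = pd.DataFrame(
    {'power': [100.0, np.nan, 300.0, 400.0, np.nan, 600.0],
     'cloudy': [0.0, 0.0, np.nan, 1.0, 1.0, 1.0]},
    index=index,
)

# Drop only the rows where the flag is missing; rows with a missing
# power value are kept so they can be interpolated afterwards.
clean = toy.dropna(subset=['cloudy'])

# Fill the gaps in the power column. method='time' weights neighbouring
# values by the time elapsed between samples (needs a DatetimeIndex).
clean = clean.assign(power=clean['power'].interpolate(method='time'))

print(clean)
```

Because the 12:00 row is dropped first, the 11:00 gap is filled from the 10:00 and 13:00 samples, which are three hours apart. A time-aware method accounts for that uneven spacing, whereas the default linear method treats all rows as equally spaced.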

Part 2 - power on cloudy days

  1. Group the data by the cloudy flag.
  2. Compute the mean and standard deviation of each group, in two separate commands.
  3. In a single command, create a new DataFrame with one column for the mean and one for the standard deviation.

In [ ]:
# Your code goes here!
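
The three steps above might look like the following sketch, again on a small made-up table. The column names 'power' and 'cloudy' and the values are assumptions for illustration, and passing a list of function names to agg is one possible way to obtain both statistics in a single command.

```python
import pandas as pd

# Toy data with hypothetical column names and values.
toy = pd.DataFrame({
    'power': [100.0, 200.0, 300.0, 10.0, 30.0, 50.0],
    'cloudy': [0, 0, 0, 1, 1, 1],
})

# 1. Group the power values by the cloudy flag.
grouped = toy.groupby('cloudy')['power']

# 2. Mean and standard deviation, in two separate commands.
means = grouped.mean()
stds = grouped.std()

# 3. Both statistics in one command: agg with a list of function names
#    returns a DataFrame with one column per statistic.
summary = grouped.agg(['mean', 'std'])

print(summary)
```

Note that std computes the sample standard deviation (ddof=1) by default.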