Problems with the current design:

* If we iterate through appliances when training disaggregation algorithms, we'll end up training on the same meter several times, labelled as a different appliance each time.
* When we ask for the power series of an appliance, I think it's not sufficient to give a warning that the meter measures multiple appliances; instead we should return something like a PowerStrip object.

    >>> electric['toaster']
    PowerStrip(dominant=Appliance(type='toaster', instance=1), others=[Appliance(type='sandwich maker', instance=1)])
    >>> electric['fridge']
    Appliance(type='fridge', instance=1)

or

    PowerStrip(dominant=Appliance(type='fridge', instance=1))

The latter feels better because then we can simplify life by iterating through all powerstrips and, on each iteration, only getting powerstrip.dominant. The PowerStrip class would contain Appliances.

But what if there is no dominant appliance? (UK-DALE will be the only dataset with this information.)

    >>> electric['fridge']
    PowerStrip(dominant=None, others=[Appliance('fridge')])

How do we handle functions like on_duration? Different appliances on the powerstrip could have different on_power_thresholds, and so could be distinguishable, but this will rarely be the case. So maybe functions like on_duration should just take the minimum on_power_threshold from all appliances and return a single on_duration for the entire powerstrip?

How do we give a PowerStrip an ID? I suppose each appliance no longer needs a dataset and a building? e.g.:

    PowerStrip(dataset=REDD, building=1, appliances=[(fridge,1), (kettle,1)], dominant=(fridge,1))

OR... get rid of the PowerStrip idea. Instead have get_leaf_meters. And we handle dual-supply at the meter level (i.e. a single meter object with two dataframes, one for each physical meter... which slightly breaks things).

    Meter(dataset=REDD, building=1, appliances=[(fridge,1), (kettle,1)], dominant=(fridge,1))

That is simpler but perhaps feels a little conceptually "dirty". Or maybe not.

Then, do we ask the meter for its on_duration, activity_distribution, proportion_of_energy etc? I guess that's OK. It means the Appliance class becomes quite simple (it's basically just a wrapper around a dict of metadata).

What about when we select based on metadata? I guess we change ApplianceGroup to MeterGroup, and hence select meters, based I guess on either the dominant appliance (if there is one) or on the "average". What if there isn't an "average" (i.e. all appliances are in a different group)? Then don't add that meter? Warn?

OK, so here's the new design:

* Appliance gets a lot of responsibility taken away and put into EMeter
* EMeter will have a subclass for EMeterDualSupply
* EMeter will own a list of appliances
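A rough sketch of how that design might look in Python. This is only an illustration, not the actual implementation: the field names, the default threshold, the `label()` helper (including its fallback to a sole appliance) and the `supply_ids` field are all assumptions made up for the example. The only ideas taken from the notes are that EMeter owns the appliances, that a dual-supply meter is a subclass, and that the meter uses the minimum on_power_threshold of its appliances.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Appliance:
    """Thin wrapper around appliance metadata."""
    type: str
    instance: int = 1
    on_power_threshold: float = 10.0  # watts; hypothetical default


@dataclass
class EMeter:
    """A meter that owns the list of appliances it measures."""
    dataset: str
    building: int
    appliances: List[Appliance] = field(default_factory=list)
    dominant: Optional[Appliance] = None

    def on_power_threshold(self) -> float:
        # One threshold for the whole meter: the minimum over its appliances.
        if not self.appliances:
            raise ValueError("meter has no appliances")
        return min(a.on_power_threshold for a in self.appliances)

    def label(self) -> Optional[Appliance]:
        # Hypothetical helper: the appliance this meter would be trained as.
        # Falling back to a sole appliance is an assumption of this sketch.
        if self.dominant is not None:
            return self.dominant
        if len(self.appliances) == 1:
            return self.appliances[0]
        return None  # several appliances, none dominant: caller decides


@dataclass
class EMeterDualSupply(EMeter):
    """One logical meter backed by two physical meters."""
    supply_ids: List[int] = field(default_factory=list)  # hypothetical field


fridge = Appliance("fridge", 1, on_power_threshold=50.0)
kettle = Appliance("kettle", 1, on_power_threshold=2000.0)
meter = EMeter(dataset="REDD", building=1,
               appliances=[fridge, kettle], dominant=fridge)
print(meter.label().type, meter.on_power_threshold())
```

With this shape, training code would iterate over meters rather than appliances, so each meter is visited exactly once regardless of how many appliances hang off it.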

old notes:

ApplianceGroup.disaggregated_leaves(): need to think of a better name. Distinct leaves? Leaf meters? Separable appliances? Separate appliances? Submetered appliances?

Returns a list of ApplianceGroups, one per meter or dual supply appliance. Or maybe that's over complex. Basically we just need downstream meters, except for dual supply appliances where we need the two meters together.

Maybe I need a PowerStrip class to aggregate multiple appliances under one meter? No, don't think I need a new class... All the information we need is encoded in the structures we have; we just need to make sure we process and package it up appropriately. Specifically, we must not try to train a disaggregation algo to learn appliances which we say are separate but are actually the same meter. And ApplianceGroup needs to only count each meter in the group once.
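As a minimal sketch of the "count each meter once" requirement, the appliance labels can be grouped by the meter that measures them before any training happens. The function name `leaf_meters` and the plain-dict wiring table are hypothetical, chosen only for this example; a dual supply appliance could be represented by using a tuple of the two physical meter IDs as the key, which is also an assumption.

```python
def leaf_meters(appliance_to_meter):
    """Group appliance labels by meter so each meter appears exactly once.

    appliance_to_meter maps (type, instance) -> meter ID (hypothetical format).
    Returns a dict of meter ID -> list of (type, instance) labels.
    """
    meters = {}
    for appliance, meter_id in appliance_to_meter.items():
        meters.setdefault(meter_id, []).append(appliance)
    return meters


wiring = {
    ("toaster", 1): 5,
    ("sandwich maker", 1): 5,  # shares meter 5 with the toaster
    ("fridge", 1): 7,
}

for meter_id, appliances in leaf_meters(wiring).items():
    # Train once per meter, not once per appliance label.
    print(meter_id, appliances)
```

Here the toaster and sandwich maker collapse into a single training target, which avoids learning the same signal under two different labels.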


