pandas.DataFrame has a fillna function which is highly useful as a complement to outer joins. gurobipy doesn't play as nicely as it could with this function, because it won't allow a Var to be the argument to fillna.
This is particularly relevant here because gurobipy also has the toehold problem (discussed in the notebook in the parent directory to this one).
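To make the role of fillna concrete, here is a minimal pandas-only sketch of an outer join followed by a fill. The frame names left and right are illustrative assumptions, and the fill value 0 is just a plain number standing in for whatever default is wanted.

```python
import pandas as pd

# Two small frames indexed by (a, b); only some index pairs overlap.
left = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "data": [1, 12, 19]})
left = left.set_index(["a", "b"])
right = pd.DataFrame({"a": [1, 2, 5], "b": [4, 5, 2], "data": [1.1, 12.1, 19]})
right = right.set_index(["a", "b"])

# The outer join keeps every index pair from both sides and marks
# the cells that have no counterpart as NaN.
joined = left.join(right, how="outer", rsuffix="_2")

# fillna replaces those NaN cells with a default value.
filled = joined.fillna(0)
print(filled)
```

The pairs (3, 6) and (5, 2) each appear in only one frame, so each contributes one missing cell that fillna then replaces.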
In [1]:
from gurobipy import *
import pandas as pd
m = Model()
v = m.addVar()
m.update()
df = pd.DataFrame({"a":[1, 2, 3],
                   "b":[4, 5, 6],
                   "data":[1, 12, 19]})
df.set_index(["a", "b"], inplace=True)
df2 = pd.DataFrame({"a":[1, 2, 5], "b" : [4, 5, 2], "data":[1.1, 12.1, 19]})
df2.set_index(["a", "b"], inplace=True)
OK, setup finished. Here is the buggy step.
In [2]:
df.join(df2, how="outer", rsuffix="_2").fillna(v)
I'm inclined to think of this as something gurobipy can fix. Note that I can combine fillna with new-style classes just fine (see below).
In [3]:
class MyClass(object):
    def __repr__(self):
        return "<I made this class>"
mc = MyClass()
In [4]:
df.join(df2, how="outer", rsuffix="_2").fillna(mc)
Out[4]:
However, the most likely usage pattern is one where I want not a variable but a gurobipy-friendly proxy for zero. Luckily, that does work!
In [5]:
df.join(df2, how="outer", rsuffix="_2").fillna(quicksum([]))
Out[5]:
This is the workaround I use in the netflowpandasmodel.py file in this directory.
It also leads to a more general workaround: use a linear expression equivalent to the variable itself. That is, you can always call fillna(2*v-v) instead of fillna(v).
In [6]:
df.join(df2, how="outer", rsuffix="_2").fillna(2*v-v)
Out[6]:
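If no equivalent linear expression is convenient, another possible workaround is to bypass fillna altogether and replace missing cells element by element. The sketch below is an assumption, not something taken from netflowpandasmodel.py: fill_missing_with and Placeholder are hypothetical names, and Placeholder merely stands in for an object such as a Var so that the example runs without gurobipy. I have not checked it against an actual Var.

```python
import pandas as pd

class Placeholder(object):
    """Hypothetical stand-in for an object that fillna might reject."""
    def __repr__(self):
        return "<placeholder>"

def fill_missing_with(frame, filler):
    """Return a copy of frame with every missing cell replaced by filler.

    Works cell by cell, so filler is never inspected or converted;
    the resulting columns end up with object dtype.
    """
    def fill_cell(x):
        return filler if pd.isna(x) else x
    return frame.apply(lambda col: col.map(fill_cell))

left = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "data": [1, 12, 19]})
left = left.set_index(["a", "b"])
right = pd.DataFrame({"a": [1, 2, 5], "b": [4, 5, 2], "data": [1.1, 12.1, 19]})
right = right.set_index(["a", "b"])

p = Placeholder()
joined = left.join(right, how="outer", rsuffix="_2")
filled = fill_missing_with(joined, p)
print(filled)
```

The cost of this approach is that it runs a Python-level function on every cell, so it is likely to be slower than fillna on large frames.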