CHAPTER 5

The Recommender

Now that we have gotten to know bestPy's powerful algorithms, we can't wait to use them, right? In trying to do so, however, we might realize that they are pretty bare-bones and inconvenient to handle. For example, we need to know the internally used integer index of a customer to get a prediction for him/her, instead of simply passing the customer ID. Likewise, we only get back an array of scores, one for each article, and still have to search for the most highly recommended one, still have to translate its index into an actual article ID, etc.
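
To make this burden concrete, here is a minimal sketch of the bookkeeping we would otherwise do by hand. It uses plain Python with made-up data. The dictionaries, the scores, and the helper top_articles are hypothetical stand-ins, not bestPy internals.

```python
# Hypothetical stand-ins for the internal index mappings (not bestPy's data structures).
index_of_customer = {'4': 0, '7': 1}
article_id_of_index = {0: 'A-100', 1: 'B-200', 2: 'C-300', 3: 'D-400'}

# Made-up scores per customer index, one score per article index.
scores_for = {
    0: [0.1, 0.7, 0.3, 0.9],
    1: [0.5, 0.2, 0.8, 0.4],
}


def top_articles(customer_id, how_many):
    """Translate ID -> index, rank article indices by score, translate back to IDs."""
    customer_index = index_of_customer[customer_id]
    scores = scores_for[customer_index]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [article_id_of_index[i] for i in ranked[:how_many]]


print(top_articles('4', 2))  # ['D-400', 'B-200']
```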

To take all this burden off the user, who should focus on selecting and tweaking the algorithms, there is The Recommender.

Preliminaries

We only need this because the examples folder is a subdirectory of the bestPy package.


In [1]:
import sys
sys.path.append('../..')

Imports, logging, and data

On top of the basics, we still import the Baseline and TruncatedSVD algorithms as examples, but now focus on The Recommender, which is accessible in the top-level package as RecoBasedOn.


In [2]:
from bestPy import RecoBasedOn, write_log_to  # Additionally import RecoBasedOn
from bestPy.datastructures import Transactions
from bestPy.algorithms import Baseline, TruncatedSVD  # Import Baseline and TruncatedSVD as example algorithms

logfile = 'logfile.txt'
write_log_to(logfile, 20)  # Logging level 20 means INFO

file = 'examples_data.csv'
data = Transactions.from_csv(file)

Creating a new RecoBasedOn object

We will see different ways of doing this further down but, for now, all we need is data in the form of a Transactions instance.


In [3]:
recommendation = RecoBasedOn(data)

Parameters of The Recommender object

Inspecting the new recommendation object with Tab completion reveals an algorithm attribute as the first entry.


In [4]:
recommendation.algorithm


Out[4]:
'CollaborativeFiltering'

This is the default algorithm.

IMPORTANT: If we want a different algorithm, say truncated SVD, we do not simply set the attribute. Instead, we call the method using(), like so:


In [5]:
algorithm = TruncatedSVD()
algorithm.number_of_factors = 24
algorithm.binarize = False

recommendation = recommendation.using(algorithm)
recommendation.algorithm


Out[5]:
'TruncatedSVD'

No need to first attach data to the algorithm. The recommender does that for us.


In [6]:
algorithm.has_data


Out[6]:
True

Next up is the baseline attribute. As you might expect, it tells us that the Baseline algorithm is part of The Recommender.


In [7]:
recommendation.baseline


Out[7]:
'Baseline'

We need it in order to also make recommendations to new customers, who do not have a purchase history yet. As opposed to the algorithm, the baseline can simply be set, as expected.


In [8]:
recommendation.baseline = Baseline()
recommendation.baseline


Out[8]:
'Baseline'

Finally, we have a set of attributes starting with keeping_old. It tells the recommender not to filter out articles already purchased by the customer we are making a recommendation for but, on the contrary, to allow recommending them back to him/her if the algorithm says we should. To dial in this behavior of The Recommender, we access the attribute and reassign the result, in a manner similar to the using() method.


In [9]:
recommendation = recommendation.keeping_old

If we wanted to know whether or not only new articles will be recommended (as opposed to also articles that a given customer already bought), we simply inspect the only_new attribute.


In [10]:
recommendation.only_new


Out[10]:
False

Evidently, it is now False. Finally, if we wanted to change the behavior of The Recommender to recommending only new articles, thus discarding already bought articles, we invoke the remaining attribute pruning_old like so:


In [11]:
recommendation = recommendation.pruning_old
recommendation.only_new


Out[11]:
True
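
Conceptually, the difference between the two settings boils down to a filter on the ranked articles. The following sketch uses invented data and a hypothetical function recommend. It illustrates the idea only and makes no claim about how bestPy implements it.

```python
def recommend(ranked_articles, already_bought, only_new):
    """Return the ranked articles, optionally dropping those already bought."""
    if only_new:
        return [a for a in ranked_articles if a not in already_bought]
    return list(ranked_articles)


ranked = ['A-100', 'B-200', 'C-300', 'D-400']  # best first, made-up IDs
bought = {'B-200', 'D-400'}

print(recommend(ranked, bought, only_new=False))  # keeping old: all four articles
print(recommend(ranked, bought, only_new=True))   # pruning old: ['A-100', 'C-300']
```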

NOTE: You may wonder why the method using() and the attributes keeping_old and pruning_old are used in this somewhat odd fashion. The idea is that you can chain all of them together in a single, elegant line of code that reads almost like a sentence in natural language.


In [12]:
recommendation = RecoBasedOn(data).using(algorithm).pruning_old
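
One common way to make such chaining possible is to let both the method and the properties return the object itself. The toy class below is a sketch of that pattern under this assumption. FluentReco and its internals are hypothetical and need not mirror bestPy's implementation.

```python
class FluentReco:
    """Toy class showing a chainable method and chainable properties."""

    def __init__(self, data):
        self.data = data
        self.algorithm = 'default'
        self.only_new = False

    def using(self, algorithm):
        self.algorithm = algorithm
        return self  # returning self is what allows chaining

    @property
    def pruning_old(self):
        self.only_new = True
        return self

    @property
    def keeping_old(self):
        self.only_new = False
        return self


reco = FluentReco('some data').using('svd').pruning_old
print(reco.algorithm, reco.only_new)  # svd True
```

Because every step hands back the same object, the order of using() and pruning_old in the chain does not matter in this sketch.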

And that's it with the parameters.

Making a recommendation for a target customer

Surely you have already realized that The Recommender also has a for_one() method, just like our algorithms. Indeed, it also provides recommendations for a given customer but, this time, in a much more convenient form. Specifically, it

  • accepts a customer ID rather than the internally used integer index as argument;
  • sorts the articles by their score and returns only the top-most hits;
  • allows us to specify how many of these we want to have;
  • returns actual article IDs rather than just their internally used integer indices.

More specifically, it returns a Python generator, which needs to be consumed to actually access the recommended article IDs, like so:


In [13]:
customer = '4'  # Now a string ID

top_six = recommendation.for_one(customer, 6)
for article in top_six:
    print(article)


JI388SP87HBCANID-41358
AP082EL35CPWALID-1764
BL152EL67KCUALID-6832
SA848EL50XMNANID-34082
BL232EL84TPFANID-31224
CA189EL42IJPALID-5657
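
Because a generator can be iterated over only once, it helps to remember a few standard ways of consuming it. The sketch below uses a stand-in generator function fake_for_one (hypothetical, with made-up IDs) in place of the real for_one().

```python
from itertools import islice


def fake_for_one(customer, how_many):
    """Hypothetical stand-in yielding made-up article IDs, best first."""
    catalog = ['A-100', 'B-200', 'C-300', 'D-400', 'E-500']
    for article in catalog[:how_many]:
        yield article


top_three = list(fake_for_one('4', 3))             # materialize everything as a list
first_two = list(islice(fake_for_one('4', 5), 2))  # take only the first two items

exhausted = fake_for_one('4', 3)
first_pass = list(exhausted)
second_pass = list(exhausted)  # empty: the generator is already used up

print(top_three, first_two, second_pass)
```

If the recommended IDs are needed more than once, storing them in a list right away avoids the empty second pass.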

And, voilà, your recommendation. Again, obvious misuse, like asking for more recommendations than there are articles, is discreetly corrected. Try, for instance, the following request


In [14]:
all_articles = recommendation.for_one(customer, 8300)

and all you get is an entry in the logfile.

[WARNING ]: Requested 8300 recommendations but only 8255 available. Returning all 8255. (recommender| ...
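
The behavior described here amounts to clamping the requested number and logging a warning. A minimal sketch with Python's standard logging module follows. The function capped and the message wording are illustrative assumptions, not bestPy's code.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger('sketch')


def capped(requested, available):
    """Return how many items can be delivered, warning if fewer than requested."""
    if requested > available:
        log.warning('Requested %d recommendations but only %d available.',
                    requested, available)
        return available
    return requested


print(capped(8300, 8255))  # 8255
print(capped(6, 8255))     # 6
```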

Thanks to the baseline, handling new customers is no problem.


In [15]:
newbie = 'new customer'

top_three = recommendation.for_one(newbie, 3)
for article in top_three:
    print(article)


KI593EL69ASKANID-36520
NE739EL06ORLANID-27491
KI593EL68ASLANID-36521

Provided you set the logging level to 20 (meaning INFO), you will be notified of this feat with the message:

[INFO    ]: Unknown target user. Defaulting to baseline recommendation. (recommender|__cold_start)
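
The fallback logic can be pictured as a simple membership test on the known customers. The sketch below is an illustration under that assumption. The names personalized, baseline_ranking, and this toy for_one, as well as all data, are invented.

```python
import logging

assert logging.INFO == 20  # level 20 in the standard library is INFO

logging.basicConfig(level=logging.INFO)
log = logging.getLogger('sketch')

# Made-up data: per-customer rankings and a generic fallback ranking.
personalized = {'4': ['A-100', 'B-200', 'C-300']}
baseline_ranking = ['X-900', 'Y-800', 'Z-700']


def for_one(customer, how_many):
    """Use the personalized ranking if the customer is known, else the baseline."""
    if customer in personalized:
        ranking = personalized[customer]
    else:
        log.info('Unknown target user. Defaulting to baseline recommendation.')
        ranking = baseline_ranking
    return ranking[:how_many]


print(for_one('4', 2))             # ['A-100', 'B-200']
print(for_one('new customer', 2))  # ['X-900', 'Y-800']
```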

Tweaking the algorithm

It is important to note that, if you want to change the parameters of the algorithm and get a new recommendation based on these new parameters, you can simply do so. You do not need to instantiate a new RecoBasedOn object, nor do you need to re-attach the changed algorithm to an existing RecoBasedOn instance. Simply do


In [16]:
algorithm.number_of_factors = 10

top_six = recommendation.for_one(customer, 6)
for article in top_six:
    print(article)


SO888EL87TSYANID-31328
AP082EL25AQGANID-36440
KI593EL68ASLANID-36521
SO888EL88TSXANID-31327
KI593EL69ASKANID-36520
CA189EL61JZEALID-6738

and witness how the recommended articles changed. This then concludes our presentation of The Recommender.
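
One plausible reason this works is that the recommender keeps a reference to the very same algorithm object rather than a copy, so later attribute changes are visible to it. The toy classes below sketch that mechanism. ToyAlgorithm and ToyRecommender are hypothetical and not bestPy classes.

```python
class ToyAlgorithm:
    def __init__(self):
        self.number_of_factors = 24


class ToyRecommender:
    def __init__(self, algorithm):
        self.algorithm = algorithm  # stores a reference, not a copy

    def describe(self):
        return f'using {self.algorithm.number_of_factors} factors'


algorithm = ToyAlgorithm()
recommender = ToyRecommender(algorithm)
print(recommender.describe())  # using 24 factors

algorithm.number_of_factors = 10  # change the original object ...
print(recommender.describe())     # ... and the recommender sees it: using 10 factors
```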

