Now that we have gotten to know bestPy's powerful algorithms, we can't wait to use them, right? In trying to do so, however, we might realize that they are pretty bare-bones and inconvenient to handle. For example, we need to know the internally used integer index of a customer to get a prediction for them, instead of simply passing the customer ID. Likewise, we only get back an array of scores, one per article, and still have to search for the most highly recommended article, translate its index into an actual article ID, etc.
Taking all this burden off the user, who should focus on selecting and tweaking the algorithms, there is The Recommender.
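To make that burden concrete, here is a minimal sketch of the bookkeeping one would otherwise do by hand. It does not use bestPy at all: the function `top_article_ids`, the lists `scores` and `article_ids`, and all numbers are hypothetical and made up for illustration.

```python
import heapq

def top_article_ids(scores, article_ids, how_many):
    """Return the external IDs of the `how_many` highest-scoring articles.

    `scores[i]` is the score of the article with internal index `i` and
    `article_ids[i]` is its external ID. Both are hypothetical stand-ins.
    """
    # Find the internal indices with the largest scores, best first.
    best_indices = heapq.nlargest(
        how_many, range(len(scores)), key=scores.__getitem__
    )
    # Translate internal indices back into external article IDs.
    return [article_ids[index] for index in best_indices]

scores = [0.1, 0.7, 0.3, 0.9]              # made-up scores, one per article
article_ids = ['A17', 'B42', 'C03', 'D88']  # made-up article IDs
top_two = top_article_ids(scores, article_ids, 2)
print(top_two)
```

This is exactly the kind of index juggling that a recommender object can hide from the user.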
We only need the following path tweak because the examples folder is a subdirectory of the bestPy package.
In [1]:
import sys
sys.path.append('../..')
In [2]:
from bestPy import RecoBasedOn, write_log_to # Additionally import RecoBasedOn
from bestPy.datastructures import Transactions
from bestPy.algorithms import Baseline, TruncatedSVD # Import Baseline and TruncatedSVD as exemplary algorithms
logfile = 'logfile.txt'
write_log_to(logfile, 20)
file = 'examples_data.csv'
data = Transactions.from_csv(file)
In [3]:
recommendation = RecoBasedOn(data)
In [4]:
recommendation.algorithm
Out[4]:
This is the default algorithm.
IMPORTANT: If we want a different algorithm, say truncated SVD, we do not simply set the attribute. Instead, we call the method using(), like so:
In [5]:
algorithm = TruncatedSVD()
algorithm.number_of_factors = 24
algorithm.binarize = False
recommendation = recommendation.using(algorithm)
recommendation.algorithm
Out[5]:
There is no need to first attach data to the algorithm. The recommender does that for us.
In [6]:
algorithm.has_data
Out[6]:
Next up is the baseline attribute. As one might expect, it tells us that a Baseline algorithm is part of The Recommender.
In [7]:
recommendation.baseline
Out[7]:
We need it in order to make recommendations to new customers as well, who do not have a purchase history yet. As opposed to the algorithm, the baseline can simply be set like an ordinary attribute.
In [8]:
recommendation.baseline = Baseline()
recommendation.baseline
Out[8]:
Finally, we have a set of attributes, starting with keeping_old. It tells the recommender not to filter out articles already purchased by the customer we are making a recommendation for but, on the contrary, to allow recommending them again if the algorithm says we should. To dial in this behavior of The Recommender, we access the attribute in a manner similar to calling the using() method.
In [9]:
recommendation = recommendation.keeping_old
If we want to know whether only new articles will be recommended (as opposed to also articles that a given customer has already bought), we simply inspect the only_new attribute.
In [10]:
recommendation.only_new
Out[10]:
Evidently, it is now False. Finally, if we want to change the behavior of The Recommender to recommending only new articles, thus discarding already bought ones, we invoke the remaining attribute, pruning_old, like so:
In [11]:
recommendation = recommendation.pruning_old
recommendation.only_new
Out[11]:
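The difference between keeping and pruning old articles can be illustrated with a small, self-contained sketch. It is not bestPy's internal code: the helper `prune_old`, the set `already_bought`, and the scores are hypothetical, and masking with negative infinity is just one assumed way to exclude purchased articles from the ranking.

```python
def prune_old(scores, purchased_indices):
    """Return a copy of `scores` where purchased articles can never rank first.

    Purchased articles get a score of -inf (an assumed masking strategy).
    """
    return [
        float('-inf') if index in purchased_indices else score
        for index, score in enumerate(scores)
    ]

scores = [0.1, 0.7, 0.3, 0.9]  # made-up scores, one per article index
already_bought = {1, 3}        # made-up indices of purchased articles

kept = scores                                # "keeping old": use scores as they are
pruned = prune_old(scores, already_bought)   # "pruning old": mask purchases

best_when_keeping = max(range(len(kept)), key=kept.__getitem__)
best_when_pruning = max(range(len(pruned)), key=pruned.__getitem__)
print(best_when_keeping, best_when_pruning)
```

With pruning, the best remaining article is the highest-scoring one the customer has not bought yet.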
NOTE: You may wonder why the method using() and the attributes keeping_old and pruning_old are called in a somewhat odd fashion. The idea behind this is that you can chain all these calls together in a single, elegant line of code that almost reads like a sentence in natural language.
In [12]:
recommendation = RecoBasedOn(data).using(algorithm).pruning_old
And that's it with the parameters.
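For readers curious how such chaining can work in general, here is a minimal sketch of the pattern. The class `FluentRecommender` is a hypothetical stand-in and not bestPy's RecoBasedOn; in particular, the default value of `only_new` is an arbitrary assumption. The trick is that the method and the properties each change the object's state and then return the object itself.

```python
class FluentRecommender:
    """Hypothetical stand-in that illustrates chaining, not bestPy's class."""

    def __init__(self, data):
        self.data = data
        self.algorithm = 'default'
        self.only_new = True  # assumed default, chosen for illustration

    def using(self, algorithm):
        # Store the algorithm and return self so further calls can be chained.
        self.algorithm = algorithm
        return self

    @property
    def keeping_old(self):
        # Accessing the property flips the flag and returns self.
        self.only_new = False
        return self

    @property
    def pruning_old(self):
        self.only_new = True
        return self

reco = FluentRecommender('some data').using('svd').keeping_old
print(reco.algorithm, reco.only_new)
```

Because every step hands back the same object, the result of the chain has to be assigned to a name only once, at the very end.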
Surely you have already realized that The Recommender also has a for_one() method, just like our algorithms. Indeed, it also provides recommendations for a given customer but, this time, in a much more convenient form. Specifically, it returns a Python generator, which needs to be consumed to actually access the recommended article IDs, like so:
In [13]:
customer = '4' # Now a string ID
top_six = recommendation.for_one(customer, 6)
for article in top_six:
    print(article)
And, voilà, your recommendation. Again, obvious misuse, like asking for more recommendations than there are articles, is discreetly corrected. Try, for instance, the following request:
In [14]:
all_articles = recommendation.for_one(customer, 8300)
and all you get is an entry in the logfile.
[WARNING ]: Requested 8300 recommendations but only 8255 available. Returning all 8255. (recommender| ...
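The following sketch imitates this behavior in plain Python. It is an assumption about how such a correction could be implemented, not bestPy's actual code: the function `recommend`, the logger name, and the article IDs are all hypothetical. It also shows a property of every Python generator that is easy to trip over: it can be consumed only once.

```python
import logging

logger = logging.getLogger('recommender_sketch')  # hypothetical logger name

def recommend(ranked_ids, how_many):
    """Return a generator over at most `how_many` IDs from `ranked_ids`."""
    available = len(ranked_ids)
    if how_many > available:
        # Quietly correct the request and leave a note in the log.
        logger.warning(
            'Requested %d recommendations but only %d available. '
            'Returning all %d.', how_many, available, available
        )
        how_many = available
    return (article for article in ranked_ids[:how_many])

ranked_ids = ['D88', 'B42', 'C03', 'A17']  # made-up IDs, best first
too_many = recommend(ranked_ids, 10)

first_pass = list(too_many)   # consumes the generator
second_pass = list(too_many)  # the generator is now exhausted
print(first_pass)
print(second_pass)
```

If the recommended IDs are needed more than once, it is therefore safest to store them in a list right away.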
Thanks to the baseline, handling new customers is no problem.
In [15]:
newbie = 'new customer'
top_three = recommendation.for_one(newbie, 3)
for article in top_three:
    print(article)
Provided you set the logging level to 20 (meaning INFO), you will be notified of this feat with the message:
[INFO ]: Unknown target user. Defaulting to baseline recommendation. (recommender|__cold_start)
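Conceptually, this cold-start handling boils down to a simple fallback. The sketch below is a hypothetical illustration rather than bestPy's implementation: the function `scores_for`, the dictionary `personalized`, and the numbers standing in for whatever the baseline computes are all assumptions.

```python
def scores_for(customer, personalized, baseline_scores):
    """Return personalized scores for known customers, else the baseline."""
    if customer in personalized:
        return personalized[customer]
    # Unknown customer: no purchase history, so fall back to the baseline.
    return baseline_scores

personalized = {'4': [0.1, 0.7, 0.3]}  # made-up scores for a known customer
baseline_scores = [5, 2, 9]            # made-up customer-independent scores

known = scores_for('4', personalized, baseline_scores)
unknown = scores_for('new customer', personalized, baseline_scores)
print(known)
print(unknown)
```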
It is important to note that, if you want to change the parameters of the algorithm and get a new recommendation based on these new parameters, you can simply do so. You do not need to instantiate a new RecoBasedOn object, nor do you need to re-attach the changed algorithm to an existing RecoBasedOn instance. Nothing of that sort. Simply do
In [16]:
algorithm.number_of_factors = 10
top_six = recommendation.for_one(customer, 6)
for article in top_six:
    print(article)
and witness how the recommended articles changed. This then concludes our presentation of The Recommender.
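One plausible explanation for why changed parameters take effect immediately is that the recommender holds a reference to the algorithm object rather than a copy. This is an assumption about the design, not something stated above, and the classes `Algorithm` and `Recommender` below are hypothetical stand-ins that only illustrate the general Python behavior.

```python
class Algorithm:
    """Hypothetical algorithm with a single tunable parameter."""

    def __init__(self):
        self.number_of_factors = 24

class Recommender:
    """Hypothetical recommender that keeps a reference to its algorithm."""

    def __init__(self, algorithm):
        self.algorithm = algorithm  # a reference to the same object, not a copy

    def describe(self):
        return f'recommending with {self.algorithm.number_of_factors} factors'

algorithm = Algorithm()
recommender = Recommender(algorithm)

before = recommender.describe()
algorithm.number_of_factors = 10  # change the parameter on the original object
after = recommender.describe()    # the recommender sees the change
print(before)
print(after)
```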