Since we’ll be concatenating the embeddings, rather than taking their dot product, the two embedding matrices can have different sizes (i.e., different numbers of latent factors). fastai has a function, `get_emb_sz`, that returns recommended sizes for embedding matrices for your data, based on a heuristic that fast.ai has found tends to work well in practice:
Out[ ]:
[(944, 74), (1635, 101)]
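The sizes above come from `get_emb_sz`. As far as I know, its heuristic is `min(600, round(1.6 * n_cat**0.56))` applied to each variable's cardinality, but treat the rule name and the exact constants as an assumption about fastai's internals. A sketch that reproduces the numbers above:

```python
# Sketch of the embedding-size heuristic (assumed to match fastai's
# emb_sz_rule; the constants 1.6 and 0.56 are an assumption, not gospel).
def emb_sz_rule(n_cat):
    "Recommend an embedding width for a categorical variable with n_cat levels."
    return min(600, round(1.6 * n_cat ** 0.56))

# The MovieLens data here has 944 user IDs and 1635 movie IDs,
# which yields the sizes shown above:
sizes = [(n, emb_sz_rule(n)) for n in (944, 1635)]
print(sizes)  # -> [(944, 74), (1635, 101)]
```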
Let’s implement a class, `CollabNN`, that takes this approach:
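The cells that defined and instantiated the model aren't shown here. The architecture described below can be sketched in plain PyTorch (this is my reconstruction, not fastai's verbatim code; `sigmoid_range` is inlined as a helper):

```python
import torch
from torch import nn

def sigmoid_range(x, lo, hi):
    "Scale a sigmoid into the range (lo, hi), like fastai's helper."
    return torch.sigmoid(x) * (hi - lo) + lo

class CollabNN(nn.Module):
    def __init__(self, user_sz, item_sz, y_range=(0, 5.5), n_act=100):
        super().__init__()
        # One embedding per entity; the two widths may differ because we
        # concatenate the embeddings rather than take their dot product.
        self.user_factors = nn.Embedding(*user_sz)
        self.item_factors = nn.Embedding(*item_sz)
        # Mini neural net: concatenated embeddings -> hidden layer -> rating.
        self.layers = nn.Sequential(
            nn.Linear(user_sz[1] + item_sz[1], n_act),
            nn.ReLU(),
            nn.Linear(n_act, 1))
        self.y_range = y_range

    def forward(self, x):
        # x[:, 0] holds user indices, x[:, 1] holds movie indices.
        embs = self.user_factors(x[:, 0]), self.item_factors(x[:, 1])
        x = self.layers(torch.cat(embs, dim=1))
        return sigmoid_range(x, *self.y_range)

# Instantiate with the recommended sizes from above.
embs = [(944, 74), (1635, 101)]
model = CollabNN(*embs)
```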
`CollabNN` creates our `Embedding` layers in the same way as previous classes in this chapter, except that we now use the `embs` sizes. `self.layers` is identical to the mini-neural net we created in <> for MNIST. Then, in `forward`, we apply the embeddings, concatenate the results, and pass this through the mini-neural net. Finally, we apply `sigmoid_range` as we have in previous models.
Let’s see if it trains:
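The training cell isn't shown here. As a stand-in, here is a self-contained plain-PyTorch sketch that fits a tiny version of the same concatenated-embeddings architecture on synthetic ratings, just to show that this kind of model trains; every specific in it (sizes, data, hyperparameters) is invented for illustration:

```python
import torch
from torch import nn

torch.manual_seed(42)

# Tiny synthetic dataset: (user, movie) index pairs and ratings in [0.5, 5].
n_users, n_items = 50, 80
x = torch.stack([torch.randint(0, n_users, (512,)),
                 torch.randint(0, n_items, (512,))], dim=1)
y = torch.randint(1, 11, (512, 1)).float() / 2

class TinyCollabNN(nn.Module):
    "Same shape of model as CollabNN, inlined so this snippet runs on its own."
    def __init__(self, user_sz, item_sz, y_range=(0, 5.5), n_act=20):
        super().__init__()
        self.user_factors = nn.Embedding(*user_sz)
        self.item_factors = nn.Embedding(*item_sz)
        self.layers = nn.Sequential(
            nn.Linear(user_sz[1] + item_sz[1], n_act),
            nn.ReLU(),
            nn.Linear(n_act, 1))
        self.y_range = y_range

    def forward(self, xb):
        embs = self.user_factors(xb[:, 0]), self.item_factors(xb[:, 1])
        out = self.layers(torch.cat(embs, dim=1))
        lo, hi = self.y_range
        return torch.sigmoid(out) * (hi - lo) + lo

model = TinyCollabNN((n_users, 10), (n_items, 12))
opt = torch.optim.Adam(model.parameters(), lr=5e-3, weight_decay=0.01)
loss_func = nn.MSELoss()

# Simple full-batch loop standing in for a real training routine.
first_loss = None
for epoch in range(30):
    opt.zero_grad()
    loss = loss_func(model(x), y)
    loss.backward()
    opt.step()
    if first_loss is None:
        first_loss = loss.item()

print(f"loss {first_loss:.3f} -> {loss.item():.3f}")
```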
fastai provides this model in `fastai.collab` if you pass `use_nn=True` in your call to `collab_learner` (including calling `get_emb_sz` for you), and it lets you easily create more layers. For instance, here we’re creating two hidden layers, of size 100 and 50, respectively:
```python
learn = collab_learner(dls, use_nn=True, y_range=(0, 5.5), layers=[100,50])
```
| epoch | train_loss | valid_loss | time |
|---|---|---|---|
| 0 | 1.002747 | 0.972392 | 00:16 |
| 1 | 0.926903 | 0.922348 | 00:16 |
| 2 | 0.877160 | 0.893401 | 00:16 |
| 3 | 0.838334 | 0.865040 | 00:16 |
| 4 | 0.781666 | 0.864936 | 00:16 |
`learn.model` is an object of type `EmbeddingNN`. Let’s take a look at fastai’s code for this class:
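fastai's actual source isn't reproduced here. The class can be reconstructed from the description that follows, written against a bare stand-in `TabularModel` so that this snippet runs on its own (both the stand-in and the exact `EmbeddingNN` signature are assumptions, not fastai's verbatim code):

```python
class TabularModel:
    "Stand-in for fastai's TabularModel, reduced to its constructor arguments."
    def __init__(self, emb_szs, layers, n_cont, out_sz, y_range=None):
        self.emb_szs, self.layers = emb_szs, layers
        self.n_cont, self.out_sz, self.y_range = n_cont, out_sz, y_range

class EmbeddingNN(TabularModel):
    "Reconstruction: same __init__ as TabularModel, with n_cont=0 and"
    "out_sz=1 hard-wired, and everything else forwarded via **kwargs."
    def __init__(self, emb_szs, layers, **kwargs):
        super().__init__(emb_szs, layers=layers, n_cont=0, out_sz=1, **kwargs)

m = EmbeddingNN([(944, 74), (1635, 101)], layers=[100, 50], y_range=(0, 5.5))
```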
Wow, that’s not a lot of code! This class inherits from `TabularModel`, which is where it gets all its functionality from. In `__init__` it calls the same method in `TabularModel`, passing `n_cont=0` and `out_sz=1`; other than that, it only passes along whatever arguments it received.
`EmbeddingNN` includes `**kwargs` as a parameter to `__init__`. In Python, `**kwargs` in a parameter list means “put any additional keyword arguments into a dict called `kwargs`.” And `**kwargs` in an argument list means “insert all key/value pairs in the `kwargs` dict as named arguments here.” This approach is used in many popular libraries, such as `matplotlib`, in which the main `plot` function simply has the signature `plot(*args, **kwargs)`. The documentation says “The `kwargs` are `Line2D` properties” and then lists those properties.
We’re using `**kwargs` in `EmbeddingNN` to avoid having to write all the arguments to `TabularModel` a second time, and to keep them in sync. However, this makes our API quite difficult to work with, because now Jupyter Notebook doesn’t know what parameters are available. Consequently, things like tab completion of parameter names and pop-up lists of signatures won’t work.
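Both meanings of `**kwargs`, and the introspection downside just described, can be demonstrated with a few lines of standard-library Python (the function names here are hypothetical):

```python
import inspect

# **kwargs in a parameter list: extra keyword arguments land in a dict.
def styled_plot(x, **kwargs):
    return ("plot", x, kwargs)

print(styled_plot(1, color="red", lw=2))
# -> ('plot', 1, {'color': 'red', 'lw': 2})

# **kwargs in an argument list: a dict is spread back into named arguments.
opts = {"color": "red", "lw": 2}
def inner(x, color, lw):
    return (x, color, lw)

print(inner(1, **opts))  # -> (1, 'red', 2)

# The downside: tools that inspect the signature (as Jupyter does for tab
# completion and signature pop-ups) see only **kwargs, not the real options.
print(inspect.signature(styled_plot))  # -> (x, **kwargs)
```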
End sidebar
Although the results of `EmbeddingNN` are a bit worse than the dot product approach (which shows the power of carefully constructing an architecture for a domain), it does allow us to do something very important: we can now directly incorporate other user and movie information, date and time information, or any other information that may be relevant to the recommendation. That’s exactly what `TabularModel` does. In fact, we’ve now seen that `EmbeddingNN` is just a `TabularModel`, with `n_cont=0` and `out_sz=1`. So, we’d better spend some time learning about `TabularModel`, and how to use it to get great results! We’ll do that in the next chapter.