More precisely, at each position of our final convolutional layer, we have as many filters as in the last linear layer. We can therefore compute the dot product of those activations with the final weights to get, for each location on our feature map, the score of the feature that was used to make a decision.
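    As a quick sanity check of the shapes involved, here is a toy sketch with made-up sizes (assuming a ResNet-style body that emits 512 channels on a 7×7 grid, and a two-class head):

        import torch

        acts = torch.randn(512, 7, 7)  # activations of the last convolutional layer
        w = torch.randn(2, 512)        # weight matrix of the final linear layer

        # dot product over the channel dimension: one 7x7 score map per class
        cam = torch.einsum('ck,kij->cij', w, acts)
        print(cam.shape)  # torch.Size([2, 7, 7])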

    We’re going to need a way to get access to the activations inside the model while it’s training. In PyTorch this can be done with a hook. Hooks are PyTorch’s equivalent of fastai’s callbacks. However, rather than allowing you to inject code into the training loop like a fastai callback, hooks allow you to inject code into the forward and backward calculations themselves. We can attach a hook to any layer of the model, and it will be executed when we compute the outputs (forward hook) or during backpropagation (backward hook). A forward hook is a function that takes three things—a module, its input, and its output—and it can perform any behavior you want. (fastai also provides a handy HookCallback that we won’t cover here, but take a look at the fastai docs; it makes working with hooks a little easier.)
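    To make that signature concrete, here is a minimal standalone sketch with a throwaway convolutional layer (the layer and the printing behavior are just for illustration):

        import torch
        import torch.nn as nn

        def print_shape(module, inp, outp):
            # a forward hook receives the module, its inputs, and its output
            print(type(module).__name__, tuple(outp.shape))

        layer = nn.Conv2d(3, 8, kernel_size=3)
        handle = layer.register_forward_hook(print_shape)
        _ = layer(torch.randn(1, 3, 16, 16))  # prints: Conv2d (1, 8, 14, 14)
        handle.remove()  # detach the hook once you're done with it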

    To illustrate, we’ll use the same cats and dogs model we trained in <>:

    In [ ]:

    path = untar_data(URLs.PETS)/'images'
    def is_cat(x): return x[0].isupper()
    dls = ImageDataLoaders.from_name_func(
        path, get_image_files(path), valid_pct=0.2, seed=21,
        label_func=is_cat, item_tfms=Resize(224))
    learn = cnn_learner(dls, resnet34, metrics=error_rate)
    learn.fine_tune(1)

    epoch  train_loss  valid_loss  error_rate  time
    0      0.053405    0.052540    0.010825    00:19

    To start, we’ll grab a cat picture and a batch of data:

    In [ ]:

    img = PILImage.create(image_cat())
    x, = first(dls.test_dl([img]))

    For CAM we want to store the activations of the last convolutional layer. We put our hook function in a class so it has a state that we can access later, and just store a copy of the output:

    In [ ]:

    class Hook():
        def hook_func(self, m, i, o): self.stored = o.detach().clone()

    We can then instantiate a Hook and attach it to the layer we want, which is the last layer of the CNN body:

    In [ ]:

    hook_output = Hook()
    hook = learn.model[0].register_forward_hook(hook_output.hook_func)

    Now we can grab a batch and feed it through our model:

    In [ ]:

    with torch.no_grad(): output = learn.model.eval()(x)

    In [ ]:

    act = hook_output.stored[0]

    Let’s also double-check our predictions:

    In [ ]:

    F.softmax(output, dim=-1)

    Out[ ]:

    tensor([[0.0010, 0.9990]], device='cuda:0')

    We know 0 (for False) is “dog,” because the classes are automatically sorted in fastai, but we can still double-check by looking at dls.vocab:

    In [ ]:

    dls.vocab

    Out[ ]:

    (#2) [False,True]

    So, our model is very confident this was a picture of a cat.

    To do the dot product of our weight matrix (2 by number of activations) with the activations (batch size by activations by rows by cols), we use a custom einsum:

    In [ ]:

    x.shape

    Out[ ]:

    torch.Size([1, 3, 224, 224])

    In [ ]:

    cam_map = torch.einsum('ck,kij->cij', learn.model[1][-1].weight, act)
    cam_map.shape

    Out[ ]:

    torch.Size([2, 7, 7])

    For each image in our batch, and for each class, we get a 7×7 feature map that tells us where the activations were higher and where they were lower. This will let us see which areas of the pictures influenced the model’s decision.

    For instance, we can find out which areas made the model decide this animal was a cat (note that we need to decode the input x since it’s been normalized by the DataLoader, and we need to cast to TensorImage since, at the time this book is written, PyTorch does not maintain types when indexing—this may be fixed by the time you are reading this):

    In [ ]:

    x_dec = TensorImage(dls.train.decode((x,))[0][0])
    _,ax = plt.subplots()
    x_dec.show(ctx=ax)
    ax.imshow(cam_map[1].detach().cpu(), alpha=0.6, extent=(0,224,224,0),
              interpolation='bilinear', cmap='magma');

    The areas in bright yellow correspond to high activations and the areas in purple to low activations. In this case, we can see the head and the front paw were the two main areas that made the model decide it was a picture of a cat.

    Once you’re done with your hook, you should remove it, as otherwise it might leak some memory:

    In [ ]:

    hook.remove()

    That’s why it’s usually a good idea to have the Hook class be a context manager, registering the hook when you enter it and removing it when you exit. A context manager is a Python construct that calls __enter__ when the object is created in a with clause, and __exit__ at the end of the with clause. For instance, this is how Python handles the with open(...) as f: construct that you’ll often see for opening files without requiring an explicit close(f) at the end. If we define Hook as follows:

    In [ ]:

    class Hook():
        def __init__(self, m):
            self.hook = m.register_forward_hook(self.hook_func)
        def hook_func(self, m, i, o): self.stored = o.detach().clone()
        def __enter__(self, *args): return self
        def __exit__(self, *args): self.hook.remove()

    we can safely use it this way:

    In [ ]:

    with Hook(learn.model[0]) as hook:
        with torch.no_grad(): output = learn.model.eval()(x.cuda())
        act = hook.stored

    fastai provides this class for you, as well as some other handy classes to make working with hooks easier.
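    If you prefer not to write a class, the same register-and-clean-up pattern can also be expressed with contextlib. This is just an alternative sketch (stored_activations is a hypothetical helper, not part of fastai):

        from contextlib import contextmanager
        from types import SimpleNamespace

        @contextmanager
        def stored_activations(module):
            # hypothetical helper: register a forward hook, yield a small
            # state object, and remove the hook on exit
            state = SimpleNamespace()
            def hook_func(m, i, o): state.stored = o.detach().clone()
            handle = module.register_forward_hook(hook_func)
            try:
                yield state
            finally:
                handle.remove()

        with stored_activations(learn.model[0]) as s:
            with torch.no_grad(): output = learn.model.eval()(x.cuda())
        act = s.stored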