Some intuition about CapsNet

In the past few days, I learned about SVM and IsoMap, and followed the links on Slack to read about CapsNet and about using causal effects to explain classifiers.

Here is some intuition about CapsNet:

As far as I understand, CapsNet groups several neurons together so that the 'feature map' in CapsNet consists of vectors instead of scalars.

This design allows each entry of the feature map to vary along several dimensions, so it encourages different views of the same object to be represented by the same capsule.
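To make the vector idea concrete, here is a minimal sketch of the "squash" nonlinearity that the CapsNet paper applies to each capsule. It keeps the direction of the vector and maps its length into [0, 1), so the length can be read as the probability that the entity is present, while the direction carries the variation. The function names and the `eps` stabilizer are my own choices for illustration, not taken from any library.

```python
import math

def squash(s, eps=1e-9):
    """Shrink vector s to a length in [0, 1) without changing its direction."""
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    # |s|^2 / (1 + |s|^2) is the new length; dividing by |s| normalizes s.
    # eps (an assumption of this sketch) avoids division by zero.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]

def length(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

capsule = [3.0, 4.0]        # raw capsule output with length 5
out = squash(capsule)
print(out, length(out))     # same direction, length close to 25/26
```

Two capsule outputs that point in different directions but have similar lengths would then mean "the same object is present, seen under a different pose".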

It also uses coupling coefficients in place of the max-pooling step in a traditional CNN. (The step from primary caps to digit caps corresponds to global pooling.)

This design encourages CapsNet to explicitly encode part-whole relationships, so that lower-level features tend to be the spatial parts of higher-level features.
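As far as I understand it, the coupling coefficients come from the routing-by-agreement procedure in the paper. Below is a simplified pure-Python sketch of it. It assumes the prediction vectors `u_hat[i][j]` (what lower capsule i predicts for upper capsule j) are already given, and it leaves out the learned transformation matrices that produce them. The names `route`, `softmax` and `u_hat`, and the toy numbers, are only for illustration.

```python
import math

def squash(s, eps=1e-9):
    """Shrink vector s to a length in [0, 1) without changing its direction."""
    sq_norm = sum(x * x for x in s)
    scale = sq_norm / (1.0 + sq_norm) / (math.sqrt(sq_norm) + eps)
    return [scale * x for x in s]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(u_hat, num_iters=3):
    """Return coupling coefficients c[i][j] and upper capsule outputs v[j]."""
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]   # routing logits, start uniform
    for _ in range(num_iters):
        # Each lower capsule distributes its output over the upper capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            # Weighted sum of the predictions for upper capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the upper output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return c, v

# Toy input: both lower capsules agree about upper capsule 0,
# but make opposite predictions for upper capsule 1.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
]
c, v = route(u_hat)
print(c)   # coupling shifts toward upper capsule 0
print(v)   # upper capsule 0 is longer than upper capsule 1
```

In this toy case the parts that agree on a whole get routed to it, which is how I picture the part-whole relationship being encoded. Unlike max-pooling, nothing is thrown away outright: the lower-level outputs are reweighted instead.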

The paper shows that CapsNet recognizes overlapping digits better than a traditional CNN on the MNIST dataset.

Maybe CapsNet will perform better on datasets that consist of more complicated objects?
