This experiment uses yoke-tsne to measure whether an embedding method generalizes well.
The basic idea is to compare how well samples of the same category cluster in a training embedding and in a test embedding. We split the Stanford Cars dataset into dataset A (98 randomly chosen categories) and dataset B (the remaining 98 categories). First, we train ResNet-50 with N-pair loss on A and compute the embedding points of the data in A. Second, we train ResNet-50 with N-pair loss on B and use that model to compute the embedding points of the data in A. Finally, we compare the two embeddings with yoke-tsne.
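The category split described above can be sketched as follows. This is only a minimal illustration, not the code used in the experiment: the function names `split_categories` and `filter_by_categories`, the seed, and the `(path, label)` sample format are all assumptions made for the example. The only fact taken from the dataset is that Stanford Cars has 196 categories.

```python
import random


def split_categories(num_categories=196, seed=0):
    """Randomly split category ids into two disjoint halves, A and B.

    Hypothetical helper: the seed and the function name are assumptions.
    """
    rng = random.Random(seed)
    ids = list(range(num_categories))
    rng.shuffle(ids)
    half = num_categories // 2
    return sorted(ids[:half]), sorted(ids[half:])


def filter_by_categories(samples, categories):
    """Keep only the (path, label) pairs whose label is in `categories`."""
    keep = set(categories)
    return [(path, label) for path, label in samples if label in keep]


# Dataset A gets 98 random categories, dataset B the remaining 98.
cats_a, cats_b = split_categories()

# Toy sample list standing in for the real image index (assumed format).
toy_samples = [("img_%d.jpg" % i, i % 196) for i in range(392)]
dataset_a = filter_by_categories(toy_samples, cats_a)
dataset_b = filter_by_categories(toy_samples, cats_b)
```

Fixing the seed keeps the split reproducible, so the model trained on A and the model trained on B always see the same two disjoint category sets.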
The result is as follows:
The left figure shows the embedding of dataset A as training data, and the right figure shows the embedding of dataset A as test data. As we can see, the clusters in the left figure are tight, while the clusters in the right figure are looser. In spite of this, the points in the right figure are still clustered into groups by category, which suggests that N-pair loss generalizes reasonably well.
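The comparison above is visual. One possible way to put a rough number on "tight" versus "loose" is the ratio of the mean distance between same-category points to the mean distance between different-category points in the 2-D plot. This metric is an assumption added here for illustration and was not part of the experiment; the function name `tightness_ratio` and the toy points are hypothetical.

```python
import math
from itertools import combinations


def tightness_ratio(points, labels):
    """Mean same-label distance divided by mean different-label distance.

    Lower values mean tighter, better separated clusters. Requires at
    least one same-label pair and one different-label pair.
    """
    intra, inter = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        (intra if lp == lq else inter).append(math.dist(p, q))
    return (sum(intra) / len(intra)) / (sum(inter) / len(inter))


# Toy 2-D points standing in for the two t-SNE plots (made-up data).
labels = [0, 0, 1, 1]
train_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
test_points = [(0.0, 0.0), (0.0, 4.0), (10.0, 0.0), (10.0, 4.0)]

train_ratio = tightness_ratio(train_points, labels)
test_ratio = tightness_ratio(test_points, labels)
print("train:", train_ratio, "test:", test_ratio)
```

Because t-SNE does not preserve global distances, such a ratio should be read only as a rough companion to the plots, not as a calibrated measure of generalization.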
As a next step, I want to try some embedding methods that are considered to generalize poorly, to check whether yoke-tsne is a good tool for measuring generalization ability.