
This experiment aims to measure whether an embedding method has good generalization ability, using yoke-t-SNE.

The basic idea of this experiment is to check whether points from the same category cluster in both the training embedding and the test embedding. We split the Stanford Cars dataset into dataset A (98 random categories) and dataset B (the remaining 98 categories). First, we train a ResNet-50 with N-pair loss on A and compute the embedding of the data in A. Second, we train a ResNet-50 with N-pair loss on B and use this model to embed the data in A again. Finally, we compare the two embeddings with yoke-t-SNE.
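As a hedged sketch of the setup (not the post's actual training code: the category split, batch construction with one anchor/positive pair per category, and tensor shapes are all my assumptions), the N-pair loss reduces to softmax cross-entropy over a batch similarity matrix:

```python
import random

import torch
import torch.nn.functional as F

# Hypothetical 98/98 split of the 196 Stanford Cars categories.
classes = list(range(196))
random.shuffle(classes)
set_A, set_B = sorted(classes[:98]), sorted(classes[98:])

def n_pair_loss(anchors, positives):
    """Multi-class N-pair loss (Sohn, 2016).

    anchors, positives: (N, D) embedding tensors, where anchors[i] and
    positives[i] come from the same category and every other row of
    `positives` serves as a negative for anchor i. The diagonal of the
    similarity matrix holds the true pairs, so the loss is plain
    softmax cross-entropy with the diagonal as the target.
    """
    logits = anchors @ positives.t()                      # (N, N) similarities
    targets = torch.arange(anchors.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)
```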

 

The results are as follows:

The left figure shows the embedding of dataset A when it is the training data, and the right figure shows the embedding of dataset A when it is the test data. As we can see, the clusters in the left figure are tight, while the clusters in the right figure are looser. Even so, the points in the right figure are still grouped by category, which suggests that the generalization ability of the N-pair loss is not bad.

As a next step, I want to try some embedding methods that are believed to have poor generalization ability, to validate whether yoke-t-SNE is a good tool for measuring generalization.

Last week, we tried our yoke-t-SNE method (adding an L2 distance term to the t-SNE loss function). This week, we try different scales of this L2 distance term to see its effect on the t-SNE result.

The loss function of yoke-t-SNE is:

C = KL(P₁ ‖ Q₁) + KL(P₂ ‖ Q₂) + λ ‖Y₁ − Y₂‖²

where Pᵢ and Qᵢ are the high-dimensional and low-dimensional (t-SNE) affinity distributions of embedding i, Yᵢ is the 2-D map of embedding i, and λ weights the alignment term.
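As a rough illustration of this objective (a sketch under my own assumptions, not the actual implementation), the two maps can be optimized jointly with autograd, given precomputed affinity matrices P1 and P2 over the same points in the same row order:

```python
import torch

def tsne_kl(P, Y):
    """KL(P || Q), where Q is the Student-t affinity matrix of the 2-D map Y."""
    d2 = (Y[:, None, :] - Y[None, :, :]).pow(2).sum(-1)  # squared distances
    num = 1.0 / (1.0 + d2)                               # Student-t kernel
    num = num * (1.0 - torch.eye(len(Y)))                # zero self-affinities
    Q = num / num.sum()
    return (P * torch.log((P + 1e-12) / (Q + 1e-12))).sum()

def yoke_tsne(P1, P2, lam, steps=1000, lr=0.1):
    """Minimize KL1 + KL2 + lam * ||Y1 - Y2||^2 by joint gradient descent.

    P1, P2: (n, n) joint affinity matrices for the two embeddings of the
    same n points, in the same row order, so the yoking term compares
    corresponding points.
    """
    n = P1.shape[0]
    Y1 = torch.randn(n, 2, requires_grad=True)
    Y2 = torch.randn(n, 2, requires_grad=True)
    opt = torch.optim.Adam([Y1, Y2], lr=lr)
    for _ in range(steps):
        loss = tsne_kl(P1, Y1) + tsne_kl(P2, Y2) + lam * (Y1 - Y2).pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return Y1.detach(), Y2.detach()
```

Setting lam = 0 recovers two independent (simplified) t-SNE runs, which is the natural baseline for the ratios measured below.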

In this measurement, as λ changes, we record the ratio between the KL divergence in yoke-t-SNE and the KL divergence in the original t-SNE, and we also record the L2 alignment error.
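A hedged sketch of this sweep, reusing tsne_kl and yoke_tsne from the block above; the λ grid is my own assumption:

```python
# Baseline: lam = 0 is ordinary (unyoked) t-SNE for each embedding.
Y1_base, Y2_base = yoke_tsne(P1, P2, lam=0.0)
kl1_base = tsne_kl(P1, Y1_base)
kl2_base = tsne_kl(P2, Y2_base)

for lam in [1e-3, 1e-2, 1e-1, 1.0]:                  # hypothetical grid
    Y1, Y2 = yoke_tsne(P1, P2, lam)
    ratio1 = (tsne_kl(P1, Y1) / kl1_base).item()     # yoked KL / original KL
    ratio2 = (tsne_kl(P2, Y2) / kl2_base).item()
    align = (Y1 - Y2).pow(2).sum().item()            # L2 alignment error
    print(f"lambda={lam:g}  ratio1={ratio1:.3f}  "
          f"ratio2={ratio2:.3f}  align={align:.3f}")
```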

The results are as follows:

The first figure shows the KL ratio for the first embedding.

The second figure shows the KL ratio for the second embedding.

The third figure shows the alignment error (the L2 distance).

 

As we can see in the figures above, as the weight of the L2 distance term increases, the KL ratio increases, which implies that the more heavily we 'yoke' the t-SNE, the less the distribution in the t-SNE plane resembles the distribution in the high-dimensional embedding space. Meanwhile, the decreasing alignment error shows that the two t-SNE maps align better and better as λ increases.


As we know, t-SNE focuses on local relationships. To compare two embedding results, we try to align their t-SNE clusters to the same places, which makes the comparison intuitive.

The basic idea is to add an L2 distance term that pulls the two t-SNE embeddings together.
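The only missing input for the sketches above is the high-dimensional affinity matrix for each embedding. Below is a deliberately simplified version (an assumption: real t-SNE binary-searches a per-point bandwidth against a target perplexity, while this one uses a single fixed sigma):

```python
import numpy as np

def joint_affinities(X, sigma=1.0):
    """Simplified high-dimensional joint affinities p_ij for t-SNE.

    X: (n, d) array of embedding vectors. A single fixed bandwidth is
    used here for brevity; real t-SNE calibrates sigma per point to
    hit a target perplexity.
    """
    d2 = np.square(X[:, None, :] - X[None, :, :]).sum(axis=-1)
    P = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)
    P = P / P.sum(axis=1, keepdims=True)   # conditional p_{j|i}
    return (P + P.T) / (2.0 * len(X))      # symmetrized joint p_{ij}
```

The result can then be handed to the yoke sketch after converting to a float tensor, e.g. P1 = torch.as_tensor(joint_affinities(X1), dtype=torch.float32).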

Here are the results:

Below are the original t-SNE plots for two different embedding methods:

Below are the yoke-t-SNE plots for the same two embedding results: