First of all, I haven't visualized my data yet, so this post has no figures.
Over the last few days, I have been trying to find out how the KL distance between the high-dimensional embeddings affects lambda selection, and I have run several experiments.
I obtained three original t-SNE embeddings and nine yoked t-SNE embeddings at different lambda values for the following data:
Proxy to NPAIRS, on TEST data, for CARS
Proxy to NPAIRS, on TEST data, for CUB
Proxy to NPAIRS, on TRAIN data, for CARS
Proxy to NPAIRS, on TRAIN data, for CUB
I haven't gotten the results yet, so I don't have a conclusion right now.
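As a rough illustration of the quantity I am looking at, here is a minimal sketch of a KL distance between two high-dimensional embeddings of the same points. It is not the code used in the experiments: the helper names (joint_affinities, kl_divergence) are made up for this post, and it assumes a fixed Gaussian bandwidth sigma instead of the per-point perplexity search that t-SNE actually does.

```python
import numpy as np

def joint_affinities(X, sigma=1.0):
    """Symmetric Gaussian affinities over all point pairs, normalized to sum to 1.

    Simplification (assumption): one fixed bandwidth `sigma` for every point,
    not the perplexity-calibrated bandwidths used by real t-SNE.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared Euclidean distances
    d2 = np.maximum(d2, 0.0)                          # guard against tiny negatives
    P = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)                          # no self-affinity
    return P / P.sum()

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) between two affinity matrices defined over the same points."""
    P = np.maximum(P, eps)
    Q = np.maximum(Q, eps)
    return float(np.sum(P * np.log(P / Q)))

# Toy stand-ins for two embeddings of the same 50 points (random data, not CARS/CUB).
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(50, 8))
emb_b = emb_a + 0.5 * rng.normal(size=(50, 8))

P_a = joint_affinities(emb_a)
P_b = joint_affinities(emb_b)
print("KL(a || b):", kl_divergence(P_a, P_b))
```

The KL distance is not symmetric, so KL(a || b) and KL(b || a) will generally differ; which direction matters for lambda selection is part of what I still need to check.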
Second, I modified the t-SNE module in sklearn so that we can now use Barnes-Hut t-SNE (BH t-SNE) with our yoked method. The running time for 8000 points decreases from 30 minutes to 10 minutes.
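For reference, stock sklearn already lets you switch between the exact and Barnes-Hut gradient through the method argument of TSNE. The sketch below only shows that switch on random toy data; it is the unmodified sklearn class, not my yoked version, and the helper name time_tsne is made up for this post. Timings on such a small input will not match the 8000-point numbers above.

```python
import time

import numpy as np
from sklearn.manifold import TSNE

def time_tsne(X, method):
    """Run stock sklearn t-SNE with the given gradient method; return (embedding, seconds)."""
    tsne = TSNE(n_components=2, method=method, perplexity=30.0,
                init="random", random_state=0)
    start = time.perf_counter()
    Y = tsne.fit_transform(X)
    return Y, time.perf_counter() - start

# Small random input so the exact method finishes quickly (toy data, assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 16))

Y_exact, t_exact = time_tsne(X, "exact")       # O(N^2) gradient
Y_bh, t_bh = time_tsne(X, "barnes_hut")        # O(N log N) approximation
print(f"exact: {t_exact:.2f}s, barnes_hut: {t_bh:.2f}s")
```

The gap between the two methods grows with the number of points, so the speedup is easier to see on the full datasets than on this toy input.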