
Last week, I found a problem with UMAP: if the high-dimensional graph of an embedding representation is not connected, as with the NPair result on the CAR training dataset, the UMAP optimizer keeps pushing the clusters farther apart. This doesn't matter for visualization, but in TUMAP we need to measure the loss of each map.

So, we tried some different ways to avoid or solve this problem.

First, we compute the KL divergence between the normal UMAP result and the TUMAP result instead of comparing their losses.
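A minimal sketch of this kind of KL comparison between two maps, assuming we summarize each map by a Gaussian-kernel distribution over point pairs (the helper names, the kernel, and the bandwidth are my illustrative choices, not the UMAP internals):

```python
import numpy as np

def pairwise_prob(emb, sigma=1.0):
    """Turn an (n, d) embedding into a probability distribution
    over point pairs via a Gaussian kernel on pairwise distances."""
    d2 = np.sum((emb[:, None, :] - emb[None, :, :]) ** 2, axis=-1)
    p = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(p, 0.0)          # ignore self-pairs
    return p / p.sum()

def embedding_kl(emb_a, emb_b, eps=1e-12):
    """KL(P_a || P_b) between the pair distributions of two maps."""
    p, q = pairwise_prob(emb_a), pairwise_prob(emb_b)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

Two identical maps give a divergence of zero, so this stays meaningful even when disconnected clusters drift apart, since only relative pair structure enters the score.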

Second, we try to optimize the repulsive gradient over the edges of the high-dimensional graph instead of over every pair of points. But the results of this method turned out weird.

Third, I tried adding a zero vector to the high-dimensional vectors and making it equally very far from every point when constructing the high-dimensional graph. It doesn't work.

After running a batch of heuristic searches for leaf finding and then calculating the leaf lengths, the result looks like this:

There's too much noise. To reduce it and find values close to the true ones, a Kalman filter is applied. The result after the Kalman filter:
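A minimal 1-D Kalman filter sketch for smoothing a noisy, slowly varying scalar such as a leaf-length series (the process and measurement variances `q` and `r` are illustrative values, not the ones tuned in the pipeline):

```python
import numpy as np

def kalman_smooth(zs, q=1e-3, r=0.5):
    """1-D Kalman filter for a roughly constant scalar.
    zs: noisy measurements; q: process variance; r: measurement variance."""
    x, p = zs[0], 1.0           # initial state estimate and covariance
    out = []
    for z in zs:
        p = p + q               # predict: state assumed nearly constant
        k = p / (p + r)         # Kalman gain
        x = x + k * (z - x)     # update with the new measurement
        p = (1 - k) * p
        out.append(x)
    return np.array(out)
```

With a small `q` relative to `r`, the filter behaves like an adaptive running average, which is what suppresses the measurement noise here.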

The next step is processing all the results and uploading them to BETYdb.

UMAP paper: https://arxiv.org/abs/1802.03426

Here are some attempts based on a Python module, umap-learn.

First, we try UMAP on some Gaussian datasets.

(1): We generate two Gaussian datasets (1000×64 each) with different locations (means), and visualize them by a) randomly picking two dimensions, b) UMAP, c) t-SNE.

(2): We generate two Gaussian datasets (1000×64 each) with different scales (stds), and visualize them by a) randomly picking two dimensions, b) UMAP, c) t-SNE.

(3): We generate two Gaussian datasets (1000×64 each) with different locations (means) and scales (stds), and visualize them by a) randomly picking two dimensions, b) UMAP, c) t-SNE.
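The toy data for these three experiments can be sketched as follows (the helper names are mine; the actual embeddings would come from `umap.UMAP` and `sklearn.manifold.TSNE`, omitted here to keep the sketch self-contained):

```python
import numpy as np

def two_gaussians(shift=5.0, scale=1.0, n=1000, d=64, seed=0):
    """Two n x d Gaussian blobs: the second is moved by `shift` in every
    dimension and stretched by `scale`. Covers cases (1)-(3) above by
    varying shift (location), scale, or both."""
    rng = np.random.default_rng(seed)
    a = rng.normal(0.0, 1.0, (n, d))
    b = rng.normal(shift, scale, (n, d))
    X = np.vstack([a, b])
    y = np.array([0] * n + [1] * n)   # cluster labels for coloring
    return X, y

def random_two_dims(X, seed=0):
    """Baseline 'visualization': just pick two coordinates at random."""
    rng = np.random.default_rng(seed)
    i, j = rng.choice(X.shape[1], size=2, replace=False)
    return X[:, [i, j]]
```

The random-two-dimensions baseline is useful because a large mean shift is visible even in raw coordinates, while a pure scale difference usually is not, which is exactly what the UMAP/t-SNE comparison probes.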

 

Then, we compare the results of t-SNE and UMAP on the embedding results of NPair and EPSHN on the CAR dataset.

NPair on training data:

NPair on validation data:

EPSHN on training data:

EPSHN on validation data:

Here is the result:

Some details of training as a note:

ResNet-50, triplet loss, large images, from both sensors, with (depth, depth, reflection) as channels.

The recall by plot is pretty low, as I expected, since the all-plot t-SNE is kind of a mess. But I think the plots do go somewhere: when we run t-SNE on several chosen plots, some do separate, so the plots are being driven apart by something. Also, there are noise variables we don't care about (wind, for example) that may affect the appearance of leaves. So I think we need a more specific metric to inspect and qualify the model. Maybe something like:

  • Linear dependence between embedding space and ground truth measurement.
  • Cluster distribution and variance.
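Both candidate metrics can be sketched directly (this is my hypothetical formulation: linear dependence as the R² of a least-squares fit from embedding coordinates to a ground-truth measurement, and cluster spread as mean within-cluster variance):

```python
import numpy as np

def linear_r2(emb, target):
    """R^2 of a least-squares fit from embedding coords to a
    ground-truth measurement: how linearly decodable the trait is."""
    A = np.hstack([emb, np.ones((len(emb), 1))])   # add bias column
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    pred = A @ coef
    ss_res = np.sum((target - pred) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def cluster_variances(emb, labels):
    """Mean within-cluster variance of the embedding, per label."""
    return {c: float(emb[labels == c].var(axis=0).mean())
            for c in np.unique(labels)}
```

An R² near 1 would mean the embedding encodes the measurement almost linearly; tight per-plot variances would mean the clusters are compact even if the global t-SNE picture looks messy.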

I'm also trying to train a model whose embedding space has dimension 2, so that we can show the network's ability directly in 2D space and see whether there is something interesting.

Plot Meaning Embedding

I'm also building the network to embed the abstract plots using the images and dates. Some questions arose when I implemented it.

  • Should it be trained with RGB images or depth?
  • Which network structure should be used as the image feature extractor, and how deep should it be?
  • When training it, should the feature extractor be frozen or not?

My initial plan is to use RGB, since the networks are pretrained on RGB images, and to use 3-4 layers of ResNet-50 without freezing.

I made a visualization of the leaf length/width pipeline for the 3D scanner data.

The raw data first (part of it):

Then the cropping:

With connected components, we get 6000+ regions. Then, with the heuristic search:

Then come the leaf length and width for each single region. The blue lines are the paths for leaf length, and the orange lines are for leaf width. The green dots are key points on the leaf length path used for the leaf width; they are calculated by equally dividing the weighted length path into 6 parts. A width of zero means no good width path was found.
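The key-point step above can be sketched as follows (a minimal version assuming the length path is a polyline of 3-D points and the weights are segment arc lengths; `keypoints_on_path` is my illustrative name):

```python
import numpy as np

def keypoints_on_path(path, n_parts=6):
    """Split a polyline (k, 3) point path into n_parts pieces of equal
    arc length and return the n_parts - 1 interior split points."""
    path = np.asarray(path, dtype=float)
    seg = np.linalg.norm(np.diff(path, axis=0), axis=1)   # segment lengths
    cum = np.concatenate([[0.0], np.cumsum(seg)])         # cumulative length
    targets = np.linspace(0, cum[-1], n_parts + 1)[1:-1]  # interior cuts
    pts = []
    for t in targets:
        i = min(max(np.searchsorted(cum, t) - 1, 0), len(seg) - 1)
        frac = (t - cum[i]) / seg[i] if seg[i] > 0 else 0.0
        pts.append(path[i] + frac * (path[i + 1] - path[i]))  # interpolate
    return np.array(pts)
```

Each returned point sits exactly at 1/6, 2/6, ..., 5/6 of the total path length, which is where the width paths are anchored.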

For the leaf width paths that are still on the same side, I'm going to put a stricter restriction on the cosine distance instead of only requiring a positive cross product:
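A 2-D sketch of the stricter check (my hypothetical formulation: keep the cross-product sign test for the side, and additionally require the width step to stay within a cosine threshold of the perpendicular to the length path; `cos_min` is an illustrative value):

```python
import numpy as np

def width_side_ok(tangent, step, cos_min=0.7):
    """Accept a candidate width step only if it is on the correct side
    of the length path (positive 2-D cross product) AND close to the
    perpendicular direction (cosine above cos_min), rather than
    relying on the cross-product sign alone."""
    cross = tangent[0] * step[1] - tangent[1] * step[0]
    if cross <= 0:                              # wrong side entirely
        return False
    perp = np.array([-tangent[1], tangent[0]])  # left-hand normal
    cos = perp @ step / (np.linalg.norm(perp) * np.linalg.norm(step))
    return cos >= cos_min
```

A nearly tangent step can still have a positive cross product, which is how both width endpoints end up on the same side; the cosine threshold rejects exactly those cases.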