To figure out which pictures were being misplaced by our embedding network, we took some time to dig into Hong's model accuracy calculation function. In doing so, we learned (and confirmed) some things about the linear algebra behind embeddings!

We saw how multiplying the matrix of high-dimensional embedding vectors by its transpose gives us the similarity between each vector and every other vector. We then sort to find the closest vector to each vector, and set the prediction for a given vector to the class of its closest neighbor.
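As a sanity check on our understanding, here is a minimal sketch of that calculation in NumPy. The names (`nearest_neighbor_predictions`, `embeddings`, `labels`) are ours, not Hong's, and we assume the embedding vectors are L2-normalized so ranking by dot product matches ranking by distance:

```python
import numpy as np

def nearest_neighbor_predictions(embeddings, labels):
    """Predict each vector's class from its closest neighbor.

    embeddings: (N, D) array of L2-normalized embedding vectors.
    labels:     (N,) array giving the class of each row.
    """
    # Multiplying the embedding matrix by its transpose gives every
    # pairwise dot product (cosine similarity for normalized vectors).
    sims = embeddings @ embeddings.T
    # Mask the diagonal so a vector can't be its own nearest neighbor.
    np.fill_diagonal(sims, -np.inf)
    # The prediction for each vector is the class of its closest neighbor.
    return labels[sims.argmax(axis=1)]

# Accuracy is then just the fraction of predictions matching the labels:
# accuracy = (nearest_neighbor_predictions(embeddings, labels) == labels).mean()
```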

Here is an example image that the network misclassified (raw images first, then with similarities highlighted):

Test Image (from ~1 min into the video):

Closest image in high-dimensional space (from ~30 min into the video):

Here are the images again with some highlighted similarities:

Test Image:

Predicted image (closest in high-dimensional space):

Another similarity we saw was the trucks in the far right lane and the far left lane. They look fairly alike, and it is plausible for a truck to move that far within the time window of one of our classes.

The first glaring issue we saw was the trucks (or lack of trucks) in the bottom left. On reflection, this may be fine: within a class, it is likely for new vehicles to enter the scene from that point, so the model might allow for a situation like this.

We also attempted the heatmap. Most of it is set up and we gave it a first pass, but we think we are plugging in the wrong vectors as input (we can talk to Hong/Abby more about this). Here is our first attempt (lol):

 
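For next time, here is a rough sketch of the kind of heatmap we think we want, assuming the right input is the (N, D) matrix of embedding vectors for consecutive frames; that assumption is exactly what we need to confirm with Hong/Abby. Everything here, including the stand-in data, is hypothetical:

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in data: replace with the real (N, D) embedding matrix.
embeddings = np.random.randn(20, 64)
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# Pairwise cosine similarities between frames, plotted as a heatmap.
sims = embeddings @ embeddings.T
plt.imshow(sims, cmap="viridis")
plt.colorbar(label="cosine similarity")
plt.xlabel("frame index")
plt.ylabel("frame index")
plt.title("Embedding similarity heatmap")
plt.show()
```

If the model is tracking cars correctly, we would expect a bright band near the diagonal (nearby frames look alike) that fades as frames get further apart in time.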

Next week:

  • Find the right vectors to use for our heatmap to confirm the car-tracking ability of the model