
Read ahead for a video! Game changer...

In my last post, I mentioned that my current error function and method of finding "lit" centroids was set up in a way that did not make the total error 0 when the camera and light locations were correct. This was due to the fact that I was finding centroids that were near the light, thus causing the error to never be 0. In an attempt to better understand if this kind of error was the cause of poor optimization results, I did the quick & dirty method of forcing the surface normals of centroids deemed to be "lit" to be such that those centroids would actually reflect rays from the camera directly to the light position instead of bouncing them to some area around the light position.

Forcing all "lit" centroids

The error in the dot product of the surface normals when the camera & light are in the correct locations is 0, which is what we want. Then, I fixed the light at many different locations on a grid and, for each one, optimized for the camera location. The following video shows a plot of the light/camera locations for each light location on the grid, colored by the final error achieved by the optimization function:

[Video: light-camera-animation_forced]

For this I get the following results:

minimum error = 2.8328e-6
maximum error = 0.0072
best light location (min error) = [120, 30, 55]
best camera location (min error) = [95.5, 78.8, 110.5]

Trying to optimize for both using this method

If I just force these surface normals, and then try to optimize for both the camera & light, it finds both locations beautifully (as it should), with an error of 2.2204e-16, finding the locations to be:

light location = [115, 30, 57.499]
camera location = [100, 80, 110]

So, this tells us that there is a fundamental problem with the way we are defining which centroids are 'lit', a problem which I think can be avoided by looking at the image of the glitter taken when a point light source is shone on it. This way we can find the 'lit' pieces without defining such a threshold of 'angular difference in surface normals'. The downside is that we are getting closer and closer to our original method of optimization, and subsequently, calibration...


Last week I decided to pursue the optimization route to try and find the light & camera locations simultaneously. This post will focus on the progress and results thus far in the 3D optimization simulation!

In my simulation, there are 10,000 centroids all arranged in a grid on a plane (the pink shaded plane in the image below). There is a camera (denoted by the black dot) and a light (denoted by the red dot). I generate a random screen map - a list of positions on the monitor (blue shaded plane) such that each position on the monitor corresponds to a centroid. I use this screen map and the centroid locations to calculate the actual surface normals of each centroid - we will refer to these as the ground truth normals.

Then, I assume that all of the centroids are reflecting the point light (red dot), and calculate the surface normals of the centroids under this assumption - we will refer to these as the calculated normals. The centroids which are considered to be "lit" are those whose ground truth normals are very close to their calculated normals (using the dot product and finding all centroids whose normals are within ~2.5 degrees of each other - dot product > 0.999).
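A minimal sketch of this "lit" test, written in standard-library Python rather than the MATLAB used elsewhere in this post. It assumes the calculated normal is the normalized bisector of the unit directions from the centroid to the camera and to the light (the usual mirror-reflection condition); the helper names and the two sample centroids are made up for illustration.

```python
import math

def unit(v):
    """Scale a vector to length 1."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def reflecting_normal(centroid, camera, light):
    """Unit normal that would reflect a ray from the camera to the light.

    Assumption: this is the normalized bisector of the two unit directions.
    """
    to_camera = unit(tuple(p - q for p, q in zip(camera, centroid)))
    to_light = unit(tuple(p - q for p, q in zip(light, centroid)))
    return unit(tuple(p + q for p, q in zip(to_camera, to_light)))

def lit_indices(centroids, gt_normals, camera, light, threshold=0.999):
    """Indices of centroids whose ground truth normal is within the threshold."""
    return [
        i for i, (c, n) in enumerate(zip(centroids, gt_normals))
        if dot(n, reflecting_normal(c, camera, light)) > threshold
    ]

# Hypothetical toy data: the first normal reflects exactly, the second does not.
camera = (100.0, 80.0, 110.0)
light = (110.0, 30.0, 57.5)
centroids = [(0.0, 10.0, 20.0), (0.0, 50.0, 60.0)]
gt_normals = [reflecting_normal(centroids[0], camera, light), (0.0, 0.0, 1.0)]

lit = lit_indices(centroids, gt_normals, camera, light)
threshold_degrees = math.degrees(math.acos(0.999))
print(lit, threshold_degrees)
```

The last line also converts the 0.999 threshold into an angle, which is a quick way to see how wide the "lit" cone really is.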


This visualization shows the centroids which are "lit" by the light and the rays from those centroids to their corresponding screen map location. As expected, all of these centroids have screen map locations which are very close to the light.

To optimize, I initialize my camera and light locations to something reasonable, and then minimize my error function.

Error Function

In each iteration of the optimization, I have some current best camera location and current best light location. Using these two locations, I can calculate the surface normals of each lit centroid - call these calculated normals. I then take the dot product of the ground truth normals and these calculated normals, and take the sum over all centroids. Since these normals are normalized, I know each centroid's dot product can contribute no more than 1 to the final sum. So, I minimize the function:

numCentroids - sum(dot(ground truth normals, calculated normals))
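The error function above can be sketched as follows, in standard-library Python rather than MATLAB. As an assumption, the calculated normal is taken to be the normalized bisector of the centroid-to-camera and centroid-to-light directions, and the ground truth normals here are generated so that they reflect exactly (so the error at the true locations should be 0). The centroid grid and function names are hypothetical.

```python
import math

def unit(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def reflecting_normal(centroid, camera, light):
    # Assumption: mirror reflection, so the normal bisects the two directions.
    to_camera = unit(tuple(p - q for p, q in zip(camera, centroid)))
    to_light = unit(tuple(p - q for p, q in zip(light, centroid)))
    return unit(tuple(p + q for p, q in zip(to_camera, to_light)))

def total_error(centroids, gt_normals, camera, light):
    """numCentroids - sum(dot(ground truth normals, calculated normals))."""
    dots = 0.0
    for centroid, gt in zip(centroids, gt_normals):
        calc = reflecting_normal(centroid, camera, light)
        dots += sum(p * q for p, q in zip(gt, calc))
    return len(centroids) - dots

true_camera = (100.0, 80.0, 110.0)
true_light = (110.0, 30.0, 57.5)

# Hypothetical 3x3 grid of centroids on the plane x = 0.
centroids = [(0.0, float(y), float(z)) for y in (10, 40, 70) for z in (20, 50, 80)]
gt_normals = [reflecting_normal(c, true_camera, true_light) for c in centroids]

error_at_truth = total_error(centroids, gt_normals, true_camera, true_light)
error_at_guess = total_error(centroids, gt_normals, (50.0, 60.0, 80.0), (80.0, 50.0, 50.0))
print(error_at_truth, error_at_guess)
```

Because every dot product is at most 1, the error is never negative, and it only reaches 0 when every calculated normal matches its ground truth normal.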

Results

No pictures because my current visualizations don't do this justice - I need to work on figuring out better ways to visualize this after the optimization is done running/while the optimization is happening (as a movie or something).

Initial Light: [80 50 50]
Initial Camera: [50 60 80]

Final Light: [95.839 80.2176 104.0960]
Final Camera: [118.3882 26.4220 61.7301]

Actual Light: [110 30 57.5]
Actual Camera: [100 80 110]

Final Error: 0.0031
Error in Lit Centroids: 0.0033

Discussion/Next Steps

1. Sometimes the light and camera locations get flipped in the optimization - this is to be expected because right now there is nothing constraining which is which. Is there something I can add to my error function to actually constrain this, ideally using only the surface normals of the centroids?

2. The optimization still seems to do less well than I would want/expect it to. It is possible that there is a local min that it is falling into and stopping at, so this is something I need to look at more.

3. It is unclear how much the accuracy (or lack thereof) of the ground truth surface normals affects the error. I want to try to perturb the ground truth surface normals by some small amount (pretend like we know there is some amount of error in the surface normals, which in real life there probably is), and then see how the optimization does. I'm not entirely sure what the best way to do this is, and I also am not sure how to go about measuring this.
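One possible way to perturb the normals, offered only as a sketch and not as the method used in these experiments: add independent Gaussian noise to each component, renormalize, and report the resulting angular deviation in degrees so the noise level has a physical meaning. The function names, the noise model, and the sigma value are all assumptions. Standard-library Python is used here rather than MATLAB.

```python
import math
import random

def perturb_normals(normals, sigma, seed=0):
    """Add Gaussian noise (std dev sigma) to each component, then renormalize."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    noisy = []
    for n in normals:
        jittered = [c + rng.gauss(0.0, sigma) for c in n]
        length = math.sqrt(sum(c * c for c in jittered))
        noisy.append(tuple(c / length for c in jittered))
    return noisy

def angle_degrees(u, v):
    """Angle between two unit vectors, clamped to avoid acos domain errors."""
    d = max(-1.0, min(1.0, sum(p * q for p, q in zip(u, v))))
    return math.degrees(math.acos(d))

# Hypothetical input: 100 identical unit normals.
normals = [(0.0, 0.0, 1.0)] * 100
noisy = perturb_normals(normals, sigma=0.01)
angles = [angle_degrees(u, v) for u, v in zip(normals, noisy)]
print(sum(angles) / len(angles), max(angles))
```

Running the optimization on `noisy` for a few values of `sigma` and plotting the final camera/light position error against the mean angular deviation would be one way to measure the sensitivity.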

Earlier this week, Dr. Pless, Abby and I worked through some of the derivations of the ellipse equations in order to better understand where our understanding of them may have gone awry. In doing so, I now believe there is a problem with the fact that we are not considering the direction of the surface normals of the glitter. It seems to me that this is the reason for the need for centroids to be scattered around the light/camera - this is how the direction of surface normals is seen by the equations; however, this is not realistically how our system is set up (we tend to have centroids all on one side of the light/camera).

I did the test of generating an ellipse, finding points on that ellipse, and then using my method of solving the linear equations to try and re-create the ellipse using just the points on the ellipse and their surface normals (which I know because I can find the gradient at any point on the ellipse, and the surface normal is the negative of the gradient).
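The point-generation half of this test can be sketched like so (standard-library Python rather than MATLAB). It assumes an axis-aligned ellipse given by its center and semi-axes, and takes the surface normal to be the normalized negative gradient of the implicit function, as described above. The function name and the sample ellipse are hypothetical.

```python
import math

def ellipse_points_and_normals(cx, cy, rx, ry, count):
    """Sample points on ((x-cx)/rx)^2 + ((y-cy)/ry)^2 = 1 with their normals.

    The normal at each point is the negative gradient of the implicit
    function, normalized to unit length, so it points into the ellipse.
    """
    samples = []
    for k in range(count):
        t = 2.0 * math.pi * k / count
        x = cx + rx * math.cos(t)
        y = cy + ry * math.sin(t)
        gx = 2.0 * (x - cx) / rx ** 2  # d/dx of the implicit function
        gy = 2.0 * (y - cy) / ry ** 2  # d/dy of the implicit function
        norm = math.hypot(gx, gy)
        samples.append(((x, y), (-gx / norm, -gy / norm)))
    return samples

# Hypothetical ellipse: center (25, 20), semi-axes 24 (x) and 26 (y).
samples = ellipse_points_and_normals(25.0, 20.0, 24.0, 26.0, 8)
for point, normal in samples:
    print(point, normal)
```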

Here are some examples of success when I generate an ellipse, find points on that ellipse (as well as their surface normals), and then try to re-create the ellipse:

Generated Ellipse with 8 points on it

 

Re-created ellipse, using the same 8 points as those in the above image.
Two generated ellipses, with 9 points total
Two re-created ellipses using all 9 points from the above image

Funny story - the issue of using all points on one side of the ellipse and it not working doesn't seem to be an issue anymore. Not sure whether this is a good thing or not yet...look for updates in future posts about this!

We decided to try and tackle the optimization approach and try to get that up and running for now, both in 2D and 3D. I am optimizing for both the light and the camera locations simultaneously.

Error Function

In each iteration of the optimization, I am computing the surface normals of each centroid using the current values of the light and camera locations. I then want to maximize the sum of the dot products of the actual surface normals and the calculated surface normals at each iteration. Since I am using a minimization function, I want to minimize the negative of the sum of dot products.

error = -1*sum(dot(GT_surf_norms, calc_surf_norms, 2));

2D Results

Initial Locations: [15, 1, 15, 3] (light_x, light_y, camera_x, camera_y)
I chose this initialization because I wanted to ensure that both the light & camera start on the correct side of the glitter, and are generally one above the other, since I know that this is their general orientation with respect to the glitter.

Final Locations: [25, 30, 25, 10]
Actual Locations: [25, 10, 25, 30]
Final Error: 10

So, the camera and light got flipped here, which sensibly could happen because there is nothing in the error function to ensure that they don't get flipped like that.
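A small 2D check of why the flip is expected, in standard-library Python rather than MATLAB. It assumes the calculated normal is the normalized bisector of the centroid-to-light and centroid-to-camera directions; under that assumption the normal, and therefore the error, is unchanged when the two locations trade places. The line of ten centroids is made up for illustration.

```python
import math

def unit(v):
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length)

def bisector_normal(centroid, light, camera):
    # Assumption: mirror reflection, so the normal bisects the two directions.
    to_light = unit((light[0] - centroid[0], light[1] - centroid[1]))
    to_camera = unit((camera[0] - centroid[0], camera[1] - centroid[1]))
    return unit((to_light[0] + to_camera[0], to_light[1] + to_camera[1]))

def error(params, centroids, gt_normals):
    """Negative sum of dot products; params = (light_x, light_y, camera_x, camera_y)."""
    light, camera = params[0:2], params[2:4]
    total = 0.0
    for centroid, gt in zip(centroids, gt_normals):
        calc = bisector_normal(centroid, light, camera)
        total += gt[0] * calc[0] + gt[1] * calc[1]
    return -total

actual = (25.0, 10.0, 25.0, 30.0)
flipped = (25.0, 30.0, 25.0, 10.0)

# Hypothetical glitter: 10 centroids along the line x = 0.
centroids = [(0.0, float(y)) for y in range(0, 50, 5)]
gt_normals = [bisector_normal(c, actual[0:2], actual[2:4]) for c in centroids]

error_actual = error(actual, centroids, gt_normals)
error_flipped = error(flipped, centroids, gt_normals)
print(error_actual, error_flipped)
```

Both calls reach the same minimum, so surface normals alone cannot tell the optimizer which point is the light and which is the camera.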

3D Results

Coming soon!

Other things I am thinking about in the coming days/week

  1. I am not completely abandoning the beautiful ellipse equations that I have been working with. I am going to take some time to analyze the linear equations. One thing I will try to understand is whether there is variation in just 1 axis (what we want) or more than 1 axis (which would tell us that there is ambiguity in the data I am using when trying to define an ellipse).
  2. After I finish writing up the optimization simulations in both 2D and 3D, I will also try to analyze the effect that some noise in the surface normals may have on the results of the optimization.

In this post, I am going to run through everything that I have tried thus far, and show some pictures of the results that each attempt has rendered. I will first discuss the results of the experiments involving a set of vertically co-linear glitter, 10 centroids. Then, I will discuss the results of the experiments involving 5 pieces of glitter placed in a circle around the camera & light, such that the camera & light are vertically co-linear.

Calculation of Coefficients

In order to account for the surface normals having varying magnitudes (information that is necessary in determining which ellipse a piece of glitter lies on), I use the ratios of the surface normal's components and the ratios of the gradient's components (see previous post for derivation).

 

I construct the matrix A with one row per centroid, where (x, y) is the centroid's location and (SN_x, SN_y) is its surface normal:
A = [-2*SN_y * x, SN_x *x - SN_y *y, 2*SN_x *y, -SN_y, SN_x]
Each row is derived by expanding the equality of ratios and putting the equation in matrix form. So, we are solving the equation: Ax = 0, where x = [a, b, c, d, e] is the vector of unknown coefficients (not the x coordinate). In order to solve such a homogeneous equation, I am simply finding the null space of A, and then using some linear combination of these vectors to get my coefficients:
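Z = null(A);                 % orthonormal basis for the null space of A
temp = ones(size(Z,2),1);    % equal weight on every basis vector
C = Z*temp                   % coefficients [a; b; c; d; e]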

Using these values of coefficients, I can then calculate the value of f in the implicit equation by directly solving for it for each centroid.
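A sanity check of the row layout and of the step that solves for f, sketched in standard-library Python rather than MATLAB (so the null-space computation itself is not reproduced here). It uses a known ellipse, takes each surface normal to be the negative gradient, and confirms that the known coefficients satisfy every row. The function names and the test ellipse are hypothetical.

```python
import math

def constraint_row(x, y, sn_x, sn_y):
    """One row of A for a centroid at (x, y) with surface normal (sn_x, sn_y)."""
    return [-2.0 * sn_y * x, sn_x * x - sn_y * y, 2.0 * sn_x * y, -sn_y, sn_x]

def solve_f(coeffs, x, y):
    """Solve a x^2 + b xy + c y^2 + d x + e y + f = 0 for f at one centroid."""
    a, b, c, d, e = coeffs
    return -(a * x * x + b * x * y + c * y * y + d * x + e * y)

# Known ellipse ((x - 25) / 24)^2 + ((y - 20) / 26)^2 = 1, expanded.
coeffs = [1.0 / 576.0, 0.0, 1.0 / 676.0, -50.0 / 576.0, -40.0 / 676.0]
expected_f = 625.0 / 576.0 + 400.0 / 676.0 - 1.0

points = []
for k in range(5):
    t = 2.0 * math.pi * k / 5
    points.append((25.0 + 24.0 * math.cos(t), 20.0 + 26.0 * math.sin(t)))

a, b, c, d, e = coeffs
residuals = []
f_values = []
for x, y in points:
    gx = 2.0 * a * x + b * y + d  # gradient of the implicit function
    gy = b * x + 2.0 * c * y + e
    row = constraint_row(x, y, -gx, -gy)  # surface normal = negative gradient
    residuals.append(sum(r * q for r, q in zip(row, coeffs)))
    f_values.append(solve_f(coeffs, x, y))

print(residuals)
print(f_values, expected_f)
```

Since all five points lie on the same ellipse, they all give the same f; points on different ellipses would each give their own value.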

1. Glitter is vertically co-linear

For the first simulation, I placed the camera and light to the right of the glitter line, and vertically co-linear to each other, as seen in the figure to the left. Here, the red vectors are the surface normals of the glitter and the blue vectors are the tangents, or the gradients, as calculated based on the surface normals.

  1. Using the first 5 pieces of glitter:
    1. Coefficients are as follows:
      • a = -0.0333
      • b = 0
      • c = 0
      • d = 0.9994
      • e = 0
    2. No plot - all terms with a y are zeroed out, so there is nothing to plot. Clearly not right...
  2. Using the last 5 pieces of glitter:
    1. Same results as above.

This leads me to believe there is something unsuitable about the glitter being co-linear like this.
For the second simulation, I placed the camera and the light to the right of the glitter line, but here they are not vertically co-linear with each other, as you can see in the figure to the left.


  1. Using the first 5 pieces of glitter:
    1. Coefficients are as follows:
      • a = 0.0333
      • b = 0
      • c = 0
      • d = -0.9994
      • e = 0
    2. No plot - all terms with a y are zeroed out, so there is nothing to plot. Clearly not right...
  2. Using the last 5 pieces of glitter:
    1. Same results as above.

If we move the camera and light to the other side of the glitter, there is no change. Still same results as above.

2. Glitter is not vertically co-linear

In this experiment, the glitter is "scattered" around the light and camera as seen in the figure to the left.


I had a slight victory here - I actually got concentric ellipses in this experiment when I moved one of the centroids so that it was not co-linear with any of the others:


In the process of writing this post and running through all my previous failures, I found something that works; so, I am going to leave this post here. I am now working through different scenarios of this experiment and trying to understand how the linearity of the centroids affects the results (there is definitely something telling in the linearity and the number of centroids that are co-linear with each other). I will try to have another post up in the near future with more insight into this!

The last couple of days, I have focused on formally writing up the derivation in 2D of the constraints on the glitter that I am using in defining ellipses. I believe there is something wrong/incomplete in how I am thinking about the magnitude of the surface normals when using them to calculate the gradient vector. The difference in magnitude of the surface normals for each piece of glitter definitely has a bearing on the size of the ellipse associated with that piece of glitter.

I have attached my write-up to this post. In the write-up, there is a derivation of the constraints as well as my initial attempt at motivating this problem. I think I need to tie the motivation into the overall camera calibration problem instead of just talking about how the glitter can define ellipses.

Glitter_and_Ellipses-1xaxzep

My immediate next steps include re-working the last part of the derivation, the part which involves the magnitude of the surface normals (the ratio). I am also going to try to find other approaches to this problem. I REALLY believe the surface normals of the lit glitter are enough to determine the set of ellipses, so perhaps this implicit equation approach isn't the correct one! In the next day or so, I will put up a more comprehensive post on what results (including pretty/not-so-pretty pictures) I have achieved so far using the technique outlined in the write-up attached to this post.


The implicit equation for an ellipse looks like f(x, y) = ax^2 + bxy + cy^2 + dx + ey + f = 0 . The idea here is that if we have at least five pieces of glitter that are "on" for some location of the camera and light (unknown), and we know the surface normals of those pieces of glitter, then we can use that information to determine the values of the coefficients in the implicit equation, thus defining a set of concentric ellipses associated with our set of "on" glitter. Then, we think the two foci of these concentric ellipses (which will be the same for each ellipse in the set) will define the camera and light locations.

10 pieces of glitter are "on", and the light and camera lie along a vertical line.

In this image, we can see a sample simulation in which there are 10 pieces of glitter, all of which are "on", and the camera and light are located along a vertical line. Here, we expect to see a set of concentric, vertically oriented (major axis aligned with the y-axis) ellipses such that each ellipse is tangent to at least one piece of glitter.

Previously, we thought that this set of concentric ellipses may be defined by some a, b, c, d, and e that are fixed for the whole set of ellipses, and an f which is different for each of the ellipses. In other words, a, b, c, d, and e defined the shape, orientation and location of the ellipses and f defined the "size" of each ellipse.

I am starting to believe that this is not quite true, and that the "division of labor" of the coefficients is not so clearly defined. Perhaps it is the case that there is some function which defines how the coefficients are related to each other for a given set of concentric ellipses, but I am not sure what that function or relationship is.
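A quick numeric check that seems consistent with this suspicion, sketched in standard-library Python rather than MATLAB. Holding a, b, c, d, and e fixed and changing only f scales the ellipse about its center, which moves the foci, so a family that shares its foci cannot differ in f alone. The function below recovers the foci from the implicit coefficients using the center and the eigen-decomposition of the quadratic part; the function name and the test ellipse are hypothetical.

```python
import math

def conic_foci(a, b, c, d, e, f):
    """Foci of the ellipse a x^2 + b xy + c y^2 + d x + e y + f = 0."""
    det = 4.0 * a * c - b * b
    if det <= 0.0:
        raise ValueError("coefficients do not describe an ellipse")
    if a < 0.0:  # flip signs so the quadratic part is positive definite
        a, b, c, d, e, f = -a, -b, -c, -d, -e, -f
    # Center: the point where the gradient vanishes.
    x0 = (b * e - 2.0 * c * d) / det
    y0 = (b * d - 2.0 * a * e) / det
    f0 = f + 0.5 * (d * x0 + e * y0)  # value of the implicit function at the center
    if f0 >= 0.0:
        raise ValueError("ellipse is empty or degenerate")
    # Eigenvalues of [[a, b/2], [b/2, c]]; the smaller one gives the major axis.
    mean = 0.5 * (a + c)
    spread = math.hypot(0.5 * (a - c), 0.5 * b)
    if spread == 0.0:
        return (x0, y0), (x0, y0)  # circle: both foci sit at the center
    lam_small, lam_large = mean - spread, mean + spread
    focal = math.sqrt(-f0 / lam_small + f0 / lam_large)
    # Eigenvector for lam_small, with a fallback if the first form vanishes.
    vx, vy = 0.5 * b, lam_small - a
    if math.hypot(vx, vy) <= 1e-9 * spread:
        vx, vy = lam_small - c, 0.5 * b
    scale = focal / math.hypot(vx, vy)
    return ((x0 - scale * vx, y0 - scale * vy), (x0 + scale * vx, y0 + scale * vy))

# Hypothetical ellipse with foci at (25, 10) and (25, 30):
# ((x - 25) / 24)^2 + ((y - 20) / 26)^2 = 1, expanded.
a, c = 1.0 / 576.0, 1.0 / 676.0
d, e = -50.0 / 576.0, -40.0 / 676.0
f = 625.0 / 576.0 + 400.0 / 676.0 - 1.0

foci = sorted(conic_foci(a, 0.0, c, d, e, f))
shifted = sorted(conic_foci(a, 0.0, c, d, e, f - 1.0))  # same a..e, different f
print(foci)
print(shifted)
```

With the original f the foci come out at the light and camera positions; with only f changed, the center stays put but the foci slide further apart along the major axis.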