Last week I decided to pursue the optimization route to try to find the light & camera locations simultaneously. This post will focus on the progress and results thus far in the 3D optimization simulation!
In my simulation, there are 10,000 centroids all arranged in a grid on a plane (the pink shaded plane in the image below). There is a camera (denoted by the black dot) and a light (denoted by the red dot). I generate a random screen map - a list of positions on the monitor (blue shaded plane) such that each position on the monitor corresponds to a centroid. I use this screen map and the centroid locations to calculate the actual surface normal of each centroid - we will refer to these as the ground truth normals.
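To make the ground truth normal calculation concrete, here is a minimal sketch. It assumes each centroid acts as an ideal mirror, so its normal bisects the direction to the camera and the direction to its screen map position; the post does not spell out the formula, so treat this as one plausible reading. The function names and the example coordinates are hypothetical.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def mirror_normal(centroid, camera, target):
    """Unit normal that reflects a ray from `target` off `centroid` into `camera`.

    Assumption: ideal mirror reflection, so the normal is the normalized sum
    of the two unit directions (the half-vector). It is undefined when the
    two directions are exactly opposite.
    """
    to_camera = normalize(tuple(c - p for c, p in zip(camera, centroid)))
    to_target = normalize(tuple(t - p for t, p in zip(target, centroid)))
    return normalize(tuple(a + b for a, b in zip(to_camera, to_target)))

# Hypothetical positions chosen so the answer is easy to check by eye:
# camera and screen point are mirror images about the z axis.
centroid = (0.0, 0.0, 0.0)
camera = (1.0, 0.0, 1.0)
screen_point = (-1.0, 0.0, 1.0)
normal = mirror_normal(centroid, camera, screen_point)  # points along +z
```

The same function gives the "calculated" normal if the light position is passed in place of the screen map position.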
Then, I assume that all of the centroids are reflecting the point light (red dot), and calculate the surface normals of the centroids under this assumption - we will refer to these as the calculated normals. The centroids which are considered to be "lit" are those whose ground truth normals are very close to their calculated normals. I check this with the dot product, keeping all centroids whose dot product is > 0.999, which corresponds to the two normals being within about 2.6 degrees of each other.
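The "lit" test can be sketched as a simple threshold on the dot product. This assumes both sets of normals are already unit length; the function name and the example normals are hypothetical. The last line converts the 0.999 threshold into an angle, which comes out to a little over 2.5 degrees.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def lit_indices(truth_normals, calculated_normals, threshold=0.999):
    """Indices of centroids whose two unit normals agree to within the threshold."""
    return [
        i
        for i, (truth, calc) in enumerate(zip(truth_normals, calculated_normals))
        if dot(truth, calc) > threshold
    ]

# Hypothetical example: the second calculated normal is tilted by 5 degrees,
# so only the first centroid passes the test.
tilt = math.radians(5.0)
truth_normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
calculated_normals = [(0.0, 0.0, 1.0), (math.sin(tilt), 0.0, math.cos(tilt))]
lit = lit_indices(truth_normals, calculated_normals)

# Angle (in degrees) that a dot product of 0.999 corresponds to.
threshold_angle_deg = math.degrees(math.acos(0.999))
```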
This visualization shows the centroids which are "lit" by the light and the rays from those centroids to their corresponding screen map location. As expected, all of these centroids have screen map locations which are very close to the light.
To optimize, I initialize my camera and light locations to something reasonable, and then minimize my error function.
Error Function
In each iteration of the optimization, I have some current best camera location and current best light location. Using these two locations, I can calculate the surface normals of each lit centroid - call these calculated normals. I then take the dot product of each ground truth normal with its calculated normal, and sum over all of these centroids. Since the normals are unit length, each centroid's dot product can contribute no more than 1 to the final sum, so the sum is at most numCentroids and reaches that value only when every pair of normals agrees exactly. So, I minimize the function:
numCentroids - sum(dot(ground truth normals, calculated normals))
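Here is a minimal sketch of this error function, under the same ideal-mirror assumption as before (the calculated normal bisects the directions to the camera and to the light). For simplicity it treats every centroid as lit. The helper names and the small centroid grid are hypothetical; the camera and light positions are the actual and initial values listed in the results.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def direction(origin, target):
    """Unit vector pointing from origin to target."""
    return normalize(tuple(t - o for o, t in zip(origin, target)))

def mirror_normal(centroid, camera, light):
    """Assumed model: the normal is the half-vector between the two directions."""
    to_camera = direction(centroid, camera)
    to_light = direction(centroid, light)
    return normalize(tuple(a + b for a, b in zip(to_camera, to_light)))

def alignment_error(camera, light, centroids, truth_normals):
    """numCentroids - sum(dot(ground truth normals, calculated normals))."""
    total = 0.0
    for centroid, truth in zip(centroids, truth_normals):
        calculated = mirror_normal(centroid, camera, light)
        total += sum(t * c for t, c in zip(truth, calculated))
    return len(centroids) - total

# Hypothetical 4x4 grid of centroids on the z = 0 plane.
centroids = [(float(x), float(y), 0.0)
             for x in range(0, 100, 25) for y in range(0, 100, 25)]
actual_camera = (100.0, 80.0, 110.0)
actual_light = (110.0, 30.0, 57.5)
truth_normals = [mirror_normal(p, actual_camera, actual_light) for p in centroids]

error_at_truth = alignment_error(actual_camera, actual_light, centroids, truth_normals)
error_at_guess = alignment_error((50.0, 60.0, 80.0), (80.0, 50.0, 50.0),
                                 centroids, truth_normals)
error_swapped = alignment_error(actual_light, actual_camera, centroids, truth_normals)
```

In this sketch the error is essentially zero at the true positions and larger at the initial guess. It is also unchanged when the camera and light are swapped, because the half-vector is symmetric in its two directions. If the real calculation has the same symmetry, that would be consistent with the camera and light sometimes getting flipped.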
Results
No pictures because my current visualizations don't do this justice - I need to work on figuring out better ways to visualize this after the optimization is done running/while the optimization is happening (as a movie or something).
Initial Light: [80 50 50]
Initial Camera: [50 60 80]
Final Light: [95.839 80.2176 104.0960]
Final Camera: [118.3882 26.4220 61.7301]
Actual Light: [110 30 57.5]
Actual Camera: [100 80 110]
Final Error: 0.0031
Error in Lit Centroids: 0.0033
Discussion/Next Steps
1. Sometimes the light and camera locations get flipped in the optimization - this is to be expected because right now there is nothing constraining which is which. The run above is an example: the final light is close to the actual camera, and the final camera is close to the actual light. Is there something I can add to my error function to actually constrain this, ideally using only the surface normals of the centroids?
2. The optimization still performs worse than I would want or expect. It is possible that it is falling into a local minimum and stopping there, so this is something I need to look at more.
3. It is unclear how much the accuracy (or lack thereof) of the ground truth surface normals affects the error. I want to try perturbing the ground truth surface normals by some small amount (pretending we know there is some amount of error in the surface normals, which in real life there probably is), and then see how the optimization does. I'm not entirely sure what the best way to do this is, and I am also not sure how to go about measuring the effect.
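One possible way to perturb the normals, offered only as a sketch: add independent Gaussian noise to each component and renormalize, then report the angle between each original and noisy normal as a measure of how much error was injected. The noise model, the value of sigma, and the function names are all assumptions for illustration, not something settled in this post.

```python
import math
import random

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def perturb_normal(normal, sigma, rng):
    """Add Gaussian noise (std dev sigma) to each component, then renormalize."""
    noisy = tuple(c + rng.gauss(0.0, sigma) for c in normal)
    return normalize(noisy)

def angle_between_deg(a, b):
    """Angle in degrees between two unit vectors, clamped for float safety."""
    d = sum(x * y for x, y in zip(a, b))
    d = max(-1.0, min(1.0, d))
    return math.degrees(math.acos(d))

rng = random.Random(0)  # fixed seed so the experiment is repeatable
sigma = 0.01            # hypothetical noise level
normals = [(0.0, 0.0, 1.0)] * 1000
noisy_normals = [perturb_normal(n, sigma, rng) for n in normals]

angles = [angle_between_deg(n, m) for n, m in zip(normals, noisy_normals)]
mean_angle_deg = sum(angles) / len(angles)
```

For small sigma the angular deviation is on the order of sigma radians, so sweeping sigma and recording the final error and the recovered camera and light positions at each level would be one way to measure sensitivity.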