
Ideas for Reverse phenotyping

Data we have:

X: Scans from different days and different plots

Y1 (labels): Date-plot labels

Y2 (features): A small fraction of hand-measured data (distributed unevenly across dates)

Target:

Predict the features (leaf length, canopy cover, plant count, etc.) for each plot (or scan image).

Brief Structure:

  1. Use the scan images and date-plot labels to train a CNN that embeds the images into a high-dimensional space.
  2. Use the hand-measured features to find a (transformed) subspace of the embedded space that describes each feature.
  3. Use that space to interpolate the features of the unmeasured images.
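The three steps above can be sketched end to end. This is a minimal numpy sketch under two assumptions: the CNN embeddings are already computed (random vectors stand in for them here), and the subspace in step 2 is linear, so it reduces to least squares on the labeled subset. All sizes and names are hypothetical.

```python
import numpy as np

# Hypothetical setup: embeddings for all scans (stand-ins for CNN outputs),
# but hand measurements (Y2) for only a small labeled subset.
rng = np.random.default_rng(0)
n_scans, emb_dim, n_features = 200, 64, 3  # e.g., leaf length, canopy cover, plant count
embeddings = rng.normal(size=(n_scans, emb_dim))

# Step 2: fit a linear map from the embedded space to the measured features,
# using only the labeled subset (here 20 of 200 scans; Y2 is synthetic).
labeled = rng.choice(n_scans, size=20, replace=False)
true_W = rng.normal(size=(emb_dim, n_features))
features_labeled = embeddings[labeled] @ true_W  # synthetic hand measurements

W, *_ = np.linalg.lstsq(embeddings[labeled], features_labeled, rcond=None)

# Step 3: interpolate features for every scan, measured or not.
predicted = embeddings @ W
print(predicted.shape)  # (200, 3)
```

With fewer labeled points (20) than embedding dimensions (64), the system is underdetermined and `lstsq` returns the minimum-norm solution; this is exactly the regime the notes below worry about.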

Assumption

Hidden features: while training on some explicit labels, the CNN can implicitly learn other features; i.e., with no or only a few transformations of the embedded space, there is a mapping from that space to a linear feature space.

Ideas

  1. Embedding part:
  • The label is continuous rather than class-like. Embedding a dataset like this with a triplet or N-pair loss does not seem reasonable; I think we need a criterion that relates to continuous labels.
  • More criteria: since we have more features than just date and plot, is it possible to minimize over several criteria at once?
  • Single image as a class: non-parametric instance-level discrimination. Maybe it could find some interesting structure in the dataset.
  2. If no transformation is needed:
  • Find the dimensions of the embedded space that depend most on the feature.
  3. If the transformation is linear:
  • Linear regression with points in the embedded space as inputs and the features as targets
  • PCA
  4. If the transformation is non-linear:
  • k-NN regression (not working)

k-NN regression runs into the curse of dimensionality. For a single point in the embedded space, acquiring a feature requires a large k; otherwise it always tends to predict from the nearest cluster. But we don't have enough Y2 to support a large k.
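The failure mode is easy to reproduce: with only a handful of labeled points (stand-ins for the few Y2 measurements) in a high-dimensional space, distances concentrate, so the nearest labeled neighbor is barely closer than the farthest one and k-NN averages over near-arbitrary points. A toy numpy illustration with made-up sizes:

```python
import numpy as np

# 30 labeled points in a 128-dimensional embedded space, plus one query
# (an unmeasured scan). All points are random; sizes are hypothetical.
rng = np.random.default_rng(0)
emb_dim, n_labeled = 128, 30
labeled = rng.normal(size=(n_labeled, emb_dim))
query = rng.normal(size=emb_dim)

dists = np.linalg.norm(labeled - query, axis=1)
contrast = dists.min() / dists.max()  # 1.0 would mean "all neighbors equally far"
print(round(contrast, 2))
```

In low dimensions this ratio is typically small; here it comes out close to 1, i.e., "nearest" carries almost no information.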

  • More layers for different features:

Attach a few small dedicated layers per feature on top of the shared network (inspired by R-CNN).

  • NICE:

Apply NICE on top of the embedded space, then find the directions that best describe the features.

An issue we raised in the paper discussion is that NICE and GLOW focus on fixed dimensions: the images used to demonstrate GLOW are all well aligned, in that the parts of a face (eyes, nose, etc.) always appear in approximately the same region of the image, i.e., the same dimensions of the vector. But in the embedded space, each dimension should already have some fixed meaning. So, is it possible to use this kind of model to map the embedded space into another space with independent and continuous dimensions?
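For reference, the building block of NICE is a single additive coupling layer, which is exactly invertible no matter what the coupling function is. A small numpy sketch (not the proposed model, just the mechanism):

```python
import numpy as np

# One additive coupling layer in the style of NICE: split the vector in half,
# pass one half through unchanged, and shift the other half by a function m()
# of the first. Inversion never requires inverting m itself.
rng = np.random.default_rng(0)
dim = 8
W = rng.normal(size=(dim // 2, dim // 2))  # weights of the coupling net m()

def m(h):
    return np.tanh(h @ W)  # any function of the first half works

def forward(x):
    x1, x2 = x[: dim // 2], x[dim // 2 :]
    return np.concatenate([x1, x2 + m(x1)])

def inverse(y):
    y1, y2 = y[: dim // 2], y[dim // 2 :]
    return np.concatenate([y1, y2 - m(y1)])

x = rng.normal(size=dim)
print(np.allclose(inverse(forward(x)), x))  # True
```

The split into "pass-through" and "shifted" halves is exactly the fixed-dimension behavior discussed above: which dimensions get transformed is decided by the partition, not learned.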

  • Supervised Manifold learning:

Most manifold learning methods are unsupervised, but is it possible to add some constraints? For example, find a 2D manifold of a 3D sphere whose x-axis is based on latitude.
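A toy version of that sphere example in numpy: points on a 3D sphere are mapped to 2D coordinates whose first axis is constrained to be the latitude. Here the constraint is enforced directly by construction; a real supervised manifold method would instead add it as a penalty term in the learning objective.

```python
import numpy as np

# Sample points on the unit sphere.
rng = np.random.default_rng(0)
n = 500
pts = rng.normal(size=(n, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

# Supervised axis: latitude from the z-coordinate; free axis: longitude.
latitude = np.arcsin(pts[:, 2])
longitude = np.arctan2(pts[:, 1], pts[:, 0])
embedding_2d = np.column_stack([latitude, longitude])

# The supervised axis matches latitude perfectly, by construction.
corr = np.corrcoef(embedding_2d[:, 0], latitude)[0, 1]
print(round(corr, 3))  # 1.0
```

The open question is how to get this behavior when the supervision (the few Y2 labels) only covers a fraction of the points.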
