https://www.youtube.com/watch?v=5ResNQwydQg
The internet is a strange place. I've talked about using reaction videos as a source of free labels: train a deep network to map faces into an embedding space where faces captured at the same moment in aligned videos land in the same place. Why is that good? There is a lot of work on emotion recognition, but it is largely limited to categories like "Happy", "Sad", or "Angry", and it often only works on pretty extreme facial expressions. Real expressions are more subtle, shaded, and interesting, but hardly anybody works with them because there are no labels. We don't have strong labels, but we do have weak labels (these images should have the same label).
And, lucky us! Someone has already made a reaction video montage, like the one above, with all the videos aligned (crazy!).
And not just one. Here is another:
https://www.youtube.com/watch?v=u_jgBySia0Y
And not just two, but literally hundreds of them:
https://www.youtube.com/channel/UC7uz-e_b68yIVocAKGdN9_A/videos
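To make the weak-label idea a bit more concrete, here is a rough sketch of how aligned reaction videos could be turned into training triplets. This is just one common way to do it (a triplet loss), not necessarily the final recipe. The function names, the `min_gap` parameter, and the margin value are all made up for illustration, and the sketch assumes that faces far apart in time make reasonable negatives, which is my assumption and not something the videos guarantee. The embedding network itself is left out; plain lists stand in for its outputs.

```python
import math
import random


def sample_triplet(num_reactors, num_frames, min_gap, rng):
    """Pick (anchor, positive, negative) as (reactor, frame) index pairs.

    Anchor and positive: two different reactors at the SAME frame index,
    which is the weak label ("these images should have the same label").
    Negative: any reactor at a frame at least `min_gap` frames away
    (assumption: distant moments tend to show different expressions).
    """
    t = rng.randrange(num_frames)
    a, p = rng.sample(range(num_reactors), 2)
    far = [u for u in range(num_frames) if abs(u - t) >= min_gap]
    if not far:
        raise ValueError("min_gap is too large for this video length")
    return (a, t), (p, t), (rng.randrange(num_reactors), rng.choice(far))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss on embeddings: the positive should be closer to the
    anchor than the negative is, by at least `margin`."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_triplet(num_reactors=4, num_frames=100, min_gap=30, rng=rng))
```

In a real pipeline each index pair would be swapped for a face crop from that reactor's panel at that frame, run through the network, and the resulting embeddings fed to the loss.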