
Training and visualization of ResNet18 on Terra data

The Terra dataset contains 350,000 sorghum images taken from day 0 to day 57. Images from every 3 consecutive days are grouped into one class, forming 19 classes in total. Samples from each class are shown below:
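The grouping of days into classes can be sketched as a small mapping function. This is only an illustration: `day_to_class` is a hypothetical helper, and because days 0 to 57 span 58 days while 19 classes of 3 days cover only 57, the sketch assumes the leftover day 57 is folded into the last class. The post does not say how that day is actually handled.

```python
DAYS_PER_CLASS = 3
NUM_CLASSES = 19
FIRST_DAY, LAST_DAY = 0, 57


def day_to_class(day: int) -> int:
    """Map a growth day to a class index in [0, NUM_CLASSES - 1].

    ASSUMPTION: day 57 does not fit into 19 groups of 3 days, so it is
    clamped into the last class here. The real dataset may differ.
    """
    if not FIRST_DAY <= day <= LAST_DAY:
        raise ValueError(f"day must be in [{FIRST_DAY}, {LAST_DAY}], got {day}")
    return min(day // DAYS_PER_CLASS, NUM_CLASSES - 1)


if __name__ == "__main__":
    for d in (0, 2, 3, 29, 56, 57):
        print(d, "->", day_to_class(d))
```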

All images are randomly divided into a training set and a test set at a ratio of 8:2. A ResNet18 pre-trained on ImageNet is fine-tuned on the training set (lr = 0.01, 30 epochs). The training history of the network (with and without epoch 0) is shown below:

  1. At epoch 0, train_acc and test_acc are both about 5%, which is chance level for 19 classes (1/19 ≈ 5.3%): the network is effectively guessing at random
  2. The first 3 epochs dramatically push train_acc and test_acc to 80%
  3. The network converges to train_acc = 95% and test_acc = 90%

The confusion matrix on test set is the following:

When the network makes a wrong prediction, it usually assigns the sorghum image to a neighboring class.
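One way to quantify this observation is to count how many misclassifications land exactly one class away from the true class. The sketch below uses plain Python lists. `confusion_matrix` and `neighbor_error_fraction` are illustrative names, and the labels in the usage example are made up, not taken from the experiment.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts samples with true class t that were predicted as p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def neighbor_error_fraction(cm):
    """Fraction of all errors whose predicted class is adjacent to the true one."""
    errors = neighbor_errors = 0
    for t, row in enumerate(cm):
        for p, count in enumerate(row):
            if p == t:
                continue  # diagonal entries are correct predictions
            errors += count
            if abs(p - t) == 1:
                neighbor_errors += count
    return neighbor_errors / errors if errors else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = [0, 0, 1, 2, 2, 3]
    y_pred = [0, 1, 1, 3, 0, 3]
    cm = confusion_matrix(y_true, y_pred, num_classes=4)
    print(neighbor_error_fraction(cm))
```

A value close to 1 would mean that almost every error is a confusion between adjacent 3-day windows.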

Several examples of wrong predictions are shown below:

At (2,4) and (5,5) the network does not even predict a neighboring class. These images are not very 'typical' of their class, but the predictions are still hard to explain.

At (4,6) the image is 'typical' of class 1 but is predicted as class 5, which is mysterious.

DeepDream is applied to the network to reveal what it has learned:

The structure of ResNet18 is as follows:

The input image is optimized to amplify the output of each of the conv2_x, conv3_x, conv4_x, conv5_x and fc layers:

original image:

conv2,3,4,5:

fc layer:

As the receptive field increases, the network appears to learn more complex local structure (the small patches become less similar to one another) rather than global structure (a recognizable plant). Maybe local texture is good enough to classify the images?
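To put numbers on how the receptive field grows, the theoretical receptive field at the end of each stage can be computed from kernel sizes and strides alone. The sketch below follows only the main path of the standard torchvision ResNet18 layout (7x7 stride-2 conv, 3x3 stride-2 max pool, then four stages of two basic blocks, with the stride on the first conv of stages conv3_x to conv5_x). The function names are illustrative. This is the theoretical value, which says nothing about which pixels the network actually relies on.

```python
def receptive_fields(layers):
    """Return (name, receptive field) after each (name, kernel, stride) layer."""
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    result = []
    for name, kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
        result.append((name, rf))
    return result


def resnet18_layers():
    """Main-path layers of ResNet18 (shortcut 1x1 convs are ignored)."""
    layers = [("conv1", 7, 2), ("maxpool", 3, 2)]
    for stage, first_stride in (("conv2_x", 1), ("conv3_x", 2),
                                ("conv4_x", 2), ("conv5_x", 2)):
        # Two basic blocks per stage, two 3x3 convs per block.
        strides = [first_stride, 1, 1, 1]
        layers += [(stage, 3, s) for s in strides]
    return layers


def stage_receptive_fields():
    """Receptive field at the output of each named stage."""
    final = {}
    for name, rf in receptive_fields(resnet18_layers()):
        final[name] = rf  # later layers of the same stage overwrite earlier ones
    return final


if __name__ == "__main__":
    for name, rf in stage_receptive_fields().items():
        print(f"{name}: {rf} x {rf} pixels")
    # conv2_x: 43, conv3_x: 99, conv4_x: 211, conv5_x: 435
```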

 

 
