This notebook contains optional study material. You are not required to work through it in order to meet the learning objectives or complete the assessments associated with this module.

Brief overview: the notebook demonstrates how to generate additional image/label pairs for training and testing by perturbing the images in the original training set.

4 Testing a network against synthetic data (optional)

One of the problems associated with training neural networks is that we often need a lot of data in order to be able to train the network effectively.

One way of increasing the amount of data available to us is to generate synthetic data to supplement our collected datasets. This synthetic data may be derived from our original datasets, or created ‘from scratch’.

For example, with the MNIST handwritten digit image dataset, we can create derived datasets by cropping the original digits and translating them around the 28 × 28 pixel training image view. We could create completely synthetic data by taking images from computer fonts, and perhaps adding noise to them, to create new digit training images.
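As a purely illustrative sketch of the second idea (it assumes PIL/Pillow and NumPy are available, and that a TrueType font file such as DejaVuSans.ttf can be found on the system; none of this comes from the nn_tools package), we might render a digit from a computer font and then add some random pixel noise:

import numpy as np
from PIL import Image, ImageDraw, ImageFont

def synthetic_digit(digit, font_path='DejaVuSans.ttf', noise=20):
    """Render a digit from a computer font into a 28x28 image and add pixel noise."""
    img = Image.new('L', (28, 28), 0)  # 28x28 greyscale image with a black background
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(font_path, 20)  # assumption: this font file can be located
    draw.text((7, 2), str(digit), fill=255, font=font)
    # Perturb each pixel by a small random amount, clipping back into the 0-255 range
    noisy = np.array(img, dtype=np.int16) + np.random.randint(-noise, noise + 1, (28, 28))
    return Image.fromarray(np.clip(noisy, 0, 255).astype(np.uint8))

synthetic_digit(3)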

This notebook contains quite a lot of complex code, but you are not expected to be able to write code of this complexity, nor even necessarily to understand it. Instead, it is provided as a demonstration of what sorts of steps are required to perform particular actions, and what sort of code can be used to implement those steps.

So treat the notebook as if you were being given a tour of a working engineering lab: skim over the code definitions (unless you are particularly interested) and just observe the effects of calling the functions that have been implemented.

4.1 Creating new images from old – translating cropped images

To start with, let’s load in the original MNIST data as a list of images, along with the corresponding MNIST labels, using a function based on the code we used in the previous notebook:

from nn_tools.sensor_data import load_MNIST_images_and_labels

images_list, labels = load_MNIST_images_and_labels()

We can grab random samples from this data as before:

from nn_tools.sensor_data import get_random_image
                 
(test_image, test_label) = get_random_image(images_list, labels, show=True)

You have already seen how we can get rid of the ‘background’ columns and rows around the outside of an image using the nn_tools.sensor_data.trim_image() function:

from nn_tools.sensor_data import trim_image

# Pass the parameter show=False to hide the display
# of the untrimmed and trimmed dataframes
trimmed_image_df = trim_image( test_image, background=0)
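Under the hood, trimming amounts to cropping the pixel dataframe down to the bounding box of its non-background values. A minimal sketch of that idea (not necessarily how trim_image() actually implements it) might look like this:

import numpy as np

def trim_background(df, background=0):
    """Crop a pixel dataframe to the bounding box of its non-background pixels."""
    mask = df.values != background
    rows = np.flatnonzero(mask.any(axis=1))  # row positions containing digit pixels
    cols = np.flatnonzero(mask.any(axis=0))  # column positions containing digit pixels
    # Assumes the image contains at least one non-background pixel
    return df.iloc[rows.min():rows.max() + 1, cols.min():cols.max() + 1]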

We can also convert this dataframe back into an image:

from nn_tools.sensor_data import image_from_df, zoom_img

cropped_image = image_from_df(trimmed_image_df)
zoom_img( cropped_image )

cropped_image.size

If we create a blank image of size 28 × 28 pixels with a grey background, then we can paste a copy of our cropped image into it. The nn_tools.sensor_data.jiggle() function implements this approach. It will accept an image and then return a randomly translated version of it.
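A minimal sketch of that pasting step (assuming PIL/Pillow; the real jiggle() also takes care of cropping the digit for you, and may use different background and frame settings) might look something like this:

import random
from PIL import Image

def simple_jiggle(cropped_image, size=(28, 28), background=128):
    """Paste a cropped digit at a random position within a blank background frame."""
    frame = Image.new('L', size, background)  # blank grey frame (the actual grey level may differ)
    # Choose a random offset that keeps the whole digit inside the frame
    max_x = size[0] - cropped_image.size[0]
    max_y = size[1] - cropped_image.size[1]
    frame.paste(cropped_image, (random.randint(0, max_x), random.randint(0, max_y)))
    return frame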

Run the following cell several times to see the effect of the jiggle() function on a test image.

from nn_tools.sensor_data import jiggle

jiggled_image =  jiggle(test_image)
zoom_img(jiggled_image)

# Show size
jiggled_image.size

4.1.1 Activity – Translating the digit within the image frame

Explore how the sensor_data.jiggle() function works in practice. Run the following cell multiple times, observing what happens in each case, to see how the differently translated versions of the image are returned each time the function is called. Is there much variation in how the digit is centred in the image frame?

from nn_tools.sensor_data import jiggle

# Setting quiet=False displays the original input image
# as well as returning the jiggled image as cell output
zoom_img( jiggle(test_image, quiet=False) )

Record any notes or observations here.

Example discussion

Click on the arrow in the sidebar or run this cell to reveal an example discussion.

The jiggle function translates the digit slightly left, right, up or down within the image area. However, the digit never seems to be translated so far that parts of it get cut off.

4.2 Testing the MLP against derived images

Load in the MLP you saved at the end of the last notebook:

from joblib import load

MLP = load('mlp_mnist_28x28.joblib')

Check that it still works with the original dataset:

from nn_tools.network_views import test_and_report_random_images

test_and_report_random_images(MLP,
                              get_random_image, images_list, labels,
                              num_samples=100)

4.2.1 Activity – Testing the MLP against translated digit images

Now let’s see how well our trained MLP responds to translated versions of the original training images.

Start by testing the network against one of the original images:

from nn_tools.network_views import predict_and_report_from_image

test_image, test_label = get_random_image(images_list, labels)

predict_and_report_from_image(MLP, test_image,
                              test_label, quiet=False, confidence=True)

How does the trained MLP fare if we translate the image? Run the following cell several times and see if the MLP continues to classify the digit correctly.

predict_and_report_from_image(MLP, test_image, test_label,
                              jiggled=True, quiet=False, confidence=True)

Record your observations about how well the MLP performs against the translated images here. Why do you think the network is performing the way it does?

Example discussion

Click on the arrow in the sidebar or run this cell to reveal an example discussion.

When I tested the network against the translated/jiggled images, I found that it wasn’t very reliable at classifying them.

Although the digits are the same size as the original digits, the original MLP has no real sense of how the pixels representing the digits relate to each other according to their relative location.

Instead, it is looking for pixels that overlap the pixels representing the digit that were presented in the original training set. If we translate the digit in the image frame, it may end up overlapping the pixels associated with the original location of another digit more than it overlaps the pixels associated with its own originally located image.

4.3 Creating more new images from old – zooming cropped images

Another way of transforming the cropped images is to magnify them back to the original image size. (The rescaling employs a digital filter that is used to interpolate new pixel values in the scaled image based on the pixel values in the cropped image. As such, it may introduce digital artefacts of its own into the scaled image.)
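The core of that transformation can be sketched in a few lines of PIL code (an illustration of the idea rather than the actual crop_and_zoom_to_fit() implementation; it assumes the image background is zero-valued):

from PIL import Image

def simple_crop_and_zoom(image):
    """Crop an image to its non-zero bounding box, then resize it back to the original size."""
    cropped = image.crop(image.getbbox())  # bounding box of the non-background pixels
    # The resize step interpolates new pixel values, which may introduce artefacts
    return cropped.resize(image.size)

The notebook's crop_and_zoom_to_fit() helper packages up this sort of operation: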

from nn_tools.sensor_data import crop_and_zoom_to_fit

crop_zoomed_image = crop_and_zoom_to_fit(test_image)

zoom_img( crop_zoomed_image )

# Show size
crop_zoomed_image.size

To simplify the process of applying these transformations to a test image, we can call the predict_and_report_from_image() function with the jiggled=True and cropzoom=True parameters:

# Test a jiggled version of the provided image
predict_and_report_from_image(MLP, test_image, test_label,
                              jiggled=True, quiet=False)

# Test a cropped and then zoomed version of the provided image
predict_and_report_from_image(MLP,
                              test_image, test_label,
                              cropzoom=True, quiet=False)

You can also pass the zoomview=True parameter to view the image at a larger scale.

4.3.1 Activity – Rescaling the digit within the image frame

How well does the trained MLP work if we rescale the image by cropping it and then zooming it to fit the original image size?

predict_and_report_from_image(MLP,
                              test_image, test_label,
                              cropzoom=True, quiet=False, confidence=True)

Record your observations about how well the MLP performs against the rescaled images here. Why do you think the network is performing the way it does?

Example discussion

Click on the arrow in the sidebar or run this cell to reveal an example discussion.

When I tested the network against the resized images, I found that it wasn’t very reliable at classifying them.

The original MLP has no real sense of how images are scaled across the presented image frame: it is looking for pixels that overlap the pixels representing the digits that were presented in the original training set.

4.3.2 Activity – Testing the network against lots of jiggled and zoomed images

Run various combinations of the following test code to see how well the network behaves when tested against lots of transformed images. How well does it perform?

test_and_report_random_images(MLP, 
                              get_random_image, images_array=images_list, labels=labels,
                              num_samples=100, jiggled=True, cropzoom=False)

4.4 Training the network on transformed images

The MNIST images we have been provided with each have dimensions of 28 × 28 pixels. If we want to try to classify a handwritten digit image using the MLP trained against these MNIST images, we need to resize the image to the same size.

In a later notebook, you will be using an MLP to try to classify handwritten digit images scanned in from the simulator. These image scans are 14 × 14 pixels in size. If we were to resize those collected images up to 28 × 28 pixels and then present them to our network, the scaling up might introduce digital artefacts that affect the classification.

So instead, let’s take the opportunity now to create an MLP trained on resized handwritten images, scaled down to a size of 14 × 14 pixels. This will also give us another opportunity to review the process of training an MLP.

To begin with, let’s create a set of resized images using the following pipeline (a rough sketch of these steps is shown after the list):

  • generate an image from the image array data

  • resize the image from 28 × 28 pixels down to 14 × 14 pixels

  • convert the resized image to a black-and-white image using a specified threshold.
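As a rough illustration of what those steps might look like for a single PIL image (an assumption-laden sketch, not the actual resized_images_pipeline() implementation, whose threshold and other details may differ):

def simple_resize_pipeline(image, size=(14, 14), threshold=128):
    """Resize a greyscale PIL image and convert it to black and white using a threshold."""
    resized = image.resize(size)
    # Map each pixel to black (0) or white (255) depending on the threshold value
    return resized.point(lambda p: 255 if p > threshold else 0)

The nn_tools package provides a helper that applies this kind of pipeline to every image in a list: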

from nn_tools.network_views import resized_images_pipeline

resized_images = resized_images_pipeline(images_list, size=(14, 14))

Now create the initial network architecture. We have simplified the data both by reducing the dimensions of the images (and hence the number of input nodes required) and by moving from a discrete greyscale image representation to a binary black-and-white one.

So let’s use a simpler network.

Let’s try just a single hidden layer of 20 neurons to start with. We can use the quick_progress_tracked_training() function as a simple way of creating an MLP and tracking its training progress:

from nn_tools.network_views import quick_progress_tracked_training

# Network architecture and training settings
hidden_layer_sizes = (20,)  # a single hidden layer of 20 neurons
max_iterations = 150

# Hold back the last 100 images as unseen test data; train on the rest
test_limit = 100
train_limit = len(resized_images) - test_limit

# Training images and labels
resized_training_images = resized_images[:train_limit]
training_labels = labels[:train_limit]

# Previously unseen test images and labels
resized_testing_images = resized_images[train_limit:]
testing_labels = labels[train_limit:]

MLP2 = quick_progress_tracked_training(resized_training_images, training_labels,
                                 hidden_layer_sizes=hidden_layer_sizes,
                                 # top up an existing pretrained MLP: MLP=MLP,
                                 max_iterations=max_iterations,
                                 loss=True, # show loss function
                                 structure=True # show network params
                                )
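For orientation, it seems likely that quick_progress_tracked_training() wraps a scikit-learn MLPClassifier (the MLP is saved and loaded with joblib, and the hidden_layer_sizes and max_iterations settings mirror scikit-learn's parameters), although that is an assumption. A bare-bones sketch of training that kind of classifier directly might look like this:

from sklearn.neural_network import MLPClassifier
from nn_tools.sensor_data import get_images_features

# Convert the resized training images into flat, normalised feature vectors
X_train = get_images_features(resized_training_images, normalise=True)

# A single hidden layer of 20 neurons, trained for up to 150 iterations
mlp = MLPClassifier(hidden_layer_sizes=(20,), max_iter=150, verbose=True)
mlp.fit(X_train, training_labels)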

Record your observations here about how well the MLP performs during training.

How well does the network perform on the unseen test samples?

from nn_tools.network_views import test_and_report_image_data
from nn_tools.sensor_data import get_images_features

resized_testing_data = get_images_features(resized_testing_images, normalise=True)
test_and_report_image_data(MLP2, resized_testing_data, testing_labels)
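If you want to check the headline figure yourself, a direct calculation along the following lines should broadly agree with the report (assuming MLP2 behaves like a fitted scikit-learn classifier with a predict() method):

import numpy as np

# Fraction of the unseen test images that the network classifies correctly
predictions = MLP2.predict(resized_testing_data)
accuracy = np.mean(predictions == np.array(testing_labels))
print(f'Accuracy on unseen test data: {accuracy:.1%}')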

Record your observations here about how well the network performs on the previously unseen data.

How well does this network perform on jiggled images?

test_and_report_random_images(MLP2, 
                              get_random_image, images_array=resized_testing_images, labels=testing_labels,
                              num_samples=100, jiggled=True, cropzoom=False)

Record your observations here about how well the network performs when tested with the jiggled image data.

When I tried, it appeared not to perform very well at all with the jiggled images.

Let’s save this network:

from joblib import dump

# Save the network
dump(MLP2, 'mlp_mnist14x14.joblib') 

4.4.1 Rapidly training the MLP

Use the following ipywidgets application to easily try out different network structures when training the MLP using the resized image training data.

# Train the network

from ipywidgets import interact_manual

MLP3=None

@interact_manual(iterations=(100, 2000, 100),
                 h1=(0, 10, 1), h2=(0, 10, 1))
def trainer(iterations=100, h1=6, h2=6, updater=False):
    global MLP3
    # Build the hidden layer sizes from the slider values,
    # dropping any layer whose size is set to zero
    layer_sizes = tuple(h for h in (h1, h2) if h)
    MLP3 = quick_progress_tracked_training(resized_training_images, training_labels,
                                 hidden_layer_sizes=layer_sizes,
                                 max_iterations=iterations,
                                 # Tick 'updater' to top up the existing MLP3
                                 # rather than training a new network from scratch
                                 MLP=MLP3 if updater else None,
                                 loss=True, # show loss function
                                 structure=True # show network params
                                )

Test the new network using the previously unseen test data:

test_and_report_image_data(MLP3, resized_testing_data, testing_labels)

Feel free to experiment with training and testing the network using different setups. But don’t spend too much time playing!

Record any notes and observations you care to make about your experimentation here.

Save the trained network:

dump(MLP3, 'resized_images_MLP.joblib') 

4.5 Summary

In this notebook, you have seen how we can create synthetic training data derived from the original dataset, in particular by cropping the original handwritten digits and then translating them to a new location within the original 28 × 28 pixel image frame, or zooming the cropped image to fit the original image frame.

You also saw how we can create a derived dataset from the original images consisting of images in a smaller image frame (14 × 14 rather than 28 × 28 pixels) and then train a new MLP on that. As a result of reducing the number of input features, we could also get away with using a smaller neural network to recognise the supplied images.

On testing the network trained on the original MNIST data against the translated and zoomed images, you also saw how its performance was considerably degraded.

In the next notebook, you will learn how we can improve the network’s performance by finding a new way of presenting the images to the network using many fewer, but far more relevant, input features.