Tutorial: Batch Generator Output Transform Function
In this tutorial we will explore the output_transform_func parameter of the BatchGenerator class and how you can write your own custom code to handle any type of input before feeding the data to the neural network.
This is a more in-depth tutorial on one piece of Ketos functionality.
Note
You can download an executable version (Jupyter Notebook) of this tutorial
here
.
Tutorial: Creating an Output Transform Function¶
Overview¶
The output_transform_func of the BatchGenerator class is a powerful tool that allows the data and labels from a dataset to be transformed right before they are fed to a deep neural network (DNN). This can save us a lot of time pre-processing the data to comply with the format expected by the NN architecture. While Ketos includes a default transform function that works for many cases, it can and should be overridden to accommodate any transformations required. Common operations are reshaping input arrays and parsing or one-hot encoding the labels.
In this tutorial, you will learn how to use the BatchGenerator transform function to adjust your data to the format expected by the NN. The tutorial is divided into two parts:
- Overview of the transform batch function
- Mapping labels to numeric classes
Creating a Transform Function¶
In most applications, the data we feed to a neural network during training can be represented with one or two dimensions. This is the case with audio, either as a waveform (1-D) or as a spectrogram representation (2-D). This data is then fed to the NN in batches. Ketos' BatchGenerator provides an interface that helps us easily split our data into batches and automatically feed those batches to the NN.
Let's first create some sample 2-D data to work with. For the sake of this tutorial, our data is randomly generated but the same steps would apply to spectrograms.
import numpy as np
# Create 100 5x5 arrays
inputs = np.random.rand(100,5,5)
inputs.shape
(100, 5, 5)
The NN architectures provided with Ketos expect multidimensional array data in the following format: (batch_size, width, height, channels). As we can see from the shape above, our data is not yet in this format. Here we have two options: we can either transform the input data before passing it to the NN, or do it when the batches are created during training, by using the output_transform_func parameter of the BatchGenerator class. The main disadvantage of the first approach is that we would have to duplicate our entire dataset any time we need to adapt the input to a NN specification. By instead delaying the transformation until the data is loaded into memory, we can efficiently change how our data is transformed by simply modifying the transform function.
Let us create a transform function that takes the input we just created and transforms it to (batch_size, width, height, 1). We use a constant 1 for the number of channels because we are processing spectrograms, which are grayscale.
def my_output_transform_func(x):
    # we are transforming our input x to the desired shape
    X = x.reshape(x.shape[0], x.shape[1], x.shape[2], 1)
    return X
output = my_output_transform_func(inputs)
output.shape
(100, 5, 5, 1)
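As a side note, the same result can be obtained with np.expand_dims, which appends a singleton channel axis without spelling out the other dimensions; a small sketch:

```python
import numpy as np

inputs = np.random.rand(100, 5, 5)

# Append a singleton channel axis at the end; this is equivalent
# to reshaping the array to (100, 5, 5, 1)
output = np.expand_dims(inputs, axis=-1)
```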
While the transformation we performed is quite simple, the output transform function can accommodate any other operations and transformations.
In addition to the input data, we have to pass labels associated with our data during training. The output transform function can also transform the labels into the right format. NNs often expect the labels to be one-hot encoded during training.
Let us create some labels for our data:
# Creating labels
ones = np.ones(50, dtype=int) # 50 entries will have label 1
zeroes = np.zeros(50, dtype=int) # 50 entries will have label 0
labels = np.concatenate((zeroes, ones))
Now, in our output transform function, let us transform the labels we just created into one-hot encoded vectors.
def _to1hot(class_label, n_classes=2):
    """ Create the one-hot representation of class_label

        Args:
            class_label: int
                An integer number representing the class label
            n_classes: int
                The number of classes available

        Returns:
            one_hot: numpy.array
                The one-hot representation of the class_label in a 1 x n_classes array.

        Examples:
            >>> _to1hot(class_label=0, n_classes=2)
            array([1., 0.])
            >>> _to1hot(class_label=1, n_classes=2)
            array([0., 1.])
            >>> _to1hot(class_label=1, n_classes=3)
            array([0., 1., 0.])
            >>> _to1hot(class_label=1, n_classes=5)
            array([0., 1., 0., 0., 0.])
    """
    one_hot = np.zeros(n_classes)
    one_hot[class_label] = 1.0
    return one_hot
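Incidentally, the same encoding can be computed for a whole array of labels at once by indexing into an identity matrix, a vectorized alternative to calling a helper like the one above in a loop:

```python
import numpy as np

labels = np.array([0, 1, 1, 0])

# Row i of the n_classes x n_classes identity matrix is the one-hot
# vector for class i, so fancy indexing encodes every label at once
one_hot = np.eye(2)[labels]
```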
def my_output_transform_func(x, y):
    X = x.reshape(x.shape[0], x.shape[1], x.shape[2], 1) # transforming the data to (batch_size, width, height, 1)
    Y = np.array([_to1hot(class_label=label, n_classes=2) for label in y]) # one-hot encoding the labels
    return (X, Y)
output = my_output_transform_func(inputs, labels)
output[1][0]
array([1., 0.])
We now have all the pieces to use the BatchGenerator to automatically create the batches for us. Let us create batches of 10.
from ketos.data_handling.data_feeding import BatchGenerator # importing the BatchGenerator class from ketos
generator = BatchGenerator(batch_size=10, x=inputs, y=labels, output_transform_func=my_output_transform_func)
batch_X, batch_Y = next(generator)
batch_X.shape
(10, 5, 5, 1)
batch_Y
array([[1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.]])
Now, any time we wish to adapt our input to a new specification, we can simply modify the output transform function we created to accommodate the changes, instead of having to pre-process our entire dataset.
For details on training a neural network using the BatchGenerator, please refer to the Training a Binary ResNet Classifier tutorial.
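To make this concrete, here is a sketch of what the transform function might look like if a hypothetical channels-first architecture expected input of shape (batch_size, channels, width, height) instead; only the reshape changes, while the dataset itself is untouched (the function name here is illustrative, not part of Ketos):

```python
import numpy as np

def channels_first_transform_func(x, y):
    # Hypothetical variant: insert the channel axis right after the
    # batch axis, giving (batch_size, 1, width, height)
    X = x.reshape(x.shape[0], 1, x.shape[1], x.shape[2])
    Y = np.eye(2)[y]  # one-hot encode the labels, as before
    return (X, Y)

inputs = np.random.rand(100, 5, 5)
labels = np.concatenate((np.zeros(50, dtype=int), np.ones(50, dtype=int)))
X, Y = channels_first_transform_func(inputs, labels)
```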
Mapping labels to numerical classes¶
The one-hot encoding function we created above requires that our labels be consecutive integers starting from 0 (0, 1, 2, ..., n).
If we try to pass labels that don't follow this pattern, an error is thrown. Let us create labels 2 and 3 instead of 0 and 1.
# generating random data
inputs = np.random.rand(20,5,5)
# Creating labels
twos = np.full(10, 2, dtype=int)
threes = np.full(10, 3, dtype=int)
labels = np.concatenate((twos, threes))
labels
array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])
output = my_output_transform_func(inputs, labels)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In [40], line 1
----> 1 output = my_output_transform_func(inputs, labels)

Cell In [31], line 33, in my_output_transform_func(x, y)
     31 def my_output_transform_func(x, y):
     32     X = x.reshape(x.shape[0], x.shape[1], x.shape[2], 1) # transforming the data to (batch_size, width, height, 1)
---> 33     Y = np.array([_to1hot(class_label=label, n_classes=2) for label in y]) # one hot-encoding the labels
     34     return (X, Y)

Cell In [31], line 33, in <listcomp>(.0)
     31 def my_output_transform_func(x, y):
     32     X = x.reshape(x.shape[0], x.shape[1], x.shape[2], 1) # transforming the data to (batch_size, width, height, 1)
---> 33     Y = np.array([_to1hot(class_label=label, n_classes=2) for label in y]) # one hot-encoding the labels
     34     return (X, Y)

Cell In [31], line 28, in _to1hot(class_label, n_classes)
      2 """ Create the one hot representation of class_label
      3
      4 Args:
   (...)
     25     array([0., 1., 0., 0., 0.])
     26 """
     27 one_hot = np.zeros(n_classes)
---> 28 one_hot[class_label]=1.0
     29 return one_hot

IndexError: index 2 is out of bounds for axis 0 with size 2
To accommodate these labels, we can easily modify our transform function to map whatever labels we have onto consecutive integers starting from 0.
unique_labels = [2, 3]

def my_output_transform_func(x, y):
    X = x.reshape(x.shape[0], x.shape[1], x.shape[2], 1) # transforming the data to (batch_size, width, height, 1)
    # Creating the mapping dict. Label 2 will be assigned to 0 and label 3 will be assigned to 1
    mapper = dict(zip(unique_labels, np.arange(len(unique_labels)))) # mapper = {2: 0, 3: 1}
    y = [mapper[label] for label in y] # mapping the labels
    Y = np.array([_to1hot(class_label=label, n_classes=2) for label in y]) # one-hot encoding the labels
    return (X, Y)
output = my_output_transform_func(inputs, labels)
output[1]
array([[1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [0., 1.],
       [0., 1.],
       [0., 1.],
       [0., 1.],
       [0., 1.],
       [0., 1.],
       [0., 1.],
       [0., 1.],
       [0., 1.],
       [0., 1.]])
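If the set of labels is not known ahead of time, np.unique with return_inverse=True can derive both the unique labels and the 0-based class indices in a single call; a sketch of a more general mapping:

```python
import numpy as np

labels = np.array([2, 2, 3, 3, 7])

# return_inverse gives, for each entry, its index into the sorted
# array of unique labels -- exactly the consecutive 0, 1, 2, ...
# class indices the one-hot encoder expects
unique_labels, class_indices = np.unique(labels, return_inverse=True)
one_hot = np.eye(len(unique_labels))[class_indices]
```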