Basics of Convolutional Neural Networks — Convolutions and Padding
The convolutional neural network (CNN) is a specialized type of neural network designed for working with two-dimensional image data. Because detailed images contain a very large number of pixels, their input dimensionality is very high. CNNs use convolutional operations that analyze small groups of neighboring pixels, which makes fitting neural networks to large images feasible. Because of these properties, CNNs are widely used for image classification, object detection in images, and neural style transfer.
Convolutional Layers
To build a CNN, you start by initializing a sequential model and then add layers to it. Besides dense layers and the dropout layers between them, a CNN relies on other layer types, most importantly convolutional layers. Convolutional layers are the major building blocks of convolutional neural networks. A convolution is the application of a filter to an input, which produces an activation.
A convolution is a linear operation that multiplies an array of input data by a two-dimensional array of weights, called a filter, which is typically smaller than the input. At each position, the filter is multiplied element-wise with a filter-sized patch of the input and the products are summed, so one application of the filter yields a single value. Applying the same filter repeatedly across the input, position by position, produces a map of activations called a feature map.
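The operation described above can be sketched in plain Python. This is a minimal illustration rather than how a deep learning library implements it: the function name conv2d_valid, the 4x4 input, and the 3x3 vertical-edge filter are all hypothetical, and the sketch assumes a stride of 1, no padding, and no kernel flipping (which is how CNN layers usually apply a filter).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.

    Both arguments are lists of lists of numbers. The kernel is not
    flipped, which matches the usual CNN convention.
    """
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1

    feature_map = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Multiply one filter-sized patch by the filter element-wise,
            # then sum the products into a single activation value.
            total = 0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 input: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 3x3 filter that responds to vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[3, 3], [3, 3]]
```

Note that the 4x4 input produces only a 2x2 feature map, because the 3x3 filter fits in just two positions along each axis.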
Padding
The feature map is smaller than the input, and this reduction in size is referred to as border effects. It is caused by the interaction of the filter with the border of the image: the filter has to fit entirely inside the image, so pixels at the edge can never sit at its center. Fortunately, padding solves this problem by adding a border of extra pixels, usually zeros, around the edges of the image so that the feature map keeps the size of the input. For a 3x3 filter, one layer of pixels is enough. With this border in place, each pixel of the original image can be the center of a convolution window.
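To make the size arithmetic concrete, here is a small sketch in plain Python. The helper names zero_pad and output_size and the 5x5 example are hypothetical, and the sketch assumes zero-valued padding, a square input, a square filter, and a stride of 1.

```python
def zero_pad(image, pad):
    """Return a copy of `image` (a list of lists) with `pad` layers of zeros."""
    width = len(image[0])
    blank_row = [0] * (width + 2 * pad)
    padded = [blank_row[:] for _ in range(pad)]          # top border
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)  # left/right border
    padded.extend(blank_row[:] for _ in range(pad))       # bottom border
    return padded


def output_size(input_size, filter_size, pad):
    """Side length of the feature map for stride 1."""
    return input_size + 2 * pad - filter_size + 1


# Hypothetical 5x5 image filled with ones, and a 3x3 filter.
image = [[1] * 5 for _ in range(5)]
padded = zero_pad(image, 1)

size_without_padding = output_size(5, 3, pad=0)  # border effect: 5 shrinks to 3
size_with_padding = output_size(5, 3, pad=1)     # one layer of padding keeps 5

print(len(padded), len(padded[0]))               # padded image is 7x7
print(size_without_padding, size_with_padding)
```

Under these assumptions, a filter of odd side length k needs (k - 1) / 2 layers of padding to preserve the input size, which is why a single layer suffices for a 3x3 filter.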