Convolutional Neural Network

Raw image -> Convolution -> ReLU -> Pool -> Convolution -> ReLU -> Pool -> FC -> Softmax

Convolution:

Convolution is a mathematical way of combining two signals to form a third signal. It is one of the most important techniques in digital signal processing. Using the strategy of impulse decomposition, a linear system is described by a single signal called its impulse response; convolving an input signal with the impulse response gives the output of the system.
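To make the idea of "combining two signals" concrete, here is a minimal sketch of a discrete 1D convolution in plain Python. The function name `convolve_1d` and the two-point averaging impulse response are my own choices for illustration, not something taken from a library.

```python
def convolve_1d(signal, impulse_response):
    """Full discrete convolution: y[n] = sum over k of signal[k] * impulse_response[n - k]."""
    n_out = len(signal) + len(impulse_response) - 1
    output = [0.0] * n_out
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            # Each input sample contributes a scaled, shifted copy of the impulse response
            output[i + j] += s * h
    return output


signal = [1.0, 2.0, 3.0]
impulse_response = [0.5, 0.5]  # hypothetical system: average of two neighbouring samples
smoothed = convolve_1d(signal, impulse_response)
print(smoothed)  # [0.5, 1.5, 2.5, 1.5]
```

The output is longer than the input because the impulse response "spills over" at both ends of the signal.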

Here is an interesting write-up on the convolution of images:

http://www.songho.ca/dsp/convolution/convolution.html#convolution_2d

Hence,

Input image * kernel = new form of the image (blurred, sharpened, embossed, edge-detected, and more), where * denotes convolution.

In image processing, a kernel (also called a convolution matrix or mask) is a small matrix. It is used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by convolving the kernel with the image.
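As a rough sketch of what "convolving a kernel with an image" means, the snippet below slides a 3x3 kernel over a tiny 4x4 image using plain Python lists. The helper name `apply_kernel` and the toy image are assumptions made for this example. Like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def apply_kernel(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the element-wise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    result = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        result.append(row)
    return result


# Toy 4x4 image: dark left half (0), bright right half (1)
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Prewitt-style kernel that responds to vertical edges
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

edges = apply_kernel(image, vertical_edge)
print(edges)  # [[3, 3], [3, 3]]
```

Every window in this small image contains the dark-to-bright edge, so every output cell shows a strong positive response.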

The kernel is described in more detail here:

Animation: http://setosa.io/ev/image-kernels/

Description: https://en.wikipedia.org/wiki/Kernel_(image_processing)

In our case, the kernel is a feature detector. Convolving the image with it produces a feature map that preserves the features the detector responds to and discards most of the rest. This mirrors how we look at images: we do not take in every detail, only what is important. For example, to recognise a dog we keep the features that characterise the dog and suppress the background.

Likewise, we can create several feature maps from a single image by applying different feature detectors.
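A small sketch of that idea: two different kernels applied to the same image give two different feature maps. The helper `feature_map`, the kernel names, and the toy image are all made up for illustration, and the helper assumes a square kernel.

```python
def feature_map(image, kernel):
    """Valid, stride-1 sliding-window sum of products; assumes a square kernel."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# Toy image: dark top half (0), bright bottom half (1)
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]

# Hypothetical bank of feature detectors
kernels = {
    "vertical_edge": [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    "horizontal_edge": [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
}

feature_maps = {name: feature_map(image, kernel) for name, kernel in kernels.items()}
print(feature_maps["vertical_edge"])    # [[0, 0], [0, 0]]
print(feature_maps["horizontal_edge"])  # [[3, 3], [3, 3]]
```

The image only contains a horizontal edge, so only the horizontal detector responds. A convolutional layer with 32 filters does the same thing with 32 kernels whose values are learned during training.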

The following page shows how a convolution kernel can change an image:

https://docs.gimp.org/en/plug-in-convmatrix.html

ReLU (Rectified Linear Unit):

f(x) = x when x > 0

f(x) = 0 when x <= 0

Rectifier :

In the context of artificial neural networks, the rectifier is an activation function defined as:

f(x)=max(0,x)

where x is the input to a neuron. This is also known as a ramp function and is analogous to half-wave rectification in electrical engineering.
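In code the rectifier is a one-liner. This is just a sketch in plain Python; the name `relu` is my own choice.

```python
def relu(x):
    """Rectifier: returns x for positive inputs and 0 otherwise."""
    return max(0.0, x)


values = [-2.0, -0.5, 0.0, 0.5, 2.0]
print([relu(v) for v in values])  # [0.0, 0.0, 0.0, 0.5, 2.0]
```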

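Half-wave rectification: the negative half-cycles of an AC signal are blocked and only the positive half-cycles pass through, so the output swings between 0 V and the peak voltage instead of between the negative and positive peaks.

The rectifier in a neural network works the same way: negative inputs are cut to zero and positive inputs pass through unchanged.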

A unit employing the rectifier is also called a rectified linear unit (ReLU). A smooth approximation to the rectifier is the analytic function

f(x) = ln(1 + e^x)

which is called the softplus function. The derivative of softplus is

f'(x) = e^x / (e^x + 1) = 1 / (1 + e^(-x)), i.e. the logistic function.
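The claim that the derivative of softplus is the logistic function can be checked numerically. The sketch below compares a central-difference estimate of the slope of softplus with the logistic function at a few points; the helper names and the step size `h` are assumptions chosen for this example.

```python
import math


def softplus(x):
    """Smooth approximation of the rectifier: ln(1 + e^x)."""
    return math.log1p(math.exp(x))


def logistic(x):
    """Logistic (sigmoid) function: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


for x in (-2.0, 0.0, 3.0):
    slope = numerical_derivative(softplus, x)
    # The two columns should agree to several decimal places
    print(x, round(slope, 6), round(logistic(x), 6))
```

For large positive x the slope approaches 1 and for large negative x it approaches 0, which is exactly the behaviour of the rectifier that softplus smooths out.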

Rectified linear units find applications in computer vision [3] and speech recognition [8][9] using deep neural nets.

More on Activation Function:

https://www.youtube.com/watch?v=9vB5nzrL4hY

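In our case, ReLU is applied to each feature map: every negative value is set to zero and positive values are left unchanged. This removes the gradual transitions from black to white (or white to black) that the convolution leaves behind, so the feature map changes more abruptly and becomes more non-linear.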
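Applied to a whole feature map, this looks roughly as follows. The feature map values are invented for the example, and `relu_map` is a hypothetical helper name.

```python
def relu_map(feature_map):
    """Apply ReLU element-wise: negative responses become 0, the rest are kept."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical feature map with both negative and positive responses
feature_map = [
    [-3, -1, 2],
    [-2, 0, 4],
    [-1, 1, 3],
]

rectified = relu_map(feature_map)
print(rectified)  # [[0, 0, 2], [0, 0, 4], [0, 1, 3]]
```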

Pooling:

Input images are not always oriented the same way, so the orientation and placement of features such as the eyes, the ears, or the tear-like lines on a face may differ from image to image. Pooling mitigates this problem: small variations in where a feature appears no longer change the prediction.

There are different variations of pooling; max pooling is sufficient for now.

Max pooling:

Here we slide a small window (the inner square) across the feature map, take the maximum value inside the window at each position, and write it into the corresponding cell of the pooled feature map on the right.

Hence, if the 4 in the yellow region shifts by one or two positions within its window, the pooled feature map does not change.
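The shift tolerance of max pooling is easy to verify on a small example. This is a sketch assuming a 2x2 window moved with stride 2 (the same setting as `pool_size = (2, 2)` with default strides in Keras); the feature map values are made up.

```python
def max_pool(feature_map, size=2, stride=2):
    """Take the maximum of each size x size window, moving the window by `stride`."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


original = [
    [4, 0, 1, 0],
    [0, 1, 0, 2],
    [1, 0, 0, 1],
    [0, 3, 1, 0],
]

# Same map, but the 4 has moved one position to the right inside its window
shifted = [
    [0, 4, 1, 0],
    [0, 1, 0, 2],
    [1, 0, 0, 1],
    [0, 3, 1, 0],
]

print(max_pool(original))  # [[4, 2], [3, 1]]
print(max_pool(shifted))   # [[4, 2], [3, 1]]
```

The pooled map is a quarter of the size of the input and is identical in both cases, which is the invariance described above.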

So far we have covered convolution, ReLU, and pooling.

Flattening:

The pooled feature maps are flattened into a single long vector before being passed to the input layer of the ANN.
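Flattening simply reads the values of every pooled feature map one after another. A minimal sketch with two invented 2x2 pooled maps:

```python
def flatten(pooled_maps):
    """Concatenate all values of all pooled feature maps into one flat list."""
    return [value for fmap in pooled_maps for row in fmap for value in row]


# Two hypothetical pooled feature maps of size 2x2
pooled_maps = [
    [[4, 2], [3, 1]],
    [[0, 5], [6, 0]],
]

vector = flatten(pooled_maps)
print(vector)       # [4, 2, 3, 1, 0, 5, 6, 0]
print(len(vector))  # 8
```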

To sum up, the steps so far are convolution, ReLU, pooling, and flattening.

Full Connection:
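In the full connection step, the flattened vector is fed into ordinary dense layers: each neuron computes a weighted sum of its inputs plus a bias and passes it through an activation function. The sketch below shows one such layer with a single sigmoid output; the weights, bias, and input values are invented so that the arithmetic is easy to follow, whereas a real network learns them during training.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for every neuron."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


flattened = [1.0, 0.0, 2.0]     # hypothetical flattened feature vector
weights = [[0.5, -1.0, 0.25]]   # one output neuron, made-up weights
biases = [-1.0]                 # made-up bias

probability = dense(flattened, weights, biases, sigmoid)[0]
print(probability)  # 0.5
```

Here the weighted sum is 0.5 + 0.0 + 0.5 - 1.0 = 0, and the sigmoid of 0 is 0.5, so this toy neuron is undecided between the two classes.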

The complete Keras code is shown below:

# Convolutional Neural Network
# Installing Theano
# pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git
# Installing Tensorflow
# Install Tensorflow from the website: https://www.tensorflow.org/versions/r0.12/get_started/os_setup.html
# Installing Keras
# pip install --upgrade keras
# Part 1 - Building the CNN
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

# Initialising the CNN
classifier = Sequential()

# Step 1 - Convolution: 32 feature detectors of size 3x3 on 64x64 RGB images
classifier.add(Convolution2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))

# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a second convolutional layer, followed by its own pooling layer
classifier.add(Convolution2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Step 3 - Flattening
classifier.add(Flatten())

# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
# A single sigmoid unit outputs the probability of the class with index 1
classifier.add(Dense(units = 1, activation = 'sigmoid'))

# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

# Part 2 - Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
                                   
test_datagen = ImageDataGenerator(rescale = 1./255)


training_set = train_datagen.flow_from_directory('Your training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')

test_set = test_datagen.flow_from_directory('Your test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')

# With batch_size = 32, 8000 training images give 250 batches per epoch
# and 2000 validation images give 62 full batches
output = classifier.fit_generator(training_set,
                                  steps_per_epoch = 8000 // 32,
                                  epochs = 25,
                                  validation_data = test_set,
                                  validation_steps = 2000 // 32)

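# 'Your Random input data' is a placeholder; the images must sit in a subfolder of it
test_dog = test_datagen.flow_from_directory('Your Random input data',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')

predictions = classifier.predict_generator(test_dog, steps = 1, verbose = 0)

# The sigmoid output is the probability of the class with index 1 in
# training_set.class_indices (assumed here to be the dog class; folders are indexed alphabetically)
for i in range(0, 1):
    dog_probability = predictions[i, 0]
    if dog_probability >= 0.5:
        print('I am {:.0%} sure this is a Dog'.format(dog_probability))
    else:
        print('I am {:.0%} sure this is a Cat'.format(1 - dog_probability))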
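As a quick sanity check on the model above, the size of each layer's output can be worked out by hand. The sketch below assumes the Keras defaults used in the code: convolutions without padding and with stride 1, and 2x2 max pooling with a stride equal to the pool size. The helper names are my own.

```python
def conv_output(size, kernel=3):
    """Output width/height of a convolution with no padding and stride 1."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Output width/height of max pooling with stride equal to the pool size."""
    return size // pool


size = 64                   # input images are 64x64
size = conv_output(size)    # first convolution  -> 62
size = pool_output(size)    # first pooling      -> 31
size = conv_output(size)    # second convolution -> 29
size = pool_output(size)    # second pooling     -> 14

flattened_length = size * size * 32   # 32 feature maps of 14x14
print(size, flattened_length)         # 14 6272
```

So the first dense layer with 128 units receives a vector of 6272 values for every image.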

Ashis Parajuli Blogs
