Convolutional Neural Network
Raw image -> Convolution -> ReLU -> Pool -> Convolution -> ReLU -> Pool -> FC -> softmax
Convolution:
Convolution is a mathematical way of combining two signals to form a third signal. It is the single most important technique in Digital Signal Processing. Using the strategy of impulse decomposition, systems are described by a signal called the impulse response.
This is an interesting blog on convolution of Images :
Hence,
Input Image *(convolution)* Kernel = New form of Image (blurring, sharpening, embossing, edge detection, and more)
In image processing, a kernel (also called a convolution matrix or mask) is a small matrix. It is used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by convolving the kernel with an image.
The kernel is described here :
So, in our case, we use a feature detector as the kernel to preserve the features of the image. We do not actually look at everything in an image; we only notice what is important. For example, if we are looking for a dog in an image, we keep the features of the dog and discard the background using this convolution procedure :
Likewise, we can create more feature maps from a single image, one per feature detector:
We can see how a convolution kernel can change our image :
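To make the effect of different kernels concrete, here is a minimal NumPy sketch. The helper `convolve2d`, the toy 5x5 image, and the three example kernels are illustrative assumptions, not taken from the original post; each kernel produces its own feature map from the same image.

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid-mode 2D convolution of a grayscale image with a small kernel."""
    # Flip the kernel in both directions: this is what makes it a true convolution
    kernel = np.flipud(np.fliplr(kernel))
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the window element-wise with the kernel and sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image (assumption): dark left part (0), bright right part (1)
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Example kernels (assumption): each one yields a different feature map
kernels = {
    "identity": np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float),
    "box_blur": np.ones((3, 3)) / 9.0,
    "vertical_edge": np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=float),
}

feature_maps = {name: convolve2d(image, k) for name, k in kernels.items()}

for name, fmap in feature_maps.items():
    print(name)
    print(fmap)
```

With this toy image the vertical-edge kernel responds only where the dark and bright parts meet, while the identity kernel returns the central 3x3 crop unchanged. CNN libraries usually skip the kernel flip (cross-correlation); since the kernels are learned, this makes no practical difference.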
ReLU (Rectified Linear Unit):
f(x) = x when x > 0
f(x) = 0 when x <= 0
Rectifier :
f(x)=max(0,x)
where x is the input to a neuron. This is also known as a ramp function and is analogous to half-wave rectification in electrical engineering.
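As a small illustration, the rectifier can be written in one line with NumPy. The function name `relu` and the sample feature-map values are assumptions chosen for the example.

```python
import numpy as np

def relu(x):
    """Rectifier: element-wise max(0, x)."""
    return np.maximum(0, x)

# Hypothetical feature-map values containing negative entries
feature_map = np.array([[-3.0, 0.0, 2.5],
                        [1.0, -0.5, 4.0]])

activated = relu(feature_map)
print(activated)
```

Negative entries become 0 and positive entries pass through unchanged, which is exactly the ramp shape described above.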
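Half-wave rectification: here the alternating +ve and -ve cycles of an AC signal are converted into a DC signal ranging from 0 V to the peak voltage (Pv), because the negative half-cycles are cut off at 0 V.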
Hence, Rectifier in Neural network example:
A unit employing the rectifier is also called a rectified linear unit (ReLU). A smooth approximation to the rectifier is the analytic function
f(x)=\ln(1+e^{x})
which is called the softplus function. The derivative of softplus is
f'(x)=\frac{e^{x}}{1+e^{x}}=\frac{1}{1+e^{-x}}
i.e. the logistic function.
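A quick numerical check can make this relationship tangible: the derivative of softplus is the logistic (sigmoid) function. The sketch below assumes NumPy; the function names, the sample points, and the step size `h` are illustrative choices.

```python
import numpy as np

def softplus(x):
    # ln(1 + e^x), computed in a numerically stable way as logaddexp(0, x)
    return np.logaddexp(0.0, x)

def sigmoid(x):
    # Logistic function 1 / (1 + e^-x)
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
h = 1e-5

# Central finite difference approximates the derivative of softplus
numeric_grad = (softplus(x + h) - softplus(x - h)) / (2 * h)

max_error = np.max(np.abs(numeric_grad - sigmoid(x)))
print(max_error)
```

The printed error should be tiny, which is consistent with the derivative of softplus being the logistic function.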
Rectified linear units find applications in computer vision[3] and speech recognition[8][9] using deep neural nets.
More on Activation Function:
In our case, ReLU is applied to the feature map produced by the convolution. It sets every negative value to 0, which removes the gradual transitions from black to white or white to black, so that only the sharp changes remain.
POOLING:
Input images are not always oriented the same way, so the orientation and placement of features such as eyes, ears, or tear-like lines may differ. To mitigate this problem we use pooling, so that variations of an image with the same features do not lead to different predictions.
Eg:
Solution :
There are different variations of pooling; among them, max pooling is good enough for now.
Max pooling :
Here we keep shifting the inner square across the feature map, take the maximum value inside that box, and write it to the pooled feature map on the right.
Hence, if the 4 in the yellow part is rotated by 1 to 2 positions clockwise, the pooled feature map doesn't change.
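The same idea can be sketched in NumPy. This is an illustrative example under stated assumptions: a 2x2 window moved with stride 2 (matching `pool_size = (2, 2)` in the Keras code of this post), a made-up 4x4 feature map, and a hypothetical helper `max_pool`.

```python
import numpy as np

def max_pool(feature_map, size=2, stride=2):
    """Max pooling: keep the largest value in each size x size window."""
    out_h = (feature_map.shape[0] - size) // stride + 1
    out_w = (feature_map.shape[1] - size) // stride + 1
    pooled = np.zeros((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            pooled[i, j] = feature_map[r:r + size, c:c + size].max()
    return pooled

# Hypothetical feature map
feature_map = np.array([[0, 1, 0, 0],
                        [0, 4, 2, 1],
                        [1, 0, 0, 2],
                        [0, 1, 3, 0]])

# Same map, but the 4 has moved to another cell of its 2x2 window
shifted = feature_map.copy()
shifted[1, 1] = 0
shifted[0, 0] = 4

pooled = max_pool(feature_map)
pooled_shifted = max_pool(shifted)
print(pooled)
print(pooled_shifted)
```

Both inputs give the same pooled feature map, which is the small-shift tolerance described above. It only holds while the feature stays inside the same pooling window.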
So, up to now we are here :
Flattening :
The pooled feature maps are flattened, as shown below, before being sent to the input layer of the ANN.
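In code, flattening is just a reshape into a one-dimensional vector. The sketch below assumes NumPy and uses made-up pooled values; the variable names are illustrative.

```python
import numpy as np

# One hypothetical 2x2 pooled feature map
pooled = np.array([[4, 2],
                   [1, 3]])

# Row by row: [[4, 2], [1, 3]] becomes [4, 2, 1, 3]
flat = pooled.flatten()
print(flat)

# Several pooled maps stacked along the last axis (height, width, maps)
stacked = np.stack([pooled, pooled * 2, pooled * 3], axis=-1)

# All maps together become one long input vector for the ANN
stacked_flat = stacked.reshape(-1)
print(stacked_flat.shape)
```

The length of the flattened vector is height x width x number of feature maps, which fixes the size of the input layer of the ANN.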
And hence, all steps are summed up here :
Full Connection:
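The full connection step can be sketched in plain NumPy as one hidden layer followed by a single sigmoid output, mirroring the two Dense layers in the Keras code of this post. The layer sizes (16 inputs, 8 hidden units) and the random weights are hypothetical stand-ins; a trained network would learn the weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the flattened pooled feature maps (assumption: 16 values)
flat = rng.random(16)

# Hypothetical, untrained weights and biases
W1 = rng.normal(scale=0.1, size=(16, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 1))
b2 = np.zeros(1)

# Hidden fully connected layer with ReLU activation
hidden = np.maximum(0, flat @ W1 + b1)

# Output layer with sigmoid activation: a single value between 0 and 1
prob = 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))
print(prob)
```

Every input value is connected to every hidden unit, which is why the layer is called fully connected.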
The complete code is shown below:
# Convolutional Neural Network

# Installing Theano
# pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git

# Installing Tensorflow
# Install Tensorflow from the website: https://www.tensorflow.org/versions/r0.12/get_started/os_setup.html

# Installing Keras
# pip install --upgrade keras

# Note: the calls below use the Keras 2 argument names
# (Conv2D, units, steps_per_epoch, epochs, validation_steps).

# Part 1 - Building the CNN

# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

# Initialising the CNN
classifier = Sequential()

# Step 1 - Convolution: 32 feature detectors of size 3x3 on 64x64 RGB images
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))

# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a second convolutional layer
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Step 3 - Flattening
classifier.add(Flatten())

# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 1, activation = 'sigmoid'))

# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

# Part 2 - Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator

# Augment the training images; only rescale the test images
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('Your training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')
test_set = test_datagen.flow_from_directory('Your test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')

# 8000 training and 2000 validation images per epoch, in batches of 32
output = classifier.fit_generator(training_set,
                                  steps_per_epoch = 8000 // 32,
                                  epochs = 25,
                                  validation_data = test_set,
                                  validation_steps = 2000 // 32)

# Part 3 - Predicting on new images
test_dog = test_datagen.flow_from_directory('Your Random input data',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')
predictions = classifier.predict_generator(test_dog, steps = 1, verbose = 0)

# Report the first prediction together with the sigmoid output
for i in range(0, 1):
    if predictions[i, 0] >= 0.6:
        print('I am sure this is a Dog ({:.2f})'.format(predictions[i][0]))
    else:
        print('I am sure this is a Cat ({:.2f})'.format(predictions[i][0]))
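As a sanity check on the architecture above, the following sketch traces the feature-map size through each layer. It assumes the Keras defaults used by the code (no padding and stride 1 for the convolutions, pooling stride equal to the pool size); the helper names `conv_out` and `pool_out` are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Output width/height of a convolution with no padding and stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of max pooling with stride equal to the pool size."""
    return size // pool

size = 64              # input images are 64x64
size = conv_out(size)  # first 3x3 convolution  -> 62
size = pool_out(size)  # first 2x2 max pooling  -> 31
size = conv_out(size)  # second 3x3 convolution -> 29
size = pool_out(size)  # second 2x2 max pooling -> 14

feature_maps = 32
flat_len = size * size * feature_maps  # length of the flattened vector

# Weights plus biases of the first fully connected layer (128 units)
dense1_params = flat_len * 128 + 128

print(size, flat_len, dense1_params)
```

Under these assumptions the flattened vector has 6272 entries, so most of the network's parameters sit in the first fully connected layer.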