Introducing Dropout and L2 Regularization in ConvNets
In this tutorial, I will show how dropout and L2 regularization affect convolutional neural networks. It works the same way as in the simple neural network. You will see a minor increase in accuracy, but that is not the main concern here; the main goal is to avoid overfitting using these two techniques.
Modify the previous version of the ConvNets Python code. If ConvNets are still confusing, try this intro tutorial.
L2 Regularization
We have four weight tensors, each multiplied by the regularization constant 0.005. As with the earlier L2 regularization, change the loss to the following code:
rm = 0.005  # regularization constant
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits)
    + rm * tf.nn.l2_loss(layer1_weights)
    + rm * tf.nn.l2_loss(layer2_weights)
    + rm * tf.nn.l2_loss(layer3_weights)
    + rm * tf.nn.l2_loss(layer4_weights))
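For reference, tf.nn.l2_loss(w) returns half the sum of the squared entries of w, so each term above adds rm * sum(w ** 2) / 2 to the loss. A quick sanity check (the toy tensor is just for illustration):

import tensorflow as tf

w = tf.constant([3.0, 4.0])
penalty = tf.nn.l2_loss(w)  # (3^2 + 4^2) / 2 = 12.5

with tf.Session() as sess:
    print(sess.run(penalty))  # prints 12.5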
Dropout
Dropout is a great technique for avoiding overfitting. It plays a bigger role if you train for long, for example 50k iterations; here we usually train for only around 10k iterations because of memory and time constraints.
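To see what dropout actually does, here is a minimal sketch (the toy tensor is just for illustration): tf.nn.dropout zeroes each element with probability 1 - keep_prob and scales the survivors by 1 / keep_prob, so the expected activation stays the same.

import tensorflow as tf

x = tf.ones([1, 10])
dropped = tf.nn.dropout(x, keep_prob=0.5)  # survivors scaled by 1 / 0.5 = 2.0

with tf.Session() as sess:
    print(sess.run(dropped))  # roughly half the entries are 0.0, the rest 2.0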
One more thing: dropout tends to work well only on fully connected layers. It will not work as well between the conv layers, because the purpose of the conv layers is to map the image to a smaller spatial size with a larger feature space, whereas dropout randomly zeroes out activations to avoid overfitting; dropping those spatially correlated conv activations makes little sense.
I would still recommend trying it between the last conv layer and the fully connected layer, as in the sketch below. I have used it only between the fully connected layers.
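If you want to experiment with the conv-to-fully-connected variant, a minimal sketch might look like the fragment below. It reuses the layer names from the model_dropout function that follows, and keep_prob = 0.5 is just a starting point, not a tuned value.

# Inside the model function, after the second pooling layer:
shape = pool_2.get_shape().as_list()
reshape = tf.reshape(pool_2, [shape[0], shape[1] * shape[2] * shape[3]])
# Drop conv activations before they enter the fully connected layer.
reshape = tf.nn.dropout(reshape, keep_prob=0.5)
out_layer = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)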
For convenience, I have made a new function. Copy the function below and place it above or below the model function:
def model_dropout(data):
    keep_prob = tf.constant(0.5, tf.float32)  # probability of keeping a unit
    # Convolution block 1
    conv_1 = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')
    hidden_1 = tf.nn.relu(conv_1 + layer1_biases)
    pool_1 = tf.nn.max_pool(hidden_1, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    # Convolution block 2
    conv_2 = tf.nn.conv2d(pool_1, layer2_weights, [1, 2, 2, 1], padding='SAME')
    hidden_2 = tf.nn.relu(conv_2 + layer2_biases)
    pool_2 = tf.nn.max_pool(hidden_2, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    # Flatten and pass through the fully connected layers
    shape = pool_2.get_shape().as_list()
    reshape = tf.reshape(pool_2, [shape[0], shape[1] * shape[2] * shape[3]])
    out_layer = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
    # Dropout between the fully connected layers
    drop = tf.nn.dropout(out_layer, keep_prob)
    return tf.matmul(drop, layer4_weights) + layer4_biases
Also change the logits assignment to:
logits = model_dropout(tf_train_dataset)
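One caveat: dropout should only be active during training. That works out here because only the training logits go through model_dropout; the validation and test predictions still use the original model function. If you prefer a single model function instead, a common pattern is to feed keep_prob through a placeholder. The sketch below assumes the placeholder and batch names from the earlier tutorials (tf_train_dataset, batch_data, and so on); it is an alternative, not the code used here.

keep_prob = tf.placeholder(tf.float32)
# ... inside the model, use the placeholder instead of a constant:
# drop = tf.nn.dropout(out_layer, keep_prob)

# Training step: keep 50% of the units.
session.run([optimizer, loss],
            feed_dict={tf_train_dataset: batch_data,
                       tf_train_labels: batch_labels,
                       keep_prob: 0.5})

# Evaluation: keep_prob = 1.0 disables dropout.
session.run(train_prediction,
            feed_dict={tf_train_dataset: batch_data,
                       keep_prob: 1.0})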
Get the full Python code of the convolutional neural network with dropout and regularization.
I was able to get 93.8% accuracy from the above code. Try different batch sizes and step counts, add learning rate decay, and aim for at least 95% accuracy. In the next tutorial, I will share the maximum accuracy I got.