Tuning Parameters to Improve the Accuracy of a Neural Network
Parameter tuning is one of the toughest tasks in building a machine learning model. Sometimes it works through trial and error, and sometimes it takes careful thought to make the model perform better. In this tutorial I will show how changing the parameters directly affects the accuracy of a neural network, and which parameters have the most effect.
You will see how accuracy improves as certain parameters change. I have experimented extensively by varying these parameters, but there is plenty more to try, and I encourage you to run your own experiments and share the results with the TensorFlow Hub community.
The regularization constant used here is 0.01 unless mentioned otherwise.
Single Hidden Layer Neural Network
Learning rate = 0.01
With no Regularization:
| Hidden Units | Iterations | Std. Deviation | Batch Size | Accuracy (%) |
|---|---|---|---|---|
| 1024 | 801 | 1 | 10 | 57.6 |
| 1024 | 801 | 1 | 100 | 91.2 |
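To make these settings concrete, here is a minimal sketch of the single-hidden-layer network behind the tables, assuming TensorFlow 1.x and 28×28 grayscale inputs with 10 classes (as in MNIST/notMNIST); the variable names are my own:

```python
import tensorflow as tf  # assumes TensorFlow 1.x

image_size, num_labels = 28, 10  # assumption: MNIST/notMNIST-style data
batch_size = 100                 # one of the batch sizes compared above
hidden_units = 1024
stddev = 1.0                     # std. deviation of the weight initializer

graph = tf.Graph()
with graph.as_default():
    # Placeholders for one mini-batch of flattened images and one-hot labels.
    tf_data = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size))
    tf_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))

    # Weights drawn from a truncated normal; stddev is one of the tuned knobs.
    w1 = tf.Variable(tf.truncated_normal([image_size * image_size, hidden_units], stddev=stddev))
    b1 = tf.Variable(tf.zeros([hidden_units]))
    w2 = tf.Variable(tf.truncated_normal([hidden_units, num_labels], stddev=stddev))
    b2 = tf.Variable(tf.zeros([num_labels]))

    hidden = tf.nn.relu(tf.matmul(tf_data, w1) + b1)
    logits = tf.matmul(hidden, w2) + b2

    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_labels, logits=logits))
    optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
```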
With L2:
| Hidden Units | Iterations | Std. Deviation | Batch Size | Accuracy (%) |
|---|---|---|---|---|
| 1024 | 801 | 1 | 10 | 75.6 |
| 1024 | 801 | 1 | 100 | 91.4 |
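L2 regularization only changes the loss term. A sketch extending the snippet above (not self-contained on its own), where `beta` is my name for the 0.01 regularization constant:

```python
    # Inside the same graph as above: add an L2 penalty on the weight
    # matrices, scaled by the regularization constant (0.01 here).
    beta = 0.01
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_labels, logits=logits))
    loss += beta * (tf.nn.l2_loss(w1) + tf.nn.l2_loss(w2))
    optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
```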
With Dropout:
Here I have also tested with more iterations, since dropout benefits from more training steps.
| Hidden Units | Iterations | Std. Deviation | Batch Size | Learning Rate | Accuracy (%) |
|---|---|---|---|---|---|
| 1024 | 801 | 1 | 10 | 0.01 | 61.2 |
| 1024 | 801 | 1 | 100 | 0.01 | 91.2 |
| 1024 | 801 | 1 | 10 | 0.01 | 75.6 |
| 1024 | 801 | 1 | 100 | 0.01 | 91.4 |
| 1024 | 801 | 0.03 | 100 | 0.01 | 91.8 |
| 1024 | 12001 | 0.03 | 100 | 0.01 | 93.6 |
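Dropout is a single op on the hidden activations at training time. A sketch extending the first snippet; the keep probability of 0.5 is my assumption, since the post does not state it:

```python
    # Inside the same graph as above: randomly drop hidden activations
    # during training. keep_prob = 0.5 is an assumed value.
    keep_prob = 0.5
    hidden_dropped = tf.nn.dropout(hidden, keep_prob)
    logits_train = tf.matmul(hidden_dropped, w2) + b2
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_labels, logits=logits_train))
    # Evaluate with the un-dropped `logits` path, not `logits_train`.
```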
With Dropout and L2 Regularization:
| Hidden Units | Iterations | Std. Deviation | Batch Size | Learning Rate | Accuracy (%) |
|---|---|---|---|---|---|
| 1024 | 801 | 1 | 10 | 0.01 | 61.2 |
| 1024 | 801 | 1 | 100 | 0.01 | 89.2 |
| 1024 | 801 | 1 | 10 | 0.01 | 76 |
| 1024 | 801 | 1 | 100 | 0.01 | 91 |
| 1024 | 801 | 0.03 | 100 | 0.01 | 90.6 |
| 1024 | 12001 | 0.03 | 100 | 0.01 | 89.6 |
Two Hidden Layer Neural Network
A two-hidden-layer network, tested with both dropout + L2 and with dropout only.
| Regularization | Hidden Units | Iterations | Std. Deviation | Batch Size | Learning Rate | Accuracy (%) |
|---|---|---|---|---|---|---|
| Dropout + L2 | 1024 x 1024 | 12001 | 0.03 | 100 | 0.01 | 88.8 |
| Dropout | 1024 x 1024 | 12001 | 0.03 | 100 | 0.01 | 94.9 |
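Adding a second hidden layer just stacks one more weight matrix and ReLU. A sketch of the dropout-only column, reusing the placeholders and `keep_prob` from the earlier snippets; note the smaller initializer stddev of 0.03:

```python
    # Inside the same graph: two hidden layers of 1024 units each,
    # initialized with stddev 0.03, with dropout after each layer.
    stddev = 0.03
    w1 = tf.Variable(tf.truncated_normal([784, 1024], stddev=stddev))
    b1 = tf.Variable(tf.zeros([1024]))
    w2 = tf.Variable(tf.truncated_normal([1024, 1024], stddev=stddev))
    b2 = tf.Variable(tf.zeros([1024]))
    w3 = tf.Variable(tf.truncated_normal([1024, 10], stddev=stddev))
    b3 = tf.Variable(tf.zeros([10]))

    h1 = tf.nn.dropout(tf.nn.relu(tf.matmul(tf_data, w1) + b1), keep_prob)
    h2 = tf.nn.dropout(tf.nn.relu(tf.matmul(h1, w2) + b2), keep_prob)
    logits = tf.matmul(h2, w3) + b3
```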
Three Hidden Layer Neural Network
A three-hidden-layer network with learning rate decay (sketched after the table), using both dropout and L2 regularization but different numbers of hidden units.
| Regularization | Hidden Units | Iterations | Std. Deviation | Batch Size | Learning Rate | Reg. Constant | Accuracy (%) |
|---|---|---|---|---|---|---|---|
| Dropout + L2 | 1024 x 300 x 50 | 8001 | 0.03 | 128 | 0.01 | 0.01 | 91.4 |
| Dropout + L2 | 1024 x 1024 x 105 | 12001 | 0.03 | 128 | 0.01 | 0.03 | 92 |
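The learning rate decay used here can be wired in with `tf.train.exponential_decay`. A sketch extending the earlier snippets, with assumed decay settings since the post does not specify them:

```python
    # Inside the same graph: decay the learning rate from 0.01 as training
    # progresses. decay_steps and decay_rate below are assumed values.
    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(
        0.01, global_step, decay_steps=1000, decay_rate=0.96, staircase=True)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(
        loss, global_step=global_step)
```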
The maximum accuracy I got was 94.9%, from the two-hidden-layer network. You could run it with learning rate decay and try adding more hidden layers and units; I have seen people reach up to 97.1% accuracy. I also tried three hidden layers, but as you can see above it did not improve the accuracy. You might experiment further on your end.