Tuning Parameters to Improve the Accuracy of a Neural Network
Parameter tuning is one of the toughest tasks in building a machine learning model. Sometimes it works through trial and error, and sometimes it takes careful thought to make the model perform better. In this tutorial I will show how changing the parameters directly affects the accuracy of a neural network, and which parameters have the most effect.
You will see how accuracy improves as certain parameters change. I have experimented extensively by varying these parameters, but there is plenty more to try, and I encourage you to run your own experiments and share the results with the TensorFlow Hub community.
The regularization constant used here is 0.01 unless mentioned otherwise.
Single Hidden Layer Neural Network
Learning rate = 0.01
With no Regularization:
| Hidden Units | Iterations | Std. Deviation | Batch Size | Accuracy (%) |
|---|---|---|---|---|
| 1024 | 801 | 1 | 10 | 57.6 |
| 1024 | 801 | 1 | 100 | 91.2 |
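To make these settings concrete, here is a minimal sketch of the single-hidden-layer network behind the tables, assuming TensorFlow 1.x and 28×28 grayscale inputs with 10 classes (as in MNIST/notMNIST); the variable names are my own:

```python
import tensorflow as tf  # assumes TensorFlow 1.x

image_size, num_labels = 28, 10  # assumption: MNIST/notMNIST-style data
batch_size = 100                 # one of the batch sizes compared above
hidden_units = 1024
stddev = 1.0                     # std. deviation of the weight initializer

graph = tf.Graph()
with graph.as_default():
    # Placeholders for one mini-batch of flattened images and one-hot labels.
    tf_data = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size))
    tf_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))

    # Weights drawn from a truncated normal; stddev is one of the tuned knobs.
    w1 = tf.Variable(tf.truncated_normal([image_size * image_size, hidden_units], stddev=stddev))
    b1 = tf.Variable(tf.zeros([hidden_units]))
    w2 = tf.Variable(tf.truncated_normal([hidden_units, num_labels], stddev=stddev))
    b2 = tf.Variable(tf.zeros([num_labels]))

    hidden = tf.nn.relu(tf.matmul(tf_data, w1) + b1)
    logits = tf.matmul(hidden, w2) + b2

    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_labels, logits=logits))
    optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
```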
With L2:
| Hidden Units | Iterations | Std. Deviation | Batch Size | Accuracy (%) |
|---|---|---|---|---|
| 1024 | 801 | 1 | 10 | 75.6 |
| 1024 | 801 | 1 | 100 | 91.4 |
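L2 regularization only changes the loss term. A sketch extending the snippet above (not self-contained on its own), where `beta` is my name for the 0.01 regularization constant:

```python
    # Inside the same graph as above: add an L2 penalty on the weight
    # matrices, scaled by the regularization constant (0.01 here).
    beta = 0.01
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_labels, logits=logits))
    loss += beta * (tf.nn.l2_loss(w1) + tf.nn.l2_loss(w2))
    optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
```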
With Dropout:
Here I have also tested with more iterations, since dropout benefits from more training steps.
| Hidden Units | Iterations | Std. Deviation | Batch Size | Learning Rate | Accuracy (%) |
|---|---|---|---|---|---|
| 1024 | 801 | 1 | 10 | 0.01 | 61.2 |
| 1024 | 801 | 1 | 100 | 0.01 | 91.2 |
| 1024 | 801 | 1 | 10 | 0.01 | 75.6 |
| 1024 | 801 | 1 | 100 | 0.01 | 91.4 |
| 1024 | 801 | 0.03 | 100 | 0.01 | 91.8 |
| 1024 | 12001 | 0.03 | 100 | 0.01 | 93.6 |
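Dropout is a single op on the hidden activations at training time. A sketch extending the first snippet; the keep probability of 0.5 is my assumption, since the post does not state it:

```python
    # Inside the same graph as above: randomly drop hidden activations
    # during training. keep_prob = 0.5 is an assumed value.
    keep_prob = 0.5
    hidden_dropped = tf.nn.dropout(hidden, keep_prob)
    logits_train = tf.matmul(hidden_dropped, w2) + b2
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_labels, logits=logits_train))
    # Evaluate with the un-dropped `logits` path, not `logits_train`.
```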
With Dropout and L2 Regularization:
| Hidden Units | Iterations | Std. Deviation | Batch Size | Learning Rate | Accuracy (%) |
|---|---|---|---|---|---|
| 1024 | 801 | 1 | 10 | 0.01 | 61.2 |
| 1024 | 801 | 1 | 100 | 0.01 | 89.2 |
| 1024 | 801 | 1 | 10 | 0.01 | 76 |
| 1024 | 801 | 1 | 100 | 0.01 | 91 |
| 1024 | 801 | 0.03 | 100 | 0.01 | 90.6 |
| 1024 | 12001 | 0.03 | 100 | 0.01 | 89.6 |
Two Hidden Layer Neural Network
A two-hidden-layer network, tested with both dropout + L2 and with dropout only.
| Regularization | Hidden Units | Iterations | Std. Deviation | Batch Size | Learning Rate | Accuracy (%) |
|---|---|---|---|---|---|---|
| Dropout + L2 | 1024 x 1024 | 12001 | 0.03 | 100 | 0.01 | 88.8 |
| Dropout | 1024 x 1024 | 12001 | 0.03 | 100 | 0.01 | 94.9 |
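Adding a second hidden layer just stacks one more weight matrix and ReLU. A sketch of the dropout-only column, reusing the placeholders and `keep_prob` from the earlier snippets; note the smaller initializer stddev of 0.03:

```python
    # Inside the same graph: two hidden layers of 1024 units each,
    # initialized with stddev 0.03, with dropout after each layer.
    stddev = 0.03
    w1 = tf.Variable(tf.truncated_normal([784, 1024], stddev=stddev))
    b1 = tf.Variable(tf.zeros([1024]))
    w2 = tf.Variable(tf.truncated_normal([1024, 1024], stddev=stddev))
    b2 = tf.Variable(tf.zeros([1024]))
    w3 = tf.Variable(tf.truncated_normal([1024, 10], stddev=stddev))
    b3 = tf.Variable(tf.zeros([10]))

    h1 = tf.nn.dropout(tf.nn.relu(tf.matmul(tf_data, w1) + b1), keep_prob)
    h2 = tf.nn.dropout(tf.nn.relu(tf.matmul(h1, w2) + b2), keep_prob)
    logits = tf.matmul(h2, w3) + b3
```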
Three Hidden Layer Neural Network
A three-hidden-layer network with learning rate decay (sketched after the table), using both dropout and L2 regularization but different numbers of hidden units.
| Regularization | Hidden Units | Iterations | Std. Deviation | Batch Size | Learning Rate | Reg. Constant | Accuracy (%) |
|---|---|---|---|---|---|---|---|
| Dropout + L2 | 1024 x 300 x 50 | 8001 | 0.03 | 128 | 0.01 | 0.01 | 91.4 |
| Dropout + L2 | 1024 x 1024 x 105 | 12001 | 0.03 | 128 | 0.01 | 0.03 | 92 |
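The learning rate decay used here can be wired in with `tf.train.exponential_decay`. A sketch extending the earlier snippets, with assumed decay settings since the post does not specify them:

```python
    # Inside the same graph: decay the learning rate from 0.01 as training
    # progresses. decay_steps and decay_rate below are assumed values.
    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(
        0.01, global_step, decay_steps=1000, decay_rate=0.96, staircase=True)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(
        loss, global_step=global_step)
```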
The maximum accuracy I got was 94.9%, from the two-hidden-layer network. You could run it with learning rate decay and try adding more hidden layers and units; I have seen people reach up to 97.1% accuracy. I also tried three hidden layers, but as you can see above it did not improve the accuracy. You might experiment further on your end.