This article explains the concept of dropout in deep learning, in particular in deep neural networks, and presents an experiment examining how applying dropout to a deep network on a standard dataset affects performance in practice.

  1. What is Dropout in Neural Networks?
  2. Why do we need Dropout?
  3. Dropout — Revisited
  4. What is a dropout layer?
  5. Experiment in Keras

1. What is Dropout in Neural Networks?

The word ‘dropout’ refers to dropping out units in a neural network (both hidden and visible ones).

In simple terms, dropout means that randomly selected units (i.e. neurons) are ignored during training. By ‘ignored’ we mean that these units are not considered during a particular forward or backward pass.

More technically, at each training stage individual nodes are either dropped out of the network with probability 1 − p or kept with probability p, so that a reduced network remains.
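The training-time rule above can be sketched in a few lines of NumPy (a toy illustration; `dropout_forward` is a hypothetical helper name, not part of any library):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(activations, p):
    """Training-time pass: keep each unit with probability p, zero out the rest."""
    mask = rng.random(activations.shape) < p  # True with probability p
    return activations * mask

a = np.ones(10)                  # toy layer activations
out = dropout_forward(a, p=0.8)  # roughly 80% of entries survive, the rest are zeroed
print(out)
```

A fresh random mask is drawn on every forward pass, so each training step effectively trains a different thinned sub-network.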

The idea of a neural network is inspired by the neurons of the human brain, and researchers wanted a system that could replicate this process. This pursuit gave direction to one of artificial intelligence’s most important topics. A neural network (NN) is a set of connected units or nodes, artificial neurons, which loosely model the neurons in a biological brain. Because this network is built artificially inside computers, we call it an Artificial Neural Network (ANN).

2. Why do we need Dropout?

Now that we know a little about dropout, a question arises — why do we need dropout at all? Why do we literally shut down sections of a neural network?

“To prevent overfitting,” is the answer to these questions.

A fully connected layer has a large number of parameters, so its neurons develop co-dependency on one another during training, which curbs the individual power of each neuron and leads to overfitting of the training data.

3. Dropout — Revisited

Now that we know a little about dropout and its motivation, let’s go into some depth. If you simply needed an overview of dropout in neural networks, the two sections above would be sufficient.

In machine learning, regularization is a means of avoiding overfitting. Regularization reduces overfitting by adding a penalty to the loss function.

With this penalty added, the model is trained so that it does not learn an interdependent set of feature weights. If you are familiar with logistic regression, you may know the L1 (Laplacian) and L2 (Gaussian) penalties.

Dropout is a regularization technique in neural networks that reduces interdependent learning among neurons.

4. What is a dropout layer?

  • Training Phase:

Ignore (zero out) a random fraction, 1 − p, of nodes in each hidden layer (and their corresponding activations), for each training sample, on each iteration.

First of all, remember that the parameter p is adjustable and must be set up front by the machine learning engineer. Choosing it presents the same difficulty as setting a learning rate: we simply don’t know in advance which p is best for the data. In practice, a pattern has emerged: a value of p = 0.5 for the hidden layers appears to give the most effective use of dropout across many scenarios. This applies to all layers except the input layer, where p should be equal to 1.0 (i.e. no dropout).
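A note on conventions: where this article uses p as the keep probability, frameworks often ask for the opposite quantity. In Keras, for instance, the `Dropout` layer takes the fraction of units to drop, i.e. 1 − p. A minimal sketch (the layer sizes here are arbitrary, chosen only for illustration):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Keras' Dropout takes the DROP fraction (1 - p), not the keep probability p.
# A keep probability of p = 0.5 for a hidden layer therefore means Dropout(0.5);
# p = 1.0 for the input layer simply means no Dropout layer there at all.
model = keras.Sequential([
    keras.Input(shape=(32,)),             # input layer: p = 1.0, no dropout
    layers.Dense(64, activation="relu"),  # hidden layer
    layers.Dropout(0.5),                  # keep probability p = 0.5
    layers.Dense(10, activation="softmax"),
])
model.summary()
```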

  • Testing Phase:

Use all activations (nothing is dropped), but scale them down by a factor of p, to account for the activations that were missing during training.
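The test-time scaling above can be sketched as follows (a toy NumPy sketch, with p denoting the keep probability as before):

```python
import numpy as np

def dropout_inference(activations, p):
    """Test-time pass: keep every unit, but scale activations by p so their
    expected value matches what the network saw during training."""
    return activations * p

p = 0.5  # keep probability that was used during training
a = np.array([2.0, 4.0, 6.0])
print(dropout_inference(a, p))  # -> [1. 2. 3.]
```

Note that most modern frameworks instead use “inverted dropout”: they scale the surviving activations by 1/p at training time, so that no scaling is needed at test time. The expected value works out the same either way.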

Some observations

  • Dropout forces a neural network to learn more robust features that are useful in conjunction with many different random subsets of the other neurons.
  • Dropout roughly doubles the number of iterations required to converge. However, the training time for each epoch is shorter.
  • With H hidden units, each of which can be dropped, we have 2^H possible models. At test time, the whole network is used, and each activation is scaled down by a factor of p.

5. Experiment in Keras

With the theory covered, let’s understand dropout through a Keras example. We built a deep network in Keras and evaluated it on the CIFAR-10 dataset to see how dropout works. The network was constructed with three layers of sizes 64, 128 and 256, followed by two densely connected layers of size 512 and a dense output layer of size 10 (the number of classes in the CIFAR-10 dataset).
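A plausible reconstruction of the described architecture in Keras might look like the following. This is a sketch under assumptions: the article does not say whether the 64/128/256 layers are convolutional, so convolutional layers with pooling are assumed here, and the kernel size, pooling and optimizer are illustrative choices, not the original configuration.

```python
from tensorflow import keras
from tensorflow.keras import layers

drop_frac = 0.2  # dropout fraction (1 - p); swept from 0.0 to 0.9 in the experiment

model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),          # CIFAR-10 image shape
    layers.Conv2D(64, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Conv2D(256, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dropout(drop_frac),
    layers.Dense(512, activation="relu"),
    layers.Dropout(drop_frac),
    layers.Dense(10, activation="sigmoid"),  # sigmoid output, as stated in the article
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```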

We used ReLU as the activation function for the hidden layers and sigmoid for the output layer (these are common choices and we did not vary them much). In addition, the standard categorical cross-entropy loss was used.

Finally, we applied dropout to both dense layers and varied the dropout fraction from 0.0 (no dropout) to 0.9 in steps of 0.1. The results are as follows:
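The sweep described above can be sketched like this (`build_and_train` is a hypothetical helper standing in for the model construction and training code, which is not shown):

```python
import numpy as np

# Sweep the dropout fraction from 0.0 (no dropout) to 0.9 in steps of 0.1.
fractions = np.round(np.arange(0.0, 1.0, 0.1), 1)
print(fractions.tolist())  # 0.0, 0.1, ..., 0.9

for frac in fractions:
    # build_and_train(frac) would rebuild the network with Dropout(frac)
    # and record validation accuracy and loss for this setting.
    pass
```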

The above diagrams show that as the dropout fraction rises, validation accuracy first increases and loss first decreases, until the trend reverses.

There may be two explanations for this downturn beyond a dropout fraction of 0.2:

  • 0.2 is the actual optimum for this dataset, network and the fixed parameters used, or
  • the network needs more epochs of training at higher dropout fractions.


In this article, we learned about dropout, the dropout layer in neural networks, the training and testing phases of dropout, and an experiment in Keras.

There are no right or wrong ways of learning AI and ML technologies – the more, the better! These valuable resources can be the starting point for your journey on how to learn Artificial Intelligence and Machine Learning. Does pursuing AI and ML interest you? If you want to step into the world of emerging tech, you can accelerate your career with these Machine Learning and AI Courses by Jigsaw Academy.


