Bias And Variance

Machine learning, and especially deep learning, is all about the model and the data we feed to it. There are two things to consider:
* Model complexity (see the sketch below)
   * Number of hidden layers.
   * Number of nodes in the hidden layers.
* Data provided to the model
   * Number of features in the data.
   * Number of rows of data (data points).
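
To make the two complexity knobs concrete, here is a minimal sketch; it assumes scikit-learn's MLPClassifier, which is just one possible choice, and the layer counts and widths are illustrative, not recommendations from this article.

```python
# A minimal sketch of the model-complexity knobs above, assuming scikit-learn
# (the library, layer counts, and widths are illustrative assumptions).
from sklearn.neural_network import MLPClassifier

# One hidden layer with 16 nodes: a simpler, lower-capacity model.
simple_model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500)

# Two hidden layers with 64 and 32 nodes: more layers and more nodes
# per layer mean higher model complexity.
complex_model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
```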
There are a few other things, like the learning rate, regularisation, and dropout; these mostly relate to how the model should learn. To make a good model, we have to tune it on the given dataset so that the final error is minimal. While tuning the model we often focus on bias and variance, so we should understand these two concepts in order to fine-tune the model.

Let us first walk through a simple story.

Suppose you start learning an instrument, beginning with a simple song A. Initially you will not be able to play that particular song correctly, but after practising many times you will be able to play it well. Suppose a week has passed and you can now play song A perfectly. But does this mean you have learned to play the instrument?
Now your instructor asks you to play some other song B. When you start playing song B, you are not able to play it correctly, because you have only practised song A; you have not studied how music in general should be played, but have merely perfected a single song.
Now let’s compare this story with BIAS and VARIANCE.

BIAS
When you start learning the instrument, the mistakes (errors) you make while learning that particular song are a measure of BIAS. In simple terms, bias corresponds to the error on the specific dataset the model was trained on.
The higher the error, the higher the bias. It tells us how well our model performs, assuming the dataset is static and we only have to predict on this given dataset for now.

VARIANCE
When you have learned to play song A, the mistakes (errors) you make while playing song B are a measure of VARIANCE. In simple terms, variance corresponds to the error when the dataset changes.
It tells us how well we can predict when our model sees a new dataset that it has not seen before.

So whenever we start with any model, we will usually have a high error: the model is not able to predict the correct values (HIGH BIAS). But once we add more layers or more features, our model becomes able to predict the correct values for the given dataset (LOW BIAS).
Now when we feed new data to the model, it will have a high error on the new dataset (HIGH VARIANCE). But if we make our model generic, not too dependent on the initial data, it will predict the correct values for the new dataset as well (LOW VARIANCE).
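
Here is a small sketch of that trade-off; it assumes scikit-learn and synthetic sine-wave data (neither comes from this article), and uses polynomial degree as the complexity knob: training error keeps falling as complexity grows (bias shrinks), while validation error eventually rises again (variance grows).

```python
# Hedged sketch: polynomial degree as a stand-in for model complexity,
# on assumed synthetic data. Low degree underfits (high bias); very high
# degree overfits (high variance).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

for degree in (1, 3, 10):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    val_err = mean_squared_error(y_val, model.predict(X_val))
    print(f"degree {degree}: train error {train_err:.3f}, val error {val_err:.3f}")
```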

So our end goal is to obtain low bias and low variance. But we still have to work out which metrics to observe and which actions to take to achieve this.
To start with, one should always plot the error for both the training set and the validation set against time.
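
One possible way to produce such a graph is sketched below; it assumes scikit-learn's MLPRegressor and synthetic data (none of this is named in the article). With warm_start=True and max_iter=1, each call to fit() runs one more epoch, so we can record both errors over time.

```python
# Hedged sketch of the error-vs-time graph, under the assumptions above.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=300)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = MLPRegressor(hidden_layer_sizes=(32,), warm_start=True, max_iter=1)
train_err, val_err = [], []
for epoch in range(200):
    # scikit-learn warns that max_iter=1 did not converge; that is expected,
    # since we deliberately train one epoch at a time.
    model.fit(X_train, y_train)
    train_err.append(mean_squared_error(y_train, model.predict(X_train)))
    val_err.append(mean_squared_error(y_val, model.predict(X_val)))

plt.plot(train_err, label="training error")
plt.plot(val_err, label="validation error")
plt.xlabel("epochs (time)")
plt.ylabel("error")
plt.legend()
plt.show()
```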


| Bias | Variance |
| --- | --- |
| In the graph of error vs. time, both the training error and the validation error stay high. This means we have high bias. | In the graph of error vs. time, the training error decreases but the validation error stays high. This means we have high variance. |
High bias is often known as underfitting.
High variance is often known as overfitting.
To solve high bias, one can do the following:
  • Add more features.
  • Increase the number of hidden layers, or the number of neurons in the layers.
  • Train the model for a longer time.
To solve high variance, one can do the following:
  • Add more data to the training set.
  • Use regularization (see the sketch after this list).
  • Decrease the learning rate.
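
As promised above, here is a minimal sketch of the regularization fix; it assumes scikit-learn's Ridge regression on synthetic data (both are assumptions). The penalty strength alpha shrinks the weights of a deliberately over-complex model, trading a little bias for lower variance.

```python
# Hedged sketch: L2 regularization (Ridge) taming an overfit polynomial model.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# A degree-10 polynomial overfits this data; raising alpha narrows the gap
# between training and validation scores.
for alpha in (0.001, 0.1, 10.0):
    model = make_pipeline(PolynomialFeatures(10), StandardScaler(), Ridge(alpha=alpha))
    model.fit(X_train, y_train)
    print(f"alpha={alpha}: train R^2={model.score(X_train, y_train):.3f}, "
          f"val R^2={model.score(X_val, y_val):.3f}")
```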

Note
High Bias, Low Variance: models are consistent but inaccurate on average.
High Bias, High Variance: models are inaccurate and also inconsistent on average.
Low Bias, High Variance: models are somewhat accurate but inconsistent on average; a small change in the data can cause a large error.
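
To see these combinations numerically, one can estimate bias and variance by retraining the same model on many freshly drawn training sets and looking at its predictions at a fixed point. Below is a sketch assuming scikit-learn and a synthetic sine-wave problem (none of which appear in this article); a depth-1 tree stands in for a high-bias, low-variance model, and an unpruned tree for a low-bias, high-variance one.

```python
# Hedged sketch: empirical bias/variance of a model at one test point,
# estimated over many resampled training sets (all data is synthetic).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x_test = np.array([[0.5]])
true_value = np.sin(0.5)          # the noiseless target at x = 0.5

def bias_variance(max_depth, n_datasets=200):
    """Retrain on fresh datasets; measure the spread of predictions at x_test."""
    preds = []
    for _ in range(n_datasets):
        X = rng.uniform(-3, 3, size=(50, 1))
        y = np.sin(X).ravel() + rng.normal(scale=0.3, size=50)
        model = DecisionTreeRegressor(max_depth=max_depth).fit(X, y)
        preds.append(model.predict(x_test)[0])
    preds = np.array(preds)
    # Squared bias: how wrong the average prediction is.
    # Variance: how much predictions scatter across datasets.
    return (preds.mean() - true_value) ** 2, preds.var()

print("stump (depth 1):", bias_variance(1))      # high bias, low variance
print("deep tree      :", bias_variance(None))   # low bias, high variance
```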


Some Important Links to understand Bias and Variance
* Andrew Ng Coursera discussion
* Quora discussion