Standard deviation (sd) measures how spread out the values in a dataset are. A high standard deviation means the values are widely spread around the mean, while a low standard deviation means they cluster close to the mean.
There are two types:
- Sample standard deviation.
- Population standard deviation.
The Greek letter sigma (σ) denotes the population standard deviation, and “s” denotes the sample standard deviation.
Formula
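For reference, the standard textbook definitions can be written as follows. The symbols N (population size), n (sample size), μ (population mean) and x̄ (sample mean) are notation assumed here rather than taken from the text above:

$$
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}
$$

The only difference between the two is the denominator: N for a population, n - 1 for a sample.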
Let’s Calculate:
In practice, we rarely need to calculate it by hand. However, it is always good to know how it works.
- First, calculate the mean of the dataset.
- Second, find the variance of the dataset.
- Finally, take the square root of the variance to get the standard deviation.
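The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function name `population_sd` is hypothetical, and the input list is made-up data chosen so the result comes out to a round number.

```python
import math

def population_sd(values):
    """Population standard deviation, following the three steps above."""
    n = len(values)
    mean = sum(values) / n                                # step 1: mean
    variance = sum((x - mean) ** 2 for x in values) / n   # step 2: variance
    return math.sqrt(variance)                            # step 3: square root

# Illustrative data, not from this article
print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```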
Example:
We have the heights of 5 different people in cm: 174, 180, 190, 195, 170. Let’s find the standard deviation for this dataset.
Mean:
The average value of a dataset.
- Sum all data points
- Divide the sum by the size of the dataset.
Our population mean is (174 + 180 + 190 + 195 + 170) / 5 = 909 / 5 = 181.8 cm.
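As a quick check, the same two steps can be written in Python. The variable names `heights`, `total` and `mean` are only illustrative.

```python
heights = [174, 180, 190, 195, 170]  # heights in cm, from the example

total = sum(heights)          # step 1: sum all data points -> 909
mean = total / len(heights)   # step 2: divide by the size of the dataset
print(mean)                   # 181.8
```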
Variance:
Variance is the average of the squared differences between each data point and the mean.
- Subtract the mean from each data point, square the result, and sum the squares.
- Divide the sum of squares by the size of the dataset.
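The squared differences are 60.84, 3.24, 67.24, 174.24 and 139.24, which sum to 444.8. The population variance for this data set is therefore 444.8 / 5 = 88.96 cm².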
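The variance steps can be reproduced with a short Python sketch, assuming the same five heights. The names `squared_deviations` and `sum_of_squares` are illustrative, and the values are rounded only for display.

```python
heights = [174, 180, 190, 195, 170]  # heights in cm, from the example
mean = sum(heights) / len(heights)

# Subtract the mean from each data point and square the result
squared_deviations = [(x - mean) ** 2 for x in heights]

# Sum the squares, then divide by the size of the dataset
sum_of_squares = sum(squared_deviations)
variance = sum_of_squares / len(heights)

print(round(sum_of_squares, 2), round(variance, 2))
```

Because the differences are squared, the variance is expressed in squared units (cm² here), not cm.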
Standard Deviation:
Finally, taking the square root of the variance gives the population standard deviation: √88.96 ≈ 9.43 cm.
The example above calculates the population standard deviation. If we instead treat these heights as a sample, we subtract 1 from the sample size, so the denominator becomes 5 - 1 = 4. The sample variance is then 444.8 / 4 = 111.2 cm², and the sample standard deviation is √111.2 ≈ 10.55 cm.
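Python's standard library covers both cases, so the hand calculation can be cross-checked. In this sketch, `statistics.pstdev` treats the data as a whole population and `statistics.stdev` treats it as a sample. The variable names are illustrative.

```python
import statistics

heights = [174, 180, 190, 195, 170]  # heights in cm, from the example

sd_population = statistics.pstdev(heights)  # divides by n
sd_sample = statistics.stdev(heights)       # divides by n - 1

print(round(sd_population, 2), round(sd_sample, 2))
```

The sample value is always the larger of the two, because it divides the same sum of squares by a smaller number.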