Z Score - Normal Distribution
We are going to discuss the Z score in depth. Before that, we need to understand what a normal distribution and a standard normal distribution are.
What is a distribution?
A distribution in statistics is a function that shows the possible values for a variable and how often they occur. The variable can be any measured quantity, such as the age, height, or weight of people.
What is a Normal distribution?
The normal distribution is a distribution that is symmetric about the mean (the mean is the average of all the observations). Most of the observations in a normal distribution are clustered around the mean, and they become rarer the further they are from it.
What is a standard normal distribution?
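The standard normal distribution is a normal distribution whose mean is 0 and whose standard deviation is 1.
A Z score can be calculated for any observation once the mean and standard deviation are known, but it maps onto the probabilities of the standard normal distribution only when the observations are approximately normally distributed.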
What is a Z score?
A Z-score is a numerical measurement that describes a value’s relationship to the mean of a group of values. Z-score is measured in terms of standard deviations from the mean.
Example
Suppose there are three students whose marks in their English examination are 12, 16, and 23. The mean is (12 + 16 + 23) / 3 = 17.
What is Standard deviation?
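Standard deviation is a measure of how much the members of a group differ from the mean value for the group. In the above example, the mean is 17 and the observations are 12, 16, and 23. Their deviations from the mean are -5, -1, and 6, which gives a population standard deviation of about 4.55 (or about 5.57 with the sample formula, which divides by n - 1 instead of n).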
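As a quick check of the example above, here is a minimal sketch using Python's standard `statistics` module. The post does not say whether the population or the sample formula is intended, so both are shown; the variable names are only illustrative.

```python
import statistics

# Marks of the three students from the example
marks = [12, 16, 23]

mean = statistics.mean(marks)             # (12 + 16 + 23) / 3
population_sd = statistics.pstdev(marks)  # divides the squared deviations by n
sample_sd = statistics.stdev(marks)       # divides the squared deviations by n - 1

print("mean:", mean)
print("population standard deviation:", round(population_sd, 2))  # about 4.55
print("sample standard deviation:", round(sample_sd, 2))          # about 5.57
```

Which of the two to use depends on whether the three marks are the whole group or a sample from a larger one.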
How to calculate the Z score?
z = (data point - mean) / standard deviation
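The formula can be sketched in a few lines of Python. This is an illustration rather than a reference implementation: the helper name `z_score` is made up here, and the population standard deviation is assumed.

```python
import statistics

def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies from `mean`."""
    return (value - mean) / sd

marks = [12, 16, 23]
mean = statistics.mean(marks)
sd = statistics.pstdev(marks)  # assumption: population standard deviation

z_scores = [z_score(m, mean, sd) for m in marks]

for mark, z in zip(marks, z_scores):
    # Negative z: below the mean; positive z: above the mean
    print(mark, round(z, 2))
```

With these marks, 12 and 16 get negative Z scores because they are below the mean of 17, and 23 gets a positive one.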
Z score Applications:
1. Standardization
In machine learning, we use the Z score for standardization: each feature is rescaled so that it has mean = 0 and standard deviation = 1. If the feature is normally distributed, the result follows the standard normal distribution; standardization changes the scale, not the shape, of the distribution.
Ex: if you have features like age (yrs), height (inch), and weight (kg), which are on different scales, standardization converts all of them to a common scale.
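A minimal sketch of standardization with the standard library is shown below. The feature values are invented purely for illustration, and `standardize` is a hypothetical helper, not a library function. It assumes the population standard deviation.

```python
import statistics

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 using Z scores."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    return [(v - mean) / sd for v in values]

# Hypothetical data: each feature is on a different scale
features = {
    "age_years": [25, 32, 47, 51, 38],
    "height_inches": [64, 70, 68, 72, 66],
    "weight_kg": [58, 82, 75, 90, 67],
}

scaled = {name: standardize(column) for name, column in features.items()}

for name, column in scaled.items():
    # After scaling, every feature has mean ~0 and standard deviation ~1
    print(name, [round(v, 2) for v in column])
```

After this step the three features are directly comparable, because each value is expressed in standard deviations from its own feature's mean.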
2. Comparing scores from different distributions
If we have two scores that come from different distributions, the Z score helps identify which one is better relative to its own group.
Ex: if person 1 scores 76 in English with a mean of 72 and std 1.2, and person 2 scores 80 in maths with a mean of 79 and std 1.5, can you tell whose score is better?
Person 1's Z score is (76 - 72) / 1.2 ≈ 3.33 and person 2's is (80 - 79) / 1.5 ≈ 0.67. Although person 2 has the higher raw score, person 1 scored much further above the subject mean, so person 1 performed better relative to the group.
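The comparison in this example can be worked through in Python as a small sketch. The numbers come from the example above; the variable names are only illustrative.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies from `mean`."""
    return (value - mean) / sd

# Person 1: English, score 76, mean 72, std 1.2
z_english = z_score(76, 72, 1.2)

# Person 2: maths, score 80, mean 79, std 1.5
z_maths = z_score(80, 79, 1.5)

print("person 1 (English):", round(z_english, 2))  # about 3.33
print("person 2 (maths):", round(z_maths, 2))      # about 0.67

# The higher Z score is the better result relative to its own group
better = "person 1" if z_english > z_maths else "person 2"
print("better relative score:", better)
```

Person 2 has the higher raw score, but person 1 is more than three standard deviations above the English mean, so person 1 has the stronger relative result.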
3. Outlier detection
Z scores can also be used for outlier detection. A common rule of thumb is that an observation with a Z score less than -3 or greater than 3 may be considered an outlier.
What is an outlier?
An outlier is a value that differs significantly from the other values in the data.
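The rule of thumb of flagging Z scores beyond -3 or 3 can be sketched as follows. The data set is invented for illustration, `find_outliers` is a hypothetical helper, and the population standard deviation is assumed.

```python
import statistics

def find_outliers(values, threshold=3.0):
    """Return the values whose Z score is below -threshold or above threshold."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    return [v for v in values if abs((v - mean) / sd) > threshold]

# Hypothetical data: 19 values close to 50 and one extreme value
data = [50, 52, 49, 51, 48, 50, 53, 47, 50, 51,
        49, 52, 50, 48, 51, 50, 49, 52, 51, 120]

outliers = find_outliers(data)
print(outliers)  # only the extreme value 120 is flagged
```

Note that this rule needs a reasonably large data set: with very few observations, a single extreme value inflates the standard deviation so much that no Z score can exceed 3.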
The links below give a good understanding of the Z score:
1. https://www.khanacademy.org/math/ap-statistics/density-curves-normal-distribution-ap/measuring-position/a/z-scores-problem
2. https://www.youtube.com/watch?v=4Fta6KQ1QHQ
3. https://www.youtube.com/watch?v=MicmZlGfGJg