Normalization and Standardization
For any machine learning use case, the most important ingredient is data. You start by collecting data, and that data usually has many features. Some of them are independent features and one is the dependent feature; in supervised machine learning, we use the independent features to predict the dependent feature.
Each of these features has 2 important properties:
1. Unit
2. Magnitude
Let's take features such as a person's age, height, and weight. For the feature age, the unit is the number of years and the magnitude is the value itself.
For example: in "25 years", 25 is the magnitude and years is the unit.
Every feature is measured with its own unit and magnitude, so a dataset with many features mixes different units and very different value ranges. Many machine learning algorithms are sensitive to these differences, so it is usually important to bring the features onto a comparable scale before giving the data to the algorithm.
For this type of problem, we use 2 main techniques.
1. Normalization: Normalization rescales a feature to the range 0 to 1.
2. Standardization: Standardization rescales a feature so that it has a mean of 0 and a standard deviation of 1, like a standard normal distribution.
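As a rough sketch of what standardization does, each value can be turned into a z-score, z = (x - mean) / standard deviation. The helper name `standardize`, the sample heights, the use of the population standard deviation, and the handling of a constant feature are all assumptions made for illustration, not something taken from a particular library.

```python
import statistics


def standardize(values):
    """Return z-scores: (x - mean) / standard deviation."""
    mean = statistics.fmean(values)
    # Population standard deviation (an assumption for this sketch).
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature has no spread; map every value to 0.0 (assumption).
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]


# Made-up heights in centimetres, used only for illustration.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
z_scores = standardize(heights_cm)

print([round(z, 3) for z in z_scores])
# After standardization the mean is (numerically) 0 and the standard deviation is 1.
print(statistics.fmean(z_scores), statistics.pstdev(z_scores))
```

Note that this only shifts and rescales the values; it does not change the shape of the feature's distribution.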
Let us discuss Normalization and Standardization.
Normalization (Min-Max normalization)
In this approach, we rescale the values of each feature to the range 0 to 1.
X_norm = (X - X_min) / (X_max - X_min)
from sklearn.preprocessing import MinMaxScaler
# By default, MinMaxScaler rescales each feature to the range [0, 1]
scaling = MinMaxScaler()
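To see the formula at work without any library, here is a minimal plain-Python sketch. The function name `min_max_normalize`, the sample ages, and the choice to return 0.0 for a constant feature are assumptions made for illustration.

```python
def min_max_normalize(values):
    """Apply X_norm = (X - X_min) / (X_max - X_min) to a list of numbers."""
    x_min = min(values)
    x_max = max(values)
    value_range = x_max - x_min
    if value_range == 0:
        # All values are equal, so the formula would divide by zero.
        # Returning 0.0 for every value is an assumption of this sketch.
        return [0.0 for _ in values]
    return [(x - x_min) / value_range for x in values]


# Made-up ages in years, used only for illustration.
ages = [18, 25, 40, 60]
normalized_ages = min_max_normalize(ages)

# The smallest age maps to 0.0 and the largest to 1.0.
print([round(v, 3) for v in normalized_ages])
```

With scikit-learn, the scaler created above would typically be applied to a 2-D array of features with scaling.fit_transform(data), which learns the minimum and maximum of each column and rescales it in one step.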