Normalization and Standardization Use Case

Case study:

 We have a used cars dataset taken from a website.

  • This dataset contains information about used cars.
  • This data can be used for many purposes, such as price prediction, which is a common way to illustrate linear regression in machine learning.
  • The columns in the given dataset are as follows: name, year, selling_price, km_driven, fuel, distance, seller_type, transmission, Owner

For used motorcycle datasets please go to https://www.kaggle.com/nehalbirla/motorcycle-dataset


Using the features above, we want to predict the selling price of a car. The features km_driven and distance are on very different scales. If we load them into a model as they are, the learned slopes (coefficients) are not directly comparable, and the model's results can be misinterpreted.
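To make the point about slopes concrete, here is a minimal sketch in plain Python. The helper ols_slope and the numbers are made up for illustration and do not come from the dataset. It fits a one-feature least-squares line twice, once with the odometer reading in kilometres and once in thousands of kilometres, and shows that the slope changes with the unit even though the underlying relationship is the same.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x (single feature)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var

# Hypothetical values: price drops by 2 units for every km driven
km_driven = [10000, 35000, 60000, 110000]
selling_price = [500000 - 2 * k for k in km_driven]

# Same data, but the feature is expressed in thousands of km
thousand_km = [k / 1000 for k in km_driven]

slope_km = ols_slope(km_driven, selling_price)
slope_thousand_km = ols_slope(thousand_km, selling_price)

print(slope_km, slope_thousand_km)
```

The two slopes differ by a factor of 1000 purely because of the unit, which is why coefficients of unscaled features are hard to compare with each other.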

To overcome this, we will rescale these features so that their values lie between 0 and 1.
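As a rough sketch of what rescaling to the 0 to 1 range means, the min-max formula x_scaled = (x - min) / (max - min) can be written by hand. The function name min_max_scale and the odometer values below are assumptions for illustration, not part of the dataset.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Made-up odometer readings, for illustration only
km_driven = [10000, 35000, 60000, 110000]
scaled_km = min_max_scale(km_driven)
print(scaled_km)  # [0.0, 0.25, 0.5, 1.0]
```

The smallest value maps to 0, the largest to 1, and everything else falls in between in proportion to its position in the original range.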


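import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Assumes the dataset is already loaded into a DataFrame named df
cols = ['distance', 'km_driven']
Minscaler = MinMaxScaler()
scaler = Minscaler.fit(df[cols])   # fit needs 2-D data, not column names
print(scaler.data_min_)            # per-column minimum seen during fit
print(scaler.data_max_)            # per-column maximum seen during fit
# transform applies the scaling; wrap the result back into a DataFrame
X_scaled = pd.DataFrame(scaler.transform(df[cols]), columns=cols)
X_scaled.describe()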

Here scikit-learn provides a class called MinMaxScaler in sklearn.preprocessing. We create an instance
of MinMaxScaler named Minscaler and fit it on the distance and km_driven features. Fitting only learns
each column's minimum and maximum; the transform step then maps every value into the range 0 to 1
using the normalization formula x_scaled = (x - min) / (max - min).

Then verify the result with the describe function: the minimum of each scaled column should be 0 and the maximum should be 1.
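The whole flow can be tried end to end on a few made-up rows. This is only a sketch: the DataFrame name cars and its values are hypothetical stand-ins for the real dataset, and fit_transform is used here as a shorthand for calling fit followed by transform.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical rows standing in for the real used cars dataset
cars = pd.DataFrame({
    "km_driven": [10000, 35000, 60000, 110000],
    "distance": [12.0, 48.0, 30.0, 84.0],
})

cols = ["km_driven", "distance"]
scaler = MinMaxScaler()

# fit_transform learns the min and max of each column, then rescales it
scaled = pd.DataFrame(scaler.fit_transform(cars[cols]), columns=cols)

# The learned per-column minimum and maximum of the original data
print(scaler.data_min_)
print(scaler.data_max_)

# After scaling, every column should span exactly 0 to 1
summary = scaled.describe().loc[["min", "max"]]
print(summary)
```

If the scaler is later applied to new rows, calling scaler.transform on them reuses the minimum and maximum learned here rather than recomputing them.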
