Feature Scaling in Machine Learning

Feature scaling is a technique for bringing the independent features in a dataset onto a comparable scale, often a fixed range. It is performed during data preprocessing to handle features whose magnitudes, ranges, or units vary widely. Without feature scaling, many machine learning algorithms, particularly distance-based and gradient-based ones, tend to give more weight to features with larger values and less weight to features with smaller values, regardless of the units involved.

Asad Mujeeb

3 min read · Apr 12, 2024
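To see why this matters, here is a minimal sketch with two made-up samples (the ages and incomes are assumptions chosen purely for illustration). It measures how much of the squared Euclidean distance between them comes from the income column alone.

```python
import numpy as np

# Hypothetical samples: [age in years, income in dollars]
a = np.array([25.0, 50_000.0])
b = np.array([60.0, 51_000.0])

# Raw Euclidean distance between the two unscaled samples
raw_distance = np.linalg.norm(a - b)

# Share of the squared distance contributed by the income feature
squared_diffs = (a - b) ** 2
income_share = squared_diffs[1] / squared_diffs.sum()

print(raw_distance)
print(income_share)
```

Almost all of the distance comes from income simply because its numbers are larger, even though the age gap of 35 years is arguably the bigger difference. Scaling is meant to reduce this kind of imbalance.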

This article covers four methods of feature scaling :

i- MinMax Scaler

ii- Standard Scaler

iii- Robust Scaler

iv- Logarithmic Scaler

i- MinMax Scaler

The MinMax Scaler rescales each value into the range [0, 1]. The formula is :

x' = (x - min(x)) / (max(x) - min(x))

# import the libraries
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# create some dummy data
data = {"values": [10, 20, 30, 40, 50]}

# convert it into a DataFrame
df = pd.DataFrame(data)

# create an instance of MinMaxScaler
scaler = MinMaxScaler()

# fit the scaler and store the scaled values in a new column
# (scikit-learn expects 2D input, hence the double brackets)
df["scaled_value"] = scaler.fit_transform(df[["values"]]).ravel()

print(df)
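As a quick sanity check, min-max scaling can also be done by hand with NumPy: subtract the minimum, then divide by the range. This is only a sketch of the arithmetic on the same dummy data, not how scikit-learn implements it internally.

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50], dtype=float)

# x' = (x - min(x)) / (max(x) - min(x))
scaled = (values - values.min()) / (values.max() - values.min())

# The smallest value maps to 0, the largest to 1:
# 0, 0.25, 0.5, 0.75, 1
print(scaled)
```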

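ii- Standard Scaler

The Standard Scaler (standardization, or z-score scaling) centers each feature at a mean of 0 and scales it to a standard deviation of 1. The formula is :

x' = (x - mean(x)) / std(x)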

# import the libraries
import pandas as pd
from sklearn.preprocessing import StandardScaler

# create some dummy data
data = {"values": [10, 20, 30, 40, 50]}

# convert it into a DataFrame
df = pd.DataFrame(data)

# create an instance of StandardScaler
scaler = StandardScaler()

# fit the scaler and store the scaled values in a new column
# (scikit-learn expects 2D input, hence the double brackets)
df["scaled_value"] = scaler.fit_transform(df[["values"]]).ravel()

print(df)
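Standardization subtracts the mean and divides by the standard deviation, and this can be worked through by hand as well. The sketch below assumes the population standard deviation (ddof=0), which is what StandardScaler uses. Note that pandas' Series.std() defaults to ddof=1 and would give slightly different numbers.

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50], dtype=float)

mean = values.mean()       # 30.0
std = values.std(ddof=0)   # population standard deviation, sqrt(200)

# x' = (x - mean(x)) / std(x)
z = (values - mean) / std

print(z)
print(z.mean(), z.std())   # approximately 0 and 1
```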

iii- Robust Scaler

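In this method, we use two statistical measures of the data :

i- Median

ii- Interquartile Range (IQR)

Formula :

x' = (x - median(x)) / IQR

Here IQR = Q3 - Q1, the difference between the 75th and 25th percentiles. Because the median and the IQR are barely affected by extreme values, this method is less sensitive to outliers than the MinMax and Standard scalers.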

# import the libraries
import pandas as pd
from sklearn.preprocessing import RobustScaler

# create some dummy data
data = {"values": [10, 20, 30, 40, 50]}

# convert it into a DataFrame
df = pd.DataFrame(data)

# create an instance of RobustScaler
scaler = RobustScaler()

# fit the scaler and store the scaled values in a new column
# (scikit-learn expects 2D input, hence the double brackets)
df["scaled_value"] = scaler.fit_transform(df[["values"]]).ravel()

print(df)
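To see what "robust" means in practice, the sketch below applies the median/IQR formula by hand to the dummy data and to a copy where the last value is replaced by an outlier (1000, an assumed value for illustration). A hand-written min-max scaler is included for comparison. The helper names robust_scale and min_max_scale are hypothetical, and np.percentile with its default linear interpolation is assumed to stand in for the quartiles.

```python
import numpy as np

def robust_scale(x):
    """Hypothetical helper: (x - median) / IQR."""
    median = np.median(x)
    q1, q3 = np.percentile(x, [25, 75])
    return (x - median) / (q3 - q1)

def min_max_scale(x):
    """Hypothetical helper: (x - min) / (max - min)."""
    return (x - x.min()) / (x.max() - x.min())

clean = np.array([10, 20, 30, 40, 50], dtype=float)
with_outlier = np.array([10, 20, 30, 40, 1000], dtype=float)

print(robust_scale(clean))          # median 30, IQR 20
print(robust_scale(with_outlier))   # median and IQR are unchanged
print(min_max_scale(with_outlier))  # the range is now 990
```

With the outlier present, the first four robust-scaled values stay at -1, -0.5, 0 and 0.5, while min-max scaling squeezes them all below 0.05.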

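iv- Logarithmic Scaler

This method replaces each value with its logarithm, x' = log(x), which compresses large values and is useful for right-skewed data. It requires strictly positive values.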

# import the libraries
import pandas as pd
import numpy as np

# create some dummy data
data = {"values": [10, 20, 30, 40, 50]}

# convert it into a DataFrame
df = pd.DataFrame(data)

# add the natural, base-2, and base-10 logarithms as new columns
df["log"] = np.log(df["values"])
df["log2"] = np.log2(df["values"])
df["log10"] = np.log10(df["values"])

print(df)
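Logarithms are undefined at zero, so np.log returns -inf for a zero entry. One common workaround, sketched below on assumed example data, is np.log1p, which computes log(1 + x), together with np.expm1 to undo the transform.

```python
import numpy as np

# Assumed example data: right-skewed and containing a zero
values = np.array([0, 1, 10, 100, 1000], dtype=float)

# log(1 + x) is defined at x = 0, unlike log(x)
log_values = np.log1p(values)

# expm1 computes exp(x) - 1, which reverses log1p
restored = np.expm1(log_values)

print(log_values)
print(restored)
```

The transformed values keep their order but span a much narrower range, and the original values can be recovered whenever they are needed.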
