Feature Encoding in Machine Learning

Apr 12, 2024

Most machine learning models can only work with numerical values. For this reason, the categorical values of the relevant features have to be transformed into numerical ones. This process is called feature encoding. Data frame analytics performs feature encoding automatically. When encoding by hand, there are three common ways to do so:

i- One Hot Encoding

import pandas as pd

# Sample data: one categorical column
data = {"colors": ["red", "green", "blue", "red"]}

df = pd.DataFrame(data)

# Create one indicator column per distinct color
encoded_data = pd.get_dummies(df, columns=["colors"])

print(encoded_data)

ii- Label Encoder

import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Sample data: one categorical column
data = {"Animals": ["dog", "cow", "lion", "crow", "sparrow"]}

df = pd.DataFrame(data)

label_encoder = LabelEncoder()

# Replace each distinct label with an integer code
df["Animal encoded"] = label_encoder.fit_transform(df["Animals"])

print(df)
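
It helps to see which integer each label receives. The sketch below reuses the same animal list and inspects the fitted encoder; the `mapping` and `decoded` names are only illustrative. LabelEncoder sorts the labels, so the codes follow alphabetical order and carry no real ranking. The scikit-learn documentation also describes LabelEncoder as intended for target labels rather than input features.

```python
from sklearn.preprocessing import LabelEncoder

animals = ["dog", "cow", "lion", "crow", "sparrow"]

label_encoder = LabelEncoder()
codes = label_encoder.fit_transform(animals)

# classes_ holds the sorted unique labels; a label's code is its position there
mapping = {label: code for code, label in enumerate(label_encoder.classes_)}
print(mapping)

# inverse_transform recovers the original labels from the codes
decoded = list(label_encoder.inverse_transform(codes))
print(decoded)
```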

iii- Ordinal Encoding

import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Sample data
data = {"Size": ["Small", "Medium", "Large", "Medium"]}
df = pd.DataFrame(data)

# Ordinal Encoding: list the categories from smallest to largest
ordinal_encoder = OrdinalEncoder(categories=[["Small", "Medium", "Large"]])
# fit_transform returns a 2D array; ravel() flattens it into a single column
df["Size_encoded"] = ordinal_encoder.fit_transform(df[["Size"]]).ravel()
print(df)
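
The categories argument is what makes the encoding ordinal. As a small sketch on the same sample data, here is what happens with and without it; the `default_codes` and `ordered_codes` names are only illustrative:

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({"Size": ["Small", "Medium", "Large", "Medium"]})

# Without categories, the encoder sorts the values alphabetically:
# Large, Medium, Small
default_codes = OrdinalEncoder().fit_transform(df[["Size"]]).ravel()

# With an explicit list, the codes follow the order we give
ordered_encoder = OrdinalEncoder(categories=[["Small", "Medium", "Large"]])
ordered_codes = ordered_encoder.fit_transform(df[["Size"]]).ravel()

print(default_codes)
print(ordered_codes)
```

With the default, "Large" gets the lowest code and "Small" the highest, which reverses the natural order of the sizes. Passing the list explicitly keeps Small < Medium < Large in the encoded column.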

Written by Asad Mujeeb