How do I apply label encoding on multiple columns?

Tags : python datscience

Sindhuja Martha

181

How do I apply label encoding on multiple columns?

August 13, 2021 3:49 PM IST

0
Samar Patil

346 3
If your DataFrame contains both numerical and categorical columns, you can encode only the categorical ones. Here X is a DataFrame with both kinds of variables:
```
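from sklearn import preprocessing
le = preprocessing.LabelEncoder()

# Encode only the object-dtype (categorical) columns; numeric columns are left as-is.
for col in X.columns:
    if X[col].dtype == 'object':
        # fit_transform refits the encoder, so each column gets its own mapping
        X[col] = le.fit_transform(X[col])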
```
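As an alternative to looping, scikit-learn's OrdinalEncoder can encode several feature columns in one call. This is a different technique from the loop above, shown only as a minimal sketch; the DataFrame contents and column names are made up for illustration.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical example data: two categorical columns and one numeric column.
X = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "size": ["S", "L", "M", "S"],
    "price": [10.5, 20.0, 15.25, 11.0],
})

# Pick out the non-numeric columns and encode them in a single call.
cat_cols = X.select_dtypes(exclude="number").columns
encoder = OrdinalEncoder()
X[cat_cols] = encoder.fit_transform(X[cat_cols])

# encoder.categories_ holds one array of categories per encoded column,
# so the original values can be recovered with encoder.inverse_transform.
print(X)
```

Because a single fitted OrdinalEncoder stores the categories of every column, it can also be reused on new data with `encoder.transform`.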
August 21, 2021 11:58 AM IST

0
Ananthesh Kunjathaya B R

13
```
from sklearn.preprocessing import LabelEncoder
df.apply(LabelEncoder().fit_transform)  # returns a new DataFrame with every column encoded
```
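Note that the one-liner encodes every column, numeric ones included. A minimal sketch of restricting it to the categorical columns, assuming a small made-up DataFrame (the column names are hypothetical):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data with two categorical columns and one numeric column.
df = pd.DataFrame({
    "color": ["red", "blue", "red"],
    "shape": ["square", "circle", "circle"],
    "weight": [1.5, 2.0, 0.7],
})

# Select the non-numeric columns and encode only those.
cat_cols = df.select_dtypes(exclude="number").columns
df[cat_cols] = df[cat_cols].apply(LabelEncoder().fit_transform)
print(df)
```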
August 24, 2021 11:47 PM IST

0

Viaan Prakash

461

I don't think the large dataset is affecting your outcome. The purpose of LabelEncoder is to transform the prediction targets (in your case, I assume, the Target column). From the User Guide:

LabelEncoder is a utility class to help normalize labels such that they contain only values between 0 and n_classes-1.
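A minimal illustration of that statement, using made-up labels rather than anything from the question:

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
labels = ["paris", "tokyo", "paris", "amsterdam"]  # made-up labels
encoded = le.fit_transform(labels)

print(list(le.classes_))  # the sorted unique labels
print(list(encoded))      # integers in the range 0..n_classes-1
```

The position of each label in `le.classes_` is the integer it is mapped to.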

Here's an example. Note that I changed the values of Target in your example CountryDF, just for demonstration purposes:

```
from sklearn.preprocessing import LabelEncoder
import numpy as np
import pandas as pd

CountryDF = pd.DataFrame(
    [['CN_Milk powder_Incl_Others', np.nan, 'Shanghai Hyper total', 'O.Brand', np.nan, np.nan, 'Hi Cal Adult Milk Powders- C1'],
     ['CN_Milk powder_Incl_Others', 'Elder', 'Shanghai Hyper total', 'O.Brand', np.nan, np.nan, 'Hi Cal Adult Milk Powders- C1'],
     ['CN_Milk powder_Incl_Others', 'Others', 'Shanghai Hyper total', 'O.Brand', np.nan, np.nan, 'Hi Cal Adult Milk Powders- C1'],
     ['CN_Milk powder_Incl_Others', 'Lady', 'Shanghai Hyper total', 'O.Brand', np.nan, np.nan, 'Hi Cal Adult Milk Powders- C1'],
     ['CN_Milk powder_Incl_Others', np.nan, 'Shanghai Hyper total', 'O.Brand', 'S_B1', np.nan, 'Hi Cal Adult Milk Powders- C1'],
     ['CN_Milk powder_Incl_Others', np.nan, 'Shanghai Hyper total', 'O.Brand', 'S_B2', np.nan, 'Hi Cal Adult Milk Powders- C1']],
    columns=['Database', 'Target', 'Market_Description', 'Brand', 'Sub_Brand', 'Category', 'Class_Category'])
```

First, initialize the LabelEncoder, then fit and transform the data (while assigning the transformed data to a new column).

```
le = LabelEncoder()  # initialize the LabelEncoder once

# Create a new column with the transformed values.
CountryDF['EncodedTarget'] = le.fit_transform(CountryDF['Target'])
```

Notice that the last column, EncodedTarget, is an encoded copy of Target (the first four rows are shown).

```
CountryDF

   Database                    Target  Market_Description    Brand    Sub_Brand  Category  Class_Category                 EncodedTarget
0  CN_Milk powder_Incl_Others  NaN     Shanghai Hyper total  O.Brand  NaN        NaN       Hi Cal Adult Milk Powders- C1  0
1  CN_Milk powder_Incl_Others  Elder   Shanghai Hyper total  O.Brand  NaN        NaN       Hi Cal Adult Milk Powders- C1  1
2  CN_Milk powder_Incl_Others  Others  Shanghai Hyper total  O.Brand  NaN        NaN       Hi Cal Adult Milk Powders- C1  3
3  CN_Milk powder_Incl_Others  Lady    Shanghai Hyper total  O.Brand  NaN        NaN       Hi Cal Adult Milk Powders- C1  2
```
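The Target column in this example contains missing values, and the integer that NaN receives may depend on the scikit-learn version in use. One option is to replace missing values with an explicit placeholder before encoding. A minimal sketch, assuming a placeholder string "Missing" (my own choice, not part of the original answer):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Same Target values as in the CountryDF example.
target = pd.Series([np.nan, "Elder", "Others", "Lady", np.nan, np.nan], name="Target")

# Replace NaN with an explicit placeholder so it becomes an ordinary category.
filled = target.fillna("Missing")  # "Missing" is an assumed placeholder name

le = LabelEncoder()
encoded = le.fit_transform(filled)

print(list(le.classes_))
print(list(encoded))
```

With the placeholder in place, every value is a string, so the mapping follows plain alphabetical order of the categories.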

I hope this helps clear up LabelEncoder.

August 14, 2021 12:49 PM IST

Tarun Reddy

84
If your DataFrame contains both numerical and categorical columns, you can encode only the categorical ones. Here X is a DataFrame with both kinds of variables:
```
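from sklearn import preprocessing
le = preprocessing.LabelEncoder()

# Encode only the object-dtype (categorical) columns; numeric columns are left as-is.
for col in X.columns:
    if X[col].dtype == 'object':
        # fit_transform refits the encoder, so each column gets its own mapping
        X[col] = le.fit_transform(X[col])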
```
Note: This technique is fine if you do not need to convert the encoded values back, because the single encoder only remembers the mapping of the last column it was fitted on.
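If you do need to convert the values back, one possible approach is to keep a separate fitted encoder for each column. A minimal sketch, assuming a made-up DataFrame; the `encoders` dictionary and the column names are hypothetical:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical DataFrame with mixed column types.
X = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune"],
    "grade": ["B", "A", "C"],
    "score": [71, 88, 64],
})

# Keep one fitted encoder per categorical column.
encoders = {}
for col in X.select_dtypes(exclude="number").columns:
    encoders[col] = LabelEncoder()
    X[col] = encoders[col].fit_transform(X[col])

# Later, map the integers in a column back to the original strings.
original_city = encoders["city"].inverse_transform(X["city"])
print(list(original_city))
```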
August 14, 2021 10:07 PM IST
