df.apply(LabelEncoder().fit_transform)
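As a rough illustration of what this one-liner does, here is a minimal sketch on a small made-up DataFrame (the column names and values are hypothetical, and there are no missing values). Each column is passed to fit_transform separately, so every column gets its own 0-based integer codes, assigned in the sorted order of its values.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data: two string columns, no missing values.
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "size": ["S", "M", "L", "S"],
})

# apply() calls fit_transform once per column, refitting the encoder each time,
# so the codes of one column are unrelated to the codes of another.
encoded = df.apply(LabelEncoder().fit_transform)
print(encoded)
```

Because the same encoder object is refit for every column, only the mapping of the last column is still available afterwards.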
from sklearn.preprocessing import LabelEncoder
import numpy as np
import pandas as pd

CountryDF = pd.DataFrame([['CN_Milk powder_Incl_Others',np.nan,'Shanghai Hyper total','O.Brand',np.nan,np.nan,'Hi Cal Adult Milk Powders- C1'],
                          ['CN_Milk powder_Incl_Others','Elder','Shanghai Hyper total','O.Brand',np.nan,np.nan,'Hi Cal Adult Milk Powders- C1'],
                          ['CN_Milk powder_Incl_Others','Others','Shanghai Hyper total','O.Brand',np.nan,np.nan,'Hi Cal Adult Milk Powders- C1'],
                          ['CN_Milk powder_Incl_Others','Lady','Shanghai Hyper total','O.Brand',np.nan,np.nan,'Hi Cal Adult Milk Powders- C1'],
                          ['CN_Milk powder_Incl_Others',np.nan,'Shanghai Hyper total','O.Brand','S_B1',np.nan,'Hi Cal Adult Milk Powders- C1'],
                          ['CN_Milk powder_Incl_Others',np.nan,'Shanghai Hyper total','O.Brand','S_B2',np.nan,'Hi Cal Adult Milk Powders- C1']],
                         columns=['Database','Target','Market_Description','Brand','Sub_Brand', 'Category','Class_Category'])
First, initialize the LabelEncoder, then fit and transform the data (while assigning the transformed data to a new column).
le = LabelEncoder()  # initialize the LabelEncoder once
# Create a new column with the transformed values.
CountryDF['EncodedTarget'] = le.fit_transform(CountryDF['Target'])
Notice that the last column, EncodedTarget, is an integer-encoded copy of Target.
CountryDF
   Database                    Target  Market_Description    Brand    Sub_Brand  Category  Class_Category                 EncodedTarget
0  CN_Milk powder_Incl_Others  NaN     Shanghai Hyper total  O.Brand  NaN        NaN       Hi Cal Adult Milk Powders- C1  0
1  CN_Milk powder_Incl_Others  Elder   Shanghai Hyper total  O.Brand  NaN        NaN       Hi Cal Adult Milk Powders- C1  1
2  CN_Milk powder_Incl_Others  Others  Shanghai Hyper total  O.Brand  NaN        NaN       Hi Cal Adult Milk Powders- C1  3
3  CN_Milk powder_Incl_Others  Lady    Shanghai Hyper total  O.Brand  NaN        NaN       Hi Cal Adult Milk Powders- C1  2
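Depending on the scikit-learn version, a column that mixes NaN with strings may be encoded differently or rejected, so one option is to replace NaN with an explicit placeholder before fitting. The sketch below assumes an empty string is an acceptable placeholder; the small target Series is a stand-in that mirrors the Target column above. It also shows classes_ and inverse_transform, which map the integer codes back to labels.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Stand-in for CountryDF['Target'] from the example above.
target = pd.Series([np.nan, 'Elder', 'Others', 'Lady', np.nan, np.nan])

le = LabelEncoder()
# Assumption: '' is used as a placeholder so that every value is a string.
codes = le.fit_transform(target.fillna(''))

# Labels are stored in sorted order; a label's position is its integer code.
print(list(le.classes_))
print(codes.tolist())

# inverse_transform returns the labels (including the '' placeholder) for the codes.
print(list(le.inverse_transform(codes)))
```

With this placeholder the missing values get code 0 and Elder, Lady and Others get 1, 2 and 3, which matches the table above.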
I hope this helps clear up LabelEncoder.
If your DataFrame contains both numerical and categorical data, you can use the following. Here X is my DataFrame, which has both categorical and numerical variables:
from sklearn import preprocessing

le = preprocessing.LabelEncoder()
# Encode only the object (string) columns; numeric columns are left untouched.
for i in range(0, X.shape[1]):
    if X.dtypes.iloc[i] == 'object':
        X[X.columns[i]] = le.fit_transform(X[X.columns[i]])
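Note: This technique is suitable only if you do not need to convert the codes back to the original labels. The single encoder is refit on every column, so afterwards it only remembers the mapping of the last column it encoded.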
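If the encoding does need to be reversed, one option is to keep a separate encoder for each column instead of reusing a single one. A minimal sketch, assuming a small hypothetical DataFrame X with no missing values; the encoders dictionary is an illustrative name, not part of scikit-learn, and the non-numeric columns are picked with pandas' is_numeric_dtype rather than the dtype comparison used above:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical mixed-type data: one categorical column, one numeric column.
X = pd.DataFrame({
    "city": ["Paris", "Lima", "Paris", "Oslo"],
    "sales": [10.5, 3.0, 7.25, 1.0],
})

encoders = {}  # column name -> LabelEncoder fitted on that column
for col in X.columns:
    if not pd.api.types.is_numeric_dtype(X[col]):
        encoders[col] = LabelEncoder()
        X[col] = encoders[col].fit_transform(X[col])

print(X)

# Each stored encoder can map its own column back to the original labels.
restored = encoders["city"].inverse_transform(X["city"])
print(list(restored))
```

The cost is one extra object per categorical column, but every mapping stays available for later decoding or for transforming new data with the same codes.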