I'm trying to create dummy variables for my input set, which has the following form: [image: my input set]
I label-encoded the categorical columns, so my array now has this form: [image: encoded input set]
Next, I would like to create dummy variables using OneHotEncoder. I know it used to work this way:
onehotencoder = OneHotEncoder(categorical_features = [1])
X = onehotencoder.fit_transform(X).toarray()
But the OneHotEncoder class now works a bit differently, and I can't figure out how to apply it to my dataset so that it behaves the same way. My code:
import numpy as np
import pandas as pd
dataset = pd.DataFrame(
{'RowNumber': [1, 2, 3, 4, 5],
'CustomerId': [602, 311, 304, 354, 888],
'Surname': ['Har', 'Hil', 'Oni', 'Bon', 'Mit'],
'CreditScore': [619, 608, 502, 699, 850],
'Geography': ['FR', 'ES', 'FR', 'FR', 'ES'],
'Gender': ['F', 'F', 'F', 'F', 'F'],
'Age': [42, 41, 42, 39, 43],
'Tenure': [2, 1, 8, 0, 2]})
X = dataset.iloc[:, 3:-1].values
y = dataset.iloc[:, -1].values
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
X[:, 1] = le.fit_transform(X[:, 1])
X[:, 2] = le.fit_transform(X[:, 2])
# Making dummy variables
from sklearn.preprocessing import OneHotEncoder
ohe = OneHotEncoder()
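For reference, the behaviour of the old `categorical_features=[1]` argument is commonly reproduced in newer scikit-learn releases by wrapping the encoder in a `ColumnTransformer`. The following is only a minimal sketch under a few assumptions: it uses a trimmed copy of the feature columns above, the transformer name `'geo'` is arbitrary, and `sparse_threshold=0` is set so that a dense array comes back (standing in for the old `.toarray()` call). It also skips the `LabelEncoder` step for Geography, since `OneHotEncoder` can take string categories directly.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Trimmed copy of the feature columns from the question (assumption:
# only the columns that end up in X are needed for the illustration).
features = pd.DataFrame(
    {'CreditScore': [619, 608, 502, 699, 850],
     'Geography': ['FR', 'ES', 'FR', 'FR', 'ES'],
     'Gender': ['F', 'F', 'F', 'F', 'F'],
     'Age': [42, 41, 42, 39, 43]})
X = features.values  # object array; column 1 is Geography

# One-hot encode column 1 only and pass the other columns through unchanged.
# 'geo' is just a label; sparse_threshold=0 forces a dense result.
ct = ColumnTransformer(
    transformers=[('geo', OneHotEncoder(), [1])],
    remainder='passthrough',
    sparse_threshold=0)
X_dummies = ct.fit_transform(X)

print(X_dummies.shape)
print(X_dummies[0])
```

In this sketch the dummy columns come first (one per Geography value, in sorted order), followed by the untouched CreditScore, Gender and Age columns.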
Thank you in advance!