    Car Model              Mileage  Sell Price($)  Age(yrs)
0   BMW X5                   69000          18000         6
1   BMW X5                   35000          34000         3
2   BMW X5                   57000          26100         5
3   BMW X5                   22500          40000         2
4   BMW X5                   46000          31500         4
5   Audi A5                  59000          29400         5
6   Audi A5                  52000          32000         5
7   Audi A5                  72000          19300         6
8   Audi A5                  91000          12000         8
9   Mercedez Benz C class    67000          22000         6
10  Mercedez Benz C class    83000          20000         7
11  Mercedez Benz C class    79000          21000         7
12  Mercedez Benz C class    59000          33000         5

Above is my data frame. I want to encode the "Car Model" column using a one-hot encoder.

  • Does this answer your question? [How can I one hot encode in Python?](https://stackoverflow.com/questions/37292872/how-can-i-one-hot-encode-in-python) – yudhiesh Jan 31 '21 at 06:27
  • @yudhiesh I tried looking up a few examples like these but could not fit them into my code. I am a beginner and hence have a lot of confusion. – Diksha Nasa Feb 02 '21 at 04:07
  • Looking at your question again, what do you mean by not using `categorical_features`? – yudhiesh Feb 02 '21 at 05:09
  • @yudhiesh `categorical_features` is a parameter used to tell the one-hot encoder which columns to encode. I found a lot of people using it in tutorials, but it throws an error now because it has been removed from recent versions of scikit-learn. – Diksha Nasa Feb 02 '21 at 05:32
  • I have added in an answer without using `scikit-learn`. – yudhiesh Feb 02 '21 at 05:40

2 Answers


You can one-hot encode the categorical variables like so.

# The column is named "Car Model", so it has to be selected with bracket syntax
one_hot_data = pd.get_dummies(df['Car Model'], prefix='Model')
df = pd.concat([df, one_hot_data], axis=1)
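
For reference, here is a self-contained sketch of the `get_dummies` route on a few rows of the question's data. The trimmed sample and the name `encoded` are chosen here for illustration, and `dtype=int` is an added assumption so that the indicator columns come out as 0/1 integers rather than booleans on newer pandas versions.

```python
import pandas as pd

# A few rows from the question's data frame (trimmed for brevity)
df = pd.DataFrame({
    "Car Model": ["BMW X5", "BMW X5", "Audi A5", "Mercedez Benz C class"],
    "Mileage": [69000, 35000, 59000, 67000],
    "Sell Price($)": [18000, 34000, 29400, 22000],
    "Age(yrs)": [6, 3, 5, 6],
})

# One indicator column per distinct car model
one_hot_data = pd.get_dummies(df["Car Model"], prefix="Model", dtype=int)

# Attach the indicator columns and drop the original text column
encoded = pd.concat([df.drop(columns="Car Model"), one_hot_data], axis=1)
print(encoded.columns.tolist())
```

Dropping the original "Car Model" column is optional, but most models cannot consume the raw strings, so it usually has to go before fitting.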

Edit:

import pandas as pd
from sklearn.preprocessing import LabelBinarizer

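jobs_encoder = LabelBinarizer()
jobs_encoder.fit(df['Car Model'])
transformed = jobs_encoder.transform(df['Car Model'])
# Name the new columns after the classes the binarizer found
ohe_df = pd.DataFrame(transformed, columns=jobs_encoder.classes_, index=df.index)
df = pd.concat([df, ohe_df], axis=1).drop(['Car Model'], axis=1)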
yudhiesh

I solved it by using the following code:

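import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# OneHotEncoder handles string categories directly, so LabelEncoder is not needed
ohe = OneHotEncoder()
encoded = ohe.fit_transform(df[['Car Model']]).toarray()
# One column per car model, named after the category it represents
ohe_df = pd.DataFrame(encoded, columns=ohe.categories_[0], index=df.index)
dfle = pd.concat([ohe_df, df[['Mileage', 'Age(yrs)']]], axis=1)
dfle.head()
X = dfle.values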
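
Since the comments mention that `categorical_features` no longer works, it may help to see the route scikit-learn points to instead: wrapping the encoder in a `ColumnTransformer`, which takes over the job of saying which columns get encoded. This is only a minimal sketch on a trimmed sample of the question's data; the step name `"car_model"` and the choice of `remainder="passthrough"` are assumptions made for illustration, not something taken from the answers above.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Trimmed sample of the question's feature columns
df = pd.DataFrame({
    "Car Model": ["BMW X5", "Audi A5", "Mercedez Benz C class", "Audi A5"],
    "Mileage": [69000, 59000, 67000, 52000],
    "Age(yrs)": [6, 5, 6, 5],
})

# Encode only "Car Model"; pass the numeric columns through unchanged.
# sparse_threshold=0 forces a dense NumPy array as output.
ct = ColumnTransformer(
    transformers=[("car_model", OneHotEncoder(), ["Car Model"])],
    remainder="passthrough",
    sparse_threshold=0,
)

# Encoded columns come first, followed by the passed-through columns
X = ct.fit_transform(df)
print(X.shape)
```

The list `["Car Model"]` plays the part that `categorical_features` used to play, and because the transformer is fitted once, the same `ct` can later be applied to new rows with `ct.transform(...)`.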