Specify the length of the one-hot encoder in scikit-learn

Question

In this case, the output of

from sklearn.preprocessing import OneHotEncoder
data = [[1], [3], [5]]
encoder = OneHotEncoder(sparse=False)  # the keyword is named sparse_output in scikit-learn 1.2 and later
encoder.fit(data)
print(encoder.fit_transform(data))

is

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Is it possible to get the following output instead, with one column for each value from 1 to 5?

[[1. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 1.]]

score 0 · Answer 1 · answered Jan 29 '20 at 12:31

This is not exactly what you asked for, but the output matches the data you provided. The code is a modified version of the one in "How can I one hot encode in Python?".

Code:

import numpy as np


data = [[1], [3], [5], [5], [3]]
#data = [[1], [3], [5]]
#data = [[1], [3], [2], [1]]


max_data = np.amax(data)
print(max_data)
print()


# np.eye needs one row per index 0..max, so nb_classes must be at least max + 1
nb_classes = max_data + 1


def indices_to_one_hot(data, nb_classes):
    targets = np.array(data).reshape(-1)
    return np.eye(nb_classes)[targets]

arr = indices_to_one_hot(data, nb_classes)
print(arr)
print()


# Drop column 0 (index 0 never occurs in the data) so that the value 1 maps to the first column
ans = arr[:, 1:]

print(ans)

Output:

5

[[0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 1. 0. 0.]]

[[1. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 1.]
 [0. 0. 1. 0. 0.]]
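
To stay with scikit-learn itself, one possible approach is to pass an explicit `categories` list to `OneHotEncoder`, which fixes the number of output columns regardless of which values appear in the data. The sketch below assumes the wanted categories are the integers 1 to 5; that range is an assumption, since the question only shows the desired output. It calls `.toarray()` on the default sparse result rather than passing the dense-output keyword, whose name differs between scikit-learn versions.

```python
from sklearn.preprocessing import OneHotEncoder

data = [[1], [3], [5]]

# Assumption: the full set of categories is 1..5 (one inner list per input feature)
categories = [list(range(1, 6))]

encoder = OneHotEncoder(categories=categories)

# fit_transform returns a sparse matrix by default; toarray() converts it to a dense array
encoded = encoder.fit_transform(data).toarray()
print(encoded)
```

With these inputs the encoded array has shape (3, 5), and the values 1, 3 and 5 land in the first, third and fifth columns.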
