
Could anyone help with an explanation of what axis is in TensorFlow's one_hot function?

According to the documentation:

axis: The axis to fill (default: -1, a new inner-most axis)

The closest I came to an answer on SO was an explanation relevant to Pandas:

Not sure if the context is just as applicable.

– user919426

3 Answers


Here's an example:

x = tf.constant([0, 1, 2])

... is the input tensor and N=4 (each index is transformed into a 4D vector).

axis=-1

Computing one_hot_1 = tf.one_hot(x, 4).eval() yields a (3, 4) tensor:

[[ 1.  0.  0.  0.]
 [ 0.  1.  0.  0.]
 [ 0.  0.  1.  0.]]

... where the last dimension is one-hot encoded (clearly visible). This corresponds to the default axis=-1, i.e. the last one.

axis=0

Now, computing one_hot_2 = tf.one_hot(x, 4, axis=0).eval() yields a (4, 3) tensor, which is not immediately recognizable as one-hot encoded:

[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]
 [ 0.  0.  0.]]

This is because the one-hot encoding is done along the 0-axis and one has to transpose the matrix to see the previous encoding. The situation becomes more complicated when the input is higher dimensional, but the idea is the same: the difference is in the placement of the extra dimension used for one-hot encoding.
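
To make the placement concrete, here is a minimal sketch (assuming a TF 2.x eager setup, so no .eval() or session is needed) that checks the transpose relationship and the shapes for a higher-dimensional input, the (2, 3) index matrix discussed in the comments below:

import tensorflow as tf

x = tf.constant([0, 1, 2])

one_hot_1 = tf.one_hot(x, 4)          # default axis=-1, shape (3, 4)
one_hot_2 = tf.one_hot(x, 4, axis=0)  # axis=0, shape (4, 3)

# For 1-D input, the axis=0 result is just the transpose of the axis=-1 result.
print(tf.reduce_all(tf.equal(one_hot_2, tf.transpose(one_hot_1))).numpy())  # True

# Higher-dimensional input: a (2, 3) index matrix, N=4 classes.
x2 = tf.constant([[1, 1, 2],
                  [0, 1, 2]])
print(tf.one_hot(x2, 4).shape)          # (2, 3, 4): one-hot axis appended last
print(tf.one_hot(x2, 4, axis=1).shape)  # (2, 4, 3): one-hot axis in the middle
print(tf.one_hot(x2, 4, axis=0).shape)  # (4, 2, 3): one-hot axis placed first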

– Maxim
  • Thank-you for explaining, though it's very dense for me to grasp, so will ask one at a time. How do you figure that `x = tf.constant([[1, 1, 2], [0, 1, 2]])` yields a `4D` vector?...is it that it is an array of two 2D arrays as elements? – user919426 Jan 03 '18 at 18:49
  • I think I need to read. I have too many questions. Would it be possible to simplify your answer with a 2D array?...failure of which, I think, I would have to revert back to first principles...with no doubt at all that your answer could be correct – user919426 Jan 03 '18 at 18:53
  • It's ok. I tried to choose different dimensions for everything to avoid misunderstanding. Input `x` is `(2, 3)`. Each index is encoded as a 4D vector, because we set `N=4` (think the number of classes). That's why the result is `(2, 3, 4)` or `(4, 2, 3)`, depending on the placement. – Maxim Jan 03 '18 at 18:54
  • Actually, yes, you can see the difference with `x = tf.constant([0, 1, 2])` as well: the result is either `(3, 4)` or `(4, 3)` – Maxim Jan 03 '18 at 18:55
  • Haha. I am going back to read from first principles. I absolutely do not get it. Nothing to do with your answer, but my lack of depth on the subject. Thank-you, either way. Will mark it as an answer trusting your knowledge on the subject...then I shall be back to re-read :) – user919426 Jan 03 '18 at 19:19
  • No problem. Take your time, I've updated the answer with a simpler example – Maxim Jan 03 '18 at 19:21

For me, axis translates to "where do you add the additional numbers to increase the dimension". At least, that is how I interpret it, and it serves as a mnemonic.

For instance, say you have [1,2,3,0,2,1], which has shape (6,), i.e. it is a one-dimensional array. one_hot replaces each entry with a vector of zeros that has a 1 at the position given by that entry. For this to happen, the resulting array must have one more dimension than the original array, and axis tells the function where to add it; this new dimension holds the classes, while the original dimension still identifies the examples.


axis=1

You add a second dimension and the first dimension is kept. This would result in a (6,4) array. So for the resulting array, you use the first dimension (0) to know which example you see and the second dimension (1, the new one) to know if that class is active. newArr[0][1]=1 means example 0, class 1, which in this case means example 0 is of class 1.
   0   1   2   3  <- class

[[ 0.  1.  0.  0.]   <- example 0
 [ 0.  0.  1.  0.]   <- example 1
 [ 0.  0.  0.  1.]   <- example 2
 [ 1.  0.  0.  0.]   <- example 3
 [ 0.  0.  1.  0.]   <- example 4
 [ 0.  1.  0.  0.]]  <- example 5

axis=0

You add a first dimension and the existing dimension is shifted. This would result in a (4,6) array. So for the resulting array, you use the first dimension (0, the new dimension) to know if that class is active and the second dimension (1) to know which example you see. newArr[0][1]=0 means class 0, example 1, which in this case means example 1 is not of class 0.
   0   1   2   3   4   5  <- example

[[ 0.  0.  0.  1.  0.  0.]   <- class 0
 [ 1.  0.  0.  0.  0.  1.]   <- class 1
 [ 0.  1.  0.  0.  1.  0.]   <- class 2
 [ 0.  0.  1.  0.  0.  0.]]  <- class 3
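
Here is a short sketch reproducing the two layouts above (assuming a TF 2.x eager setup; the variable names are just illustrative). Because the input is one-dimensional, the axis=0 result is simply the transpose of the axis=1 result:

import tensorflow as tf

labels = tf.constant([1, 2, 3, 0, 2, 1])

by_example = tf.one_hot(labels, 4, axis=1)  # shape (6, 4): one row per example
by_class = tf.one_hot(labels, 4, axis=0)    # shape (4, 6): one row per class

print(by_example.numpy())
print(by_class.numpy())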
– loco.loop

For me, I understand it in the following way. (Note: the indices referred to in the documentation are just the class labels; they can be a scalar, a vector, or a matrix.) If your indices is just a scalar, an axis is not required. If, however, it is a vector, you have a choice in the orientation of features and classes. In the image from the original answer, the one-hot matrix has one row per depth (class) and one column per feature (label), so in that case axis has the value 0 (depth x features). Similarly, if you want features x depth, then axis will have the value -1.

Likewise, if the indices is a matrix, then you have the following choice of orientations


(batch means the rows in your indices; see the sketch after this list):

batch x features x depth if axis == -1
batch x depth x features if axis == 1
depth x batch x features if axis == 0
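
As a quick check of these three layouts, here is a hedged sketch (TF 2.x eager execution assumed) using an example index matrix of shape (batch=2, features=3) and depth=4:

import tensorflow as tf

# Example indices: shape (batch=2, features=3); depth = 4 classes.
indices = tf.constant([[0, 1, 2],
                       [3, 1, 0]])

print(tf.one_hot(indices, 4, axis=-1).shape)  # (2, 3, 4)  batch x features x depth
print(tf.one_hot(indices, 4, axis=1).shape)   # (2, 4, 3)  batch x depth x features
print(tf.one_hot(indices, 4, axis=0).shape)   # (4, 2, 3)  depth x batch x features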
– Thanator