Why use arrays of shape (x,) rather than (x,1)?

Question

I've recently encountered a couple of bugs caused by numpy arrays having the shape (x,). They can easily be fixed with the snippet below:

import numpy as np

a = np.array([1, 2, 3, 4])  # this form produced a bug
a.shape
>>> (4,)
a.shape = [4, 1]  # but this change fixed it

But it does make me wonder, why is (x,) the default shape for 1D arrays?

1D arrays are supposed to be 1-dimensional, but `(x, 1)` or `(1, x)` are 2D arrays. They have two dimensions, one of them set to 1. Can you be more specific in what type of bug you encounter? I suspect these bugs might actually be features :) — kazemakase, Apr 03 '17 at 12:22

kasravnd · Accepted Answer · 2017-04-03T12:29:19.380

3

Each item in the shape tuple denotes an axis. When the tuple has a single item, your array is 1-dimensional (one axis); with two items it is a 2D array, and so on. When you do a.shape = [4,1] you're simply converting your 1D array to 2D:

In [26]: a = np.array([1,2,3,4])
In [27]: a.shape = [4,1]

In [28]: a.shape        
Out[28]: (4, 1)

In [29]: a
Out[29]: 
array([[1],
       [2],
       [3],
       [4]])
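As a minimal sketch of the same point, the snippet below checks the number of axes with `ndim` and shows two other common ways to get a column without assigning to `shape` in place. The variable names are only illustrative and are not from the question.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
print(a.ndim, a.shape)        # one axis: 1 (4,)

# reshape returns a new array object and leaves `a` itself 1D
col = a.reshape(4, 1)         # a.reshape(-1, 1) infers the 4
print(col.ndim, col.shape)    # two axes: 2 (4, 1)

# indexing with np.newaxis adds a length-1 axis at that position
col2 = a[:, np.newaxis]
print(col2.shape)             # (4, 1)

# ravel() flattens back to a single axis
flat = col.ravel()
print(flat.shape)             # (4,)
```

Unlike `a.shape = [4, 1]`, these leave the original array's shape unchanged, which may make it easier to see where the extra axis is introduced.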

edited Apr 03 '17 at 12:29

answered Apr 03 '17 at 12:24


The 'wordy' display of the (4,1) array is a good reason not to make it the default for 'vectors'. – hpaulj Apr 03 '17 at 15:51

score 1 · Answer 2 · answered Apr 03 '17 at 12:40

I suspect this question is coming up because you have come from a Matlab background in which everything is treated as a matrix. In Matlab all 1D data sets are treated either as a row or a column vector and the indexing is short-circuited so that specifying a single index treats both as 1D lists.

NumPy does not deal with matrices per se, but with n-dimensional arrays, which behave much like nested lists. Lists of lists have a similar interpretation to the matrices of Matlab, but there are key differences. For instance, NumPy makes no assumptions about which element you mean if you give it only a single index: the index always applies to the first axis, regardless of the depth of the nested lists.

import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr)
>> [1 2 3 4]
print(arr[0])
>> 1

arr.shape = [4, 1]
print(arr)
>> [[1]
>>  [2]
>>  [3]
>>  [4]]
print(arr[0])
>> [1]

arr.shape = [1, 4]
print(arr)
>> [[1 2 3 4]]
print(arr[0])
>> [1 2 3 4]
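To make the indexing rule concrete, here is a small sketch (with illustrative variable names) comparing what a single index returns for each shape, and how to reach the scalar in the 2D cases by supplying one index per axis.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
first_1d = arr[0]                # scalar: one index for one axis

col = arr.reshape(4, 1)
first_row_of_col = col[0]        # 1D array of shape (1,): the first row
first_elem_of_col = col[0, 0]    # scalar: one index per axis

row = arr.reshape(1, 4)
first_row_of_row = row[0]        # 1D array of shape (4,): the only row

print(first_1d, first_row_of_col, first_elem_of_col, first_row_of_row)
```

In every case the single index selects along the first axis; the result simply has one axis fewer than the array being indexed.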

score 0 · Answer 3 · answered Apr 03 '17 at 13:08

Quoting from the documentation:

shape : tuple of ints
The elements of the shape tuple give the lengths of the corresponding array dimensions.

So a shape of (4,) means that the array's first (and only) dimension has 4 elements, which matches your example.

By contrast, a shape of (4, 1), as you suggest, means that the first dimension (axis=0, in NumPy lingo) has 4 elements and the second dimension (axis=1) has 1 element, which is not the case for 1D arrays.
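NumPy numbers axes from 0, so entry `i` of the shape tuple is the length of axis `i`. The short sketch below (names are illustrative) checks this on a (4, 1) array by reducing along each axis in turn.

```python
import numpy as np

col = np.arange(1, 5).reshape(4, 1)   # values 1..4 as a column

print(col.shape[0])   # length of axis 0 (first dimension): 4
print(col.shape[1])   # length of axis 1 (second dimension): 1

# Summing along an axis collapses that axis
over_axis0 = col.sum(axis=0)   # collapses the 4 rows    -> shape (1,)
over_axis1 = col.sum(axis=1)   # collapses the 1 column  -> shape (4,)
print(over_axis0.shape, over_axis1.shape)
```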
