Yes, there is such a thing as a numpy scalar:
https://numpy.org/doc/stable/reference/arrays.scalars.html
A numpy array can have 0, 1, 2 or more dimensions. There's a lot of overlap between:
np.int64(3) # numpy int
np.array(3) # 0d array
np.array([3]) # 1d array with 1 element
np.int(3) # python int (np.int was only an alias for the builtin int; removed in NumPy 1.24)
3 # python int
The first 3 have array attributes like `shape` and `dtype`. The differences between the first two are minor.
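To make the overlap concrete, here is a small sketch that inspects the objects from the list above. The names `candidates` and `summary` are just illustrative; `getattr` with a default is used because a plain Python int has neither `shape` nor `ndim`.

```python
import numpy as np

# Compare the objects from the list above.
candidates = {
    "np.int64(3)": np.int64(3),      # numpy scalar
    "np.array(3)": np.array(3),      # 0d array
    "np.array([3])": np.array([3]),  # 1d array with 1 element
    "3": 3,                          # plain Python int
}

summary = {}
for label, obj in candidates.items():
    # Python ints have no shape/ndim, so fall back to None.
    summary[label] = (getattr(obj, "shape", None), getattr(obj, "ndim", None))
    print(label, type(obj).__name__, summary[label])
```

The numpy scalar and the 0d array both report an empty shape tuple, which is why they are so easy to confuse.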
In a function like `where`, numpy first converts the arguments to arrays, e.g. `np.array(5)`, `np.array(1)`:
In [161]: np.where(5, 1, 0)
Out[161]: array(1)
In [162]: _.shape
Out[162]: ()
In [163]: np.array(5)
Out[163]: array(5)
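A quick way to check this yourself is sketched below. `np.asarray` is used only to illustrate the conversion; whether `np.where` calls it internally is an assumption on my part, not something I have verified in the source.

```python
import numpy as np

result = np.where(5, 1, 0)   # all three arguments are plain Python ints

# The scalars were treated like 0d arrays, so the result is a 0d array too.
is_array = isinstance(result, np.ndarray)
print(is_array, result.shape, result.ndim)

# np.asarray performs the same kind of conversion explicitly.
as_array = np.asarray(5)
print(type(as_array).__name__, as_array.shape)
```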
But math like addition with a scalar may return a numpy scalar:
In [164]: np.array(5) + 1
Out[164]: 6
In [165]: type(_)
Out[165]: numpy.int64
In [166]: np.array(5) * 1
Out[166]: 5
In [167]: type(_)
Out[167]: numpy.int64
Indexing an array can also produce such a scalar:
In [182]: np.arange(3)[1]
Out[182]: 1
In [183]: type(_)
Out[183]: numpy.int64
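If you want to test for this in code rather than by reading the repr, `isinstance` against `np.ndarray` and `np.generic` (the base class of the numpy scalar types) tells the two apart. A minimal sketch, with variable names of my own choosing:

```python
import numpy as np

zero_d = np.array(5)

added = zero_d + 1            # arithmetic on a 0d array
indexed = np.arange(3)[1]     # integer indexing of a 1d array

for value in (zero_d, added, indexed):
    # np.generic is the base class of all numpy scalar types.
    print(type(value).__name__,
          isinstance(value, np.ndarray),
          isinstance(value, np.generic))
```

The exact integer type printed can depend on platform and numpy version, which is why the check is done against the base classes.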
`where` 'broadcasts' the arguments, so the resulting shape is, in the broadcasted sense, the "largest":
In [168]: np.where(np.arange(5),1,0)
Out[168]: array([0, 1, 1, 1, 1])
In [173]: np.where(5, [1],0)
Out[173]: array([1])
In [174]: np.where(0, [1],0)
Out[174]: array([0])
In [175]: np.where([[0]], [1],0)
Out[175]: array([[0]])
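The broadcast shape can also be computed without calling `where` at all. The sketch below uses `np.broadcast` for that; the names `cond`, `x` and `y` are just labels for the three arguments.

```python
import numpy as np

cond = [[0]]   # shape (1, 1)
x = [1]        # shape (1,)
y = 0          # shape ()

# np.broadcast reports the shape the three arguments combine to.
expected_shape = np.broadcast(np.asarray(cond), np.asarray(x), np.asarray(y)).shape
result = np.where(cond, x, y)
print(expected_shape, result.shape)
```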
If `spyder` has tab completion like `ipython`, you can get a list of all the methods attached to an object. The methods for `np.int64(3)` will look a lot like those for `np.array(3)`, but very different from those for `3`.
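Without tab completion, `dir()` gives roughly the same information. The helper `public_names` below is hypothetical, written just for this comparison:

```python
import numpy as np

def public_names(obj):
    """Return the public attribute and method names of obj."""
    return {name for name in dir(obj) if not name.startswith("_")}

scalar_names = public_names(np.int64(3))
array_names = public_names(np.array(3))
int_names = public_names(3)

# Names shared by the numpy scalar and the 0d array, but absent from int.
shared_numpy_only = (scalar_names & array_names) - int_names
print(sorted(shared_numpy_only)[:10])
print(len(scalar_names & array_names), len(scalar_names & int_names))
```

The exact counts vary with the numpy and Python versions, so treat the printed numbers as indicative only.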
There are also arrays with 0 elements, which happens when one of the dimensions is 0:
In [184]: np.arange(0)
Out[184]: array([], dtype=int64)
In [185]: _.shape
Out[185]: (0,)
In [186]: np.arange(1)
Out[186]: array([0])
In [187]: _.shape
Out[187]: (1,)
Obviously a 0d array can't have 0 elements: it has no dimensions at all, so none of them can be 0. It always holds exactly one element.
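The relation between `shape` and the number of elements can be checked directly, since `size` is the product of the entries in `shape`. A short sketch, using `np.zeros` purely as a convenient way to build arrays of a given shape:

```python
import numpy as np

empty_1d = np.zeros((0,))     # one dimension, of length 0
empty_2d = np.zeros((3, 0))   # two dimensions, one of length 0
zero_d = np.array(3)          # no dimensions at all

for arr in (empty_1d, empty_2d, zero_d):
    # size is the product of the entries in shape; an empty product is 1.
    print(arr.shape, arr.ndim, arr.size)
```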
Indexing a 0d array (or numpy scalar) is a bit trickier (but still logical):
In [189]: np.array(3)[()] # 0 element indexing tuple
Out[189]: 3
In [190]: type(_)
Out[190]: numpy.int64
In [191]: np.array(3).item()
Out[191]: 3
In [192]: type(_)
Out[192]: int
In [193]: np.array(3)[()][()]
Out[193]: 3
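Put together as a script, the three ways of unpacking a 0d array look like this. The last part, showing that an integer index fails, is my own addition to the session above:

```python
import numpy as np

zero_d = np.array(3)

as_numpy_scalar = zero_d[()]       # empty tuple: one index per dimension, i.e. none
as_python_int = zero_d.item()      # converts to a plain Python object
again = as_numpy_scalar[()]        # numpy scalars accept the empty tuple too

print(type(as_numpy_scalar).__name__, type(as_python_int).__name__, type(again).__name__)

# An integer index fails, since there is no dimension to index into.
index_error = None
try:
    zero_d[0]
except IndexError as exc:
    index_error = str(exc)
    print("IndexError:", index_error)
```

Use `.item()` when the value has to leave numpy entirely, for example before handing it to code that checks for a builtin `int`.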
The return of addition might be explained by `__array_priority__`.
`dtype` is not preserved in operations like this. Add a float to an int, and get a float:
In [203]: type(np.array(3, np.int16) + 3)
Out[203]: numpy.int64
In [204]: type(np.array(3, np.int16) + 3.0)
Out[204]: numpy.float64
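How plain Python numbers take part in this promotion has changed between numpy versions, so the outputs above may not reproduce exactly on every install. The sketch below sidesteps that by giving both operands an explicit dtype; `np.result_type` reports the promoted dtype without doing the arithmetic.

```python
import numpy as np

small_int = np.array([3], dtype=np.int16)
big_int = np.array([3], dtype=np.int64)
floats = np.array([3.0], dtype=np.float64)

# The result dtype must be able to hold values from both operands.
print((small_int + small_int).dtype)   # same dtype on both sides
print((small_int + big_int).dtype)     # mixed integer widths
print((small_int + floats).dtype)      # integer plus float

# np.result_type reports the promoted dtype without doing the arithmetic.
promoted = np.result_type(np.int16, np.float64)
print(promoted)
```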
ufunc casting
`+` is actually a call to the `np.add` ufunc. A `ufunc` takes keywords like `casting` that give finer control over what the results can be:
In [214]: np.add(np.array(3, np.int16), 3)
Out[214]: 6
In [215]: np.add(np.array(3, np.int16), 3, casting='no')
Traceback (most recent call last):
  File "<ipython-input-215-631cb3a3b303>", line 1, in <module>
    np.add(np.array(3, np.int16), 3, casting='no')
UFuncTypeError: Cannot cast ufunc 'add' input 0 from dtype('int16') to dtype('int64') with casting rule 'no'
In [217]: np.add(np.array(3, np.int16), 3, casting='safe')
Out[217]: 6
https://numpy.org/doc/stable/reference/ufuncs.html#output-type-determination
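`np.can_cast` lets you ask the same question the casting rule asks, without triggering an error. The sketch below pairs it with an `np.add` call on two explicitly typed arrays (my own variation on the session above, chosen so that the result does not depend on how a bare Python `3` is promoted).

```python
import numpy as np

# np.can_cast answers the question each casting rule asks.
print(np.can_cast(np.int16, np.int64, casting="no"))
print(np.can_cast(np.int16, np.int64, casting="safe"))
print(np.can_cast(np.int64, np.int16, casting="safe"))
print(np.can_cast(np.int64, np.int16, casting="same_kind"))

a = np.array([3], dtype=np.int16)
b = np.array([3], dtype=np.int64)

# Adding int16 to int64 needs an int16 -> int64 conversion, which 'no' forbids.
try:
    np.add(a, b, casting="no")
    strict_add_failed = False
except TypeError as exc:   # UFuncTypeError is a subclass of TypeError
    strict_add_failed = True
    print("rejected:", exc)

relaxed = np.add(a, b, casting="safe")
print(relaxed, relaxed.dtype)
```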
I was speculating that `__array_priority__` played a role in returning a `np.int64`, but the priorities go the wrong way:
In [194]: np.array(3).__array_priority__
Out[194]: 0.0
In [195]: np.int64(3).__array_priority__
Out[195]: -1000000.0
In [196]: np.array(3) + np.int64(3)
Out[196]: 6
In [197]: type(_)
Out[197]: numpy.int64
I don't know where it's documented, but often an operation will return a numpy scalar rather than a 0d array.
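The priority comparison and the resulting type can be checked in a few lines. The final `np.asarray` step is my own suggestion for the case where you need a 0d array back; it is not something the session above relies on.

```python
import numpy as np

zero_d = np.array(3)
scalar = np.int64(3)

# The 0d array has the higher priority, yet the sum is still a numpy scalar.
array_wins = zero_d.__array_priority__ > scalar.__array_priority__
total = zero_d + scalar
print(array_wins, type(total).__name__)

# np.asarray wraps the result back into a 0d array when one is needed.
rewrapped = np.asarray(total)
print(type(rewrapped).__name__, rewrapped.shape)
```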
I just remembered/discovered one difference between a 0d array and a numpy scalar: mutability.
In [222]: x
Out[222]: array(3)
In [223]: x[...] = 4
In [224]: x
Out[224]: array(4)
In [225]: x = np.int64(3)
In [226]: x[...] = 4
Traceback (most recent call last):
  File "<ipython-input-226-f7dca2cc5565>", line 1, in <module>
    x[...] = 4
TypeError: 'numpy.int64' object does not support item assignment
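The same comparison as a self-contained script, in case you want to run it outside an interactive session. The flag `scalar_is_mutable` is just a name I picked to record the outcome:

```python
import numpy as np

zero_d = np.array(3)
zero_d[...] = 4           # in-place assignment works on the 0d array
print(zero_d)

scalar = np.int64(3)
try:
    scalar[...] = 4       # numpy scalars do not support item assignment
    scalar_is_mutable = True
except TypeError as exc:
    scalar_is_mutable = False
    print("TypeError:", exc)

# "Changing" a numpy scalar means rebinding the name to a new object.
scalar = scalar + 1
print(scalar)
```

So if you need a single value that other code can update in place, a 0d array is the one to pass around.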
Python classes can share a lot of behaviors/methods, but differ in others.