
I have a numpy array with numeric data of the form:

example = numpy.array([[[i for i in range(0, 5)],[0 for j in range(0, 5)]] for k in range(0, 10)])

So it's an array of 10 groups, where each group consists of 2 lists of equal length containing only numbers. Running the following save code gives me the error below:

numpy.savetxt('exampleData.csv', example, delimiter=',')
TypeError: Mismatch between array dtype ('int32') and format specifier ('%.18e %.18e')

I'm guessing this could be fixed with something in the fmt='xyz' argument, but the documentation isn't particularly clear. Any help would be appreciated.

(In my actual data, the i and j lists are lists of long floats, e.g. 0.0047322940571.)

Pingk
  • Try `example = numpy.array([[[float(i) for i in range(0, 5)],[0 for j in range(0, 5)]] for k in range(0, 10)])` and see if the error persists. Alternatively, try a format like `fmt='%04d'` in the `savetxt` call. – vmg Apr 11 '16 at 15:30
  • @vmg In my actual code, the data for i and j are both floats, I think the error stems from the fact that it's not expecting a third value in k. – Pingk Apr 11 '16 at 15:39
  • What's the shape of the array? `savetxt` only works with 2d arrays. What's its `dtype`? – hpaulj Apr 11 '16 at 15:51
  • @hpaulj Ahh, that'd explain it, I guess it would classify as a 3D array... What should I use instead? – Pingk Apr 11 '16 at 15:53
  • What kind of layout do you expect? Normally CSV is just many rows of matching columns. Readers typically have problems with blank lines or rows with different numbers of columns. Columns may differ in the kind of content - strings, ints, floats, but they should be consistent. – hpaulj Apr 11 '16 at 16:25
  • Opening a 2D CSV file produces a two-line document, I would have thought that a 3D array would produce 2*k lines... – Pingk Apr 11 '16 at 17:07

2 Answers


Your example is a 3d array:

In [82]: example=np.array([[[i for i in range(0, 5)],[0 for j in range(0, 5)]] for  k in range(0, 3)])  # chg 10 to 3 for display

In [83]: example.shape
Out[83]: (3L, 2L, 5L)

In [84]: example
Out[84]: 
array([[[0, 1, 2, 3, 4],
        [0, 0, 0, 0, 0]],

       [[0, 1, 2, 3, 4],
        [0, 0, 0, 0, 0]],

       [[0, 1, 2, 3, 4],
        [0, 0, 0, 0, 0]]])

Trying to save the whole thing results in an error (the message differs from yours because of a different version):

In [87]: np.savetxt('test.csv',example, delimiter=',')
....
TypeError: float argument required, not numpy.ndarray 

But saving one 'row' (a 2d slice) works:

In [88]: np.savetxt('test.csv',example[1,...], delimiter=',')

Saving with an integer format gives cleaner output:

In [94]: np.savetxt('test.csv',example[1,...], delimiter=',',fmt='%d')

In [95]: with open('test.csv') as f: print(f.read())
0,1,2,3,4
0,0,0,0,0

So how do you want the 3d array to be saved? Keep in mind how you will use it/read it. Multiple files? Multiple blocks within one file?

https://stackoverflow.com/a/3685339/901925 is a 6 yr old SO answer on how to save a 3d array. The simple answer is to open a file, and perform multiple savetxt for slices of the array. This saves the data in blocks. But loading those blocks is another SO question (which has come up before).

In [100]: with open('test.csv','w') as f:
     ...:     for row in example:
     ...:         np.savetxt(f,row,delimiter=',',fmt='%d',footer='====')
     ...:         

In [101]: with open('test.csv') as f: print(f.read())
0,1,2,3,4
0,0,0,0,0
# ====
0,1,2,3,4
0,0,0,0,0
# ====
0,1,2,3,4
0,0,0,0,0
# ====
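For reading such a file back, here is one possible sketch (not part of the linked answer). It relies on `np.loadtxt` skipping lines that start with `#` by default, so the footer lines are ignored and the blocks come back stacked as a single 2d array. The file name `test_blocks.csv` and the assumption that the block shape `(2, 5)` is known in advance are mine.

```python
import numpy as np

example = np.array([[list(range(5)), [0] * 5] for _ in range(3)])

# Write each 2d slice as its own block, followed by a '# ====' footer line.
with open('test_blocks.csv', 'w') as f:
    for block in example:
        np.savetxt(f, block, delimiter=',', fmt='%d', footer='====')

# loadtxt treats '#' lines as comments, so the footers are skipped and
# all rows come back as one (6, 5) array.
flat = np.loadtxt('test_blocks.csv', delimiter=',', dtype=int)

# Restore the 3d layout; the block shape (2, 5) has to be known beforehand.
restored = flat.reshape(-1, 2, 5)
```

This only works when every block has the same shape; blocks of differing sizes would need to be split on the footer lines by hand.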

In response to your comment, this works:

example=np.ones((4,2,100))
np.savetxt('test.csv',example[1,...], delimiter=',',fmt='%.18e')

Another way to save a 3d array is to reshape it to 2d. You then reshape it back to 3d after loading, possibly using shape information that you stored in a comment line:

np.savetxt('test.csv',example.reshape(-1,example.shape[-1]), delimiter=',',fmt='%.18e')
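A round trip along those lines might look like the sketch below. Storing the shape in the `header` line and parsing it back by hand is just one convention I am assuming here, and `test_2d.csv` is a made-up file name.

```python
import numpy as np

example = np.ones((4, 2, 100))

# Collapse the leading axes so each file row is one innermost row, and
# record the original shape in a '#'-prefixed header line.
np.savetxt('test_2d.csv', example.reshape(-1, example.shape[-1]),
           delimiter=',', fmt='%.18e',
           header=','.join(str(n) for n in example.shape))

# Read the shape back from the header line, then load the data.
with open('test_2d.csv') as f:
    shape = tuple(int(n) for n in f.readline().lstrip('# ').split(','))

# loadtxt skips the '#' header, returning an (8, 100) array to reshape.
loaded = np.loadtxt('test_2d.csv', delimiter=',').reshape(shape)
```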
hpaulj
  • Thanks, but now I get a different error using your In[100] line and 'fmt=%.18e'. My actual array has the shape (4L, 2L, 100L), and I get the error TypeError: Mismatch between array dtype ('float64') and format specifier ('%.18e %.18e...[x100]) – Pingk Apr 11 '16 at 18:12
  • `savetxt` iterates on the 1st dimension of your array, and for each `row`, writes `format % tuple(row)`. `format` is derived from your `fmt` parameter and the `.shape[1]` of your input. – hpaulj Apr 11 '16 at 18:51
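The behaviour described in that last comment can be imitated in a few lines. This is a simplified model for illustration, not `savetxt`'s actual source; the names `row_format`, `block`, and `cube` are hypothetical.

```python
import numpy as np

block = np.array([[0.5, 1.5, 2.5],
                  [3.5, 4.5, 5.5]])
fmt, delimiter = '%.2e', ','

# One copy of fmt per column, joined by the delimiter.
row_format = delimiter.join([fmt] * block.shape[1])

# Each row of a 2d array is a 1d array of numbers, so formatting succeeds.
lines = [row_format % tuple(row) for row in block]

# With a 3d array, each "row" is itself 2d, so the tuple holds arrays
# rather than numbers and the % formatting raises TypeError.
cube = np.stack([block, block])
try:
    row_format % tuple(cube[0])
    failed = False
except TypeError:
    failed = True
```

That is why passing one 2d slice at a time works while passing the whole 3d array does not.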
0
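import numpy

example = numpy.array([[[i for i in range(0, 5)],[0 for j in range(0, 5)]] for k in range(0, 10)])
# savetxt only accepts 1d or 2d input, so append each 2d slice in turn.
# The 'with' block makes sure the file is closed afterwards.
with open('exampleData.csv', 'ab') as f:
    for block in example:
        numpy.savetxt(f, block, fmt='%i', delimiter=',')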
Chris
  • I tried fmt='%1.10E, %1.10E, &04d' for i, j and k respectively but I get a SyntaxError. The problem seems to be iterating through the k list? – Pingk Apr 11 '16 at 15:50