`numpy.savetxt` saves my data as one continuous line. How can I preserve the shape of the array?

Example:

    import numpy as np

    arr = np.array([[1, 'some', 3, 4],
                    [5, 'wo,dy', 7, 8],
                    [9, 'word', 11, 12]])

    np.savetxt('example.csv', arr, fmt='%s', delimiter='\t')

(Why tab-delimited? Some of the real text has commas in it.)

example.csv looks like this:

    1   some    -3.0    4   5   wo,dy   -7.0    8   9   word    -11.0   12

How can I get it to look like this?

    1   some    -3.0    4   
    5   wo,dy   -7.0    8   
    9   word    -11.0   12

(So that I can read it into Excel or SQL.)

Harry de winton
  • specify `delimiter=','`. – shivsn Jul 11 '17 at 12:14
  • Makes no difference. As I mentioned, the delimiter is only that way because my actual text has commas and other annoying stuff in it; importing it into Excel would go wrong if the delimiter was ','. – Harry de winton Jul 11 '17 at 12:17
  • OK, got it, but you should include that in the sample. – shivsn Jul 11 '17 at 12:17
  • Yeah, I guess my question is: are we sure this will work for tabs as well, even though the file itself looks like one big line? I have to run this for ~three days and I don't want to come back to useless data. – Harry de winton Jul 11 '17 at 12:21
  • I will try and let you know but the question has been marked as duplicate. – shivsn Jul 11 '17 at 12:33
  • I don't think the 'duplicate' is relevant. `savetxt` should handle a 2d array without problem. Tab delimiting is fine, though for testing this I'd prefer ';'. It shouldn't be producing one line. The mix of strings and numbers is a bit of a nuisance, but the '%s' formatting is a good start. – hpaulj Jul 11 '17 at 12:44
  • @hpaulj How can the duplicate tag be removed? The question is not a duplicate. – shivsn Jul 11 '17 at 12:46
  • Someone with enough rep can reopen it. I can't do it from this tablet. – hpaulj Jul 11 '17 at 12:47
  • @hpaulj OK. Do you have a solution in mind? I couldn't figure it out. – shivsn Jul 11 '17 at 12:50
  • The example doesn't match the output. One has int '3', the other '-3.0'. An important issue is the array `dtype`. String, object, or structured? – hpaulj Jul 11 '17 at 13:05

1 Answer

If `arr` really is as shown, `savetxt` shouldn't have problems:

    In [647]: arr = np.array([[1,'some',3,4],
         ...:                 [5,'wo,dy',7,8],
         ...:                 [9,'word',11,12]])

    In [648]: arr
    Out[648]: 
    array([['1', 'some', '3', '4'],
           ['5', 'wo,dy', '7', '8'],
           ['9', 'word', '11', '12']], 
          dtype='<U11')

    In [650]: np.savetxt('text.txt', arr, fmt='%13s', delimiter=';')

    In [651]: cat text.txt
                1;         some;            3;            4
                5;        wo,dy;            7;            8
                9;         word;           11;           12

`'\t'` works fine too.

I don't see how you can get one line unless you either flatten the array first or specify the `newline` parameter. Also, it wouldn't change the `'3'` to `'-3.0'`.
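To make the `newline` point concrete, here is a minimal sketch (the file names and the temporary directory are made up for illustration) that writes the same array twice: once with the defaults and once with `newline=' '`. Only the second call should collapse the rows into a single line.

```python
import os
import tempfile

import numpy as np

# Mixed ints and strings are promoted to a string array, as in the question.
arr = np.array([[1, 'some', 3, 4],
                [5, 'wo,dy', 7, 8],
                [9, 'word', 11, 12]])

# Hypothetical scratch files, used only for this demonstration.
tmpdir = tempfile.mkdtemp()
by_rows = os.path.join(tmpdir, 'by_rows.txt')
one_line = os.path.join(tmpdir, 'one_line.txt')

# Default newline: each row of the 2d array ends with a line break.
np.savetxt(by_rows, arr, fmt='%s', delimiter='\t')

# Overriding newline with a space is one way to end up with a single line.
np.savetxt(one_line, arr, fmt='%s', delimiter='\t', newline=' ')

with open(by_rows) as fh:
    rows_text = fh.read()
with open(one_line) as fh:
    one_line_text = fh.read()

print(len(rows_text.splitlines()))      # number of lines with the default
print(len(one_line_text.splitlines()))  # number of lines with newline=' '
```

If the first count matches the number of rows in the array, the file has the right shape regardless of how a particular editor displays it.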

With a mix of strings and numbers, the array could also be object dtype or structured. But as written, it is an array of string dtype, so `'%s'` is the correct formatter.
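For comparison, a rough sketch of the other two cases. The values (including the `-3.0` floats) are copied from the output in the question, and the field names in the structured dtype are invented for illustration. Writing to an `io.StringIO` buffer just avoids touching the disk.

```python
import io

import numpy as np

# Object dtype: every element keeps its Python type, so '%s' prints floats as floats.
obj = np.array([[1, 'some', -3.0, 4],
                [5, 'wo,dy', -7.0, 8],
                [9, 'word', -11.0, 12]], dtype=object)
buf = io.StringIO()
np.savetxt(buf, obj, fmt='%s', delimiter='\t')
obj_text = buf.getvalue()

# Structured dtype: one record per row. Field names here are hypothetical.
rec = np.array([(1, 'some', -3.0, 4),
                (5, 'wo,dy', -7.0, 8),
                (9, 'word', -11.0, 12)],
               dtype=[('id', 'i8'), ('label', 'U10'),
                      ('value', 'f8'), ('count', 'i8')])
buf = io.StringIO()
# A single format string with one specifier per field; the tab separators
# are written into the format itself, so delimiter is not needed.
np.savetxt(buf, rec, fmt='%d\t%s\t%.1f\t%d')
rec_text = buf.getvalue()

print(obj_text)
print(rec_text)
```

Either form keeps the numbers as numbers until they are formatted, which is one possible way to get `-3.0` rather than `3` in the file.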

hpaulj
  • I freaked because I opened the document in notepad and everything was squished into a single line. I foolishly assumed that was how Excel/SQL would see it and my data would be unusable – Harry de winton Jul 12 '17 at 08:37
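
A possible explanation for the Notepad display, offered as an assumption rather than something confirmed in this thread: older versions of Windows Notepad did not treat a bare `\n` as a line break, so a file with one `\n` per row can look like a single line there while Excel and SQL loaders read it correctly. The sketch below checks the raw bytes and, if Notepad readability matters, writes `\r\n` endings explicitly. The file name is made up, and the file is opened with `newline=''` so Python does not translate the line endings a second time.

```python
import os
import tempfile

import numpy as np

arr = np.array([[1, 'some', 3, 4],
                [5, 'wo,dy', 7, 8],
                [9, 'word', 11, 12]])

# Hypothetical output path for the demonstration.
path = os.path.join(tempfile.mkdtemp(), 'example.tsv')

# newline='' on open() disables Python's own newline translation, so the
# '\r\n' passed to savetxt is written to disk exactly as given.
with open(path, 'w', newline='') as fh:
    np.savetxt(fh, arr, fmt='%s', delimiter='\t', newline='\r\n')

# Inspect the raw bytes instead of trusting how an editor renders them.
with open(path, 'rb') as fh:
    raw = fh.read()

print(raw.count(b'\r\n'))  # number of Windows-style line endings written
print(raw)
```

Counting line endings in the raw bytes is a quick sanity check before starting a long run.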