
Using Python 3.3 on Windows 8, I get the error TypeError: 'str' does not support the buffer interface when writing to a CSV file opened with the "wb" flag. When I use only the "w" flag, I get no errors, but every row is separated by a blank row!

Problem Writing

Code

test_file_object = csv.reader( open("./files/test.csv", 'r') )
next(test_file_object )

with open("./files/forest.csv", 'wb') as myfile:
    open_file_object = csv.writer( open("./files/forest.csv", 'wb') )
    i = 0
    for row in test_file_object:
        row.insert(0, output[i].astype(np.uint8))
        open_file_object.writerow(row)
        i += 1

Error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-121-8cbb94f602a8> in <module>()
      8     for row in test_file_object:
      9         row.insert(0, output[i].astype(np.uint8))
---> 10         open_file_object.writerow(row)
     11         i += 1

TypeError: 'str' does not support the buffer interface

Problem Reading

When reading, I can't seem to use the "rb" flag: it gives the error iterator should return strings, not bytes when I try to skip the first row (the header).

Code

csv_file_object = csv.reader(open('files/train.csv', 'rb'))
header = next(csv_file_object)
train_data = []
for row in csv_file_object:
    train_data.append(row)
train_data = np.array(train_data)

Error

Error                                     Traceback (most recent call last)
<ipython-input-10-8b13d1956432> in <module>()
      1 csv_file_object = csv.reader(open('files/train.csv', 'rb'))
----> 2 header = next(csv_file_object)
      3 train_data = []
      4 for row in csv_file_object:
      5     train_data.append(row)

Error: iterator should return strings, not bytes (did you open the file in text mode?)
Fred Foo
Nyxynyx
  • I think this is a duplicate of [this question](http://stackoverflow.com/questions/14693646/writing-to-csv-file-python), among others. How you open csv files is different in Python 3; see the opens for `reader` and `writer` [here](http://docs.python.org/dev/library/csv.html#module-contents). – DSM Apr 29 '13 at 12:12

2 Answers


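The 'wb' mode was fine in Python 2, but it is wrong in Python 3. In Python 3, the csv reader and writer work with strings, not bytes, so you have to open the file in text mode. However, the line endings must not be translated by the file object, because the csv module handles them itself; otherwise the writer's \r\n becomes \r\r\n on Windows, which shows up as a blank row after every record. To prevent that, pass newline='' when opening the file: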

with open("./files/forest.csv", newline='') as input_file, \
     open('something_else.csv', 'w', newline='') as output_file:
    writer = csv.writer(output_file)
    ...

If the file is not pure ASCII, you should also consider adding the encoding=... parameter.
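
As a fuller illustration of the advice above, here is a minimal sketch of the question's copy loop with both files opened in text mode and newline=''. The helper name copy_csv_with_prefix, the temporary file names, and the utf-8 encoding are assumptions made for the example, not anything taken from the question.

```python
import csv
import os
import tempfile


def copy_csv_with_prefix(src_path, dst_path, values, encoding="utf-8"):
    """Copy src_path to dst_path, skipping the header and prepending values[i] to row i."""
    # Both files are opened in text mode with newline='' so that the csv
    # module, not the file object, decides how line endings are written.
    with open(src_path, "r", newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        next(reader)  # skip the header row, as the question does
        for value, row in zip(values, reader):
            writer.writerow([value] + row)


# Small self-check in a temporary directory (hypothetical file names).
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "test.csv")
dst = os.path.join(tmp_dir, "forest.csv")
with open(src, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows([["id", "name"], ["1", "a"], ["2", "b"]])

copy_csv_with_prefix(src, dst, [0, 1])

# Read the result back as raw bytes to inspect the actual line endings.
with open(dst, "rb") as f:
    raw = f.read()
```

Reading the output back in binary mode is only there to show that each record ends in a single \r\n, with no doubled carriage return that would appear as a blank row.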

pepr
  • When creating a CSV file from scratch, all I needed to do was `open('filename.csv', 'w', newline='')`. Thanks! – Jace Browning Jan 29 '14 at 15:49
  • Thanks, I was having this exact issue with Python 3. In this other question, http://stackoverflow.com/questions/8746908/why-does-csv-file-contain-a-blank-line-in-between-each-data-line-when-outputting, they also suggest using the lineterminator='\n' parameter with the csv module's various writer functions. Which approach is better? I suppose it might be more pythonic to use the newline option of the built-in open() function, rather than controlling this behaviour via the imported csv module? – Davos May 02 '14 at 03:43

Hi, this may help you.

First problem:

change

with open("./files/forest.csv", 'wb') as myfile:
    open_file_object = csv.writer( open("./files/forest.csv", 'wb') )

to

with open("./files/forest.csv", 'w+') as myfile:
    open_file_object = csv.writer(myfile)  # reuse the open handle instead of opening the file twice

Second problem:

Do the same thing, except change 'rb' to 'r+'.

If that doesn't work, you can always just use this to strip out all the blank rows after it's created.

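# reader: a csv.reader over the file with blank rows; writer: a csv.writer for the cleaned copy
for row in reader:
    # keep only rows that contain at least one non-blank field
    if any(field.strip() for field in row):
        writer.writerow(row)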

Also, a little lesson: "rb" opens the file for reading in binary mode, so reads return bytes objects rather than strings. In Python 3 the csv module expects strings, which is why the reader complains when it is handed a file opened with "rb".
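
To make the bytes-versus-strings point concrete, here is a small sketch, assuming a throwaway file created in a temporary directory (the path and contents are made up for the example). It reads the same file once in binary mode and once in text mode.

```python
import csv
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "train.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows([["id", "label"], ["1", "x"]])

# Binary mode: lines come back as bytes, which csv.reader rejects.
with open(path, "rb") as f:
    first_line = f.readline()
    f.seek(0)
    try:
        next(csv.reader(f))
        binary_error = None
    except csv.Error as exc:
        binary_error = exc

# Text mode with newline='': lines come back as str, which csv.reader expects.
with open(path, "r", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    rows = list(reader)
```

Here binary_error ends up holding the same kind of csv.Error shown in the question, while the text-mode reader returns the header and the remaining rows as lists of strings.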

For future reference, here are the mode strings as described in the C fopen(3) manual page; Python's open() uses the same letters.

The argument mode points to a string beginning with one of the following sequences (Additional characters may follow these sequences.):

 ``r''   Open text file for reading.  The stream is positioned at the
         beginning of the file.

 ``r+''  Open for reading and writing.  The stream is positioned at the
         beginning of the file.

 ``w''   Truncate file to zero length or create text file for writing.
         The stream is positioned at the beginning of the file.

 ``w+''  Open for reading and writing.  The file is created if it does not
         exist, otherwise it is truncated.  The stream is positioned at
         the beginning of the file.

 ``a''   Open for writing.  The file is created if it does not exist.  The
         stream is positioned at the end of the file.  Subsequent writes
         to the file will always end up at the then current end of file,
         irrespective of any intervening fseek(3) or similar.

 ``a+''  Open for reading and writing.  The file is created if it does not
         exist.  The stream is positioned at the end of the file.
         Subsequent writes to the file will always end up at the then
         current end of file, irrespective of any intervening fseek(3) or
         similar.
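
The differences between these modes can be checked quickly from Python. This is only a sketch, assuming a scratch file in a temporary directory; the file name is made up.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical scratch file

with open(path, "w") as f:    # "w": create the file, or truncate it if it exists
    f.write("first\n")
with open(path, "a") as f:    # "a": keep the contents and write at the end
    f.write("second\n")
with open(path, "r+") as f:   # "r+": read and write, starting at the beginning, no truncation
    before = f.read()
with open(path, "w+") as f:   # "w+": read and write, but the file is truncated first
    after = f.read()
```

After this runs, before holds both lines and after is empty, which shows that "w+" discards existing contents even though it also allows reading.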
AdriVelaz
  • I'm not sure why you're suggesting to use the `+` modes for the files. They're each only doing one thing (reading or writing), never both, so that's completely unnecessary. – Blckknght May 23 '13 at 00:14
  • Sorry, I read through the question way too quickly. The stripping portion of my answer should work though. – AdriVelaz May 23 '13 at 15:03