121

Assume I have a csv.DictReader object and I want to write it out as a CSV file. How can I do this?

I know that I can write the rows of data like this:

dr = csv.DictReader(open(f), delimiter='\t')
# process my dr object
# ...
# write out object
output = csv.DictWriter(open(f2, 'w'), delimiter='\t')
for item in dr:
    output.writerow(item)

But how can I include the fieldnames?

martineau

3 Answers

157

Edit:
In 2.7 / 3.2 there is a new writeheader() method. Also, John Machin's answer provides a simpler method of writing the header row.
Simple example of using the writeheader() method now available in 2.7 / 3.2:

import csv
from collections import OrderedDict
ordered_fieldnames = OrderedDict([('field1', None), ('field2', None)])
# Python 2 file mode shown; on Python 3 use open(outfile, 'w', newline='')
with open(outfile, 'wb') as fou:
    dw = csv.DictWriter(fou, delimiter='\t', fieldnames=ordered_fieldnames)
    dw.writeheader()
    # continue on to write data

Instantiating DictWriter requires a fieldnames argument.
From the documentation:

The fieldnames parameter identifies the order in which values in the dictionary passed to the writerow() method are written to the csvfile.

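Put another way: the fieldnames argument is required because DictWriter needs an explicit column order. Python dicts were unordered when this was written (insertion order is only guaranteed from Python 3.7), and even an ordered dict does not tell the writer which columns to emit.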
Below is an example of how you'd write the header and data to a file.
Note: the with statement was added in 2.6. If using 2.5, add: from __future__ import with_statement

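import csv

# Python 2 file modes shown; on Python 3 use text mode with newline=''
with open(infile, 'rb') as fin:
    dr = csv.DictReader(fin, delimiter='\t')
    # dr.fieldnames contains values from the first row of `infile`.
    # The reader is lazy, so the output must be written while `fin` is still open.
    with open(outfile, 'wb') as fou:
        dw = csv.DictWriter(fou, delimiter='\t', fieldnames=dr.fieldnames)
        # Build an identity mapping (name -> name) to serve as the header row
        headers = {}
        for n in dw.fieldnames:
            headers[n] = n
        dw.writerow(headers)
        for row in dr:
            dw.writerow(row)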

As @FM mentions in a comment, you can condense header-writing to a one-liner, e.g.:

with open(outfile,'wb') as fou:
    dw = csv.DictWriter(fou, delimiter='\t', fieldnames=dr.fieldnames)
    dw.writerow(dict((fn,fn) for fn in dr.fieldnames))
    for row in dr:
        dw.writerow(row)
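
For readers on Python 3, the same round trip can be sketched roughly as below. This is a minimal sketch under stated assumptions, not part of the original answer: the in-memory `io.StringIO` buffers and the sample rows are made up so the snippet is self-contained; with real files you would open both in text mode with `newline=''`.

```python
import csv
import io

# Hypothetical tab-separated input standing in for open(infile, newline='')
fin = io.StringIO("name\tage\nalice\t30\nbob\t25\n", newline='')
# Stand-in for open(outfile, 'w', newline='')
fou = io.StringIO(newline='')

dr = csv.DictReader(fin, delimiter='\t')

# Accessing dr.fieldnames reads the header row from the input
dw = csv.DictWriter(fou, delimiter='\t', fieldnames=dr.fieldnames)
dw.writeheader()
for row in dr:
    dw.writerow(row)

result = fou.getvalue()
```

Because the reader and writer share `dr.fieldnames`, the output columns follow the input header without being listed a second time.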
mechanical_meat
  • 13
    +1 Yet another way to write the header: `dw.writerow( dict((f,f) for f in dr.fieldnames) )`. – FMc Jun 05 '10 at 21:15
  • 2
    @Adam: for a shorter one-liner, see my answer. – John Machin Jun 05 '10 at 23:23
  • 2
    @John: +1 to your answer; simply utilising "the underlying writer instance" is certainly preferable to "laborious identity-mapping". – mechanical_meat Jun 05 '10 at 23:39
  • Can you make the new method more prominent? I read your answer but didn't notice it. – endolith Apr 30 '11 at 10:19
  • 1
    @endolith: thanks for the feedback. Moved that portion to top of answer. – mechanical_meat Apr 30 '11 at 14:36
  • Am I missing something? Why can't you just do dw.writeheader()? – Derek Litz Aug 11 '11 at 19:11
  • 1
    Since you're using a DictReader as well, it is easy to add the fields with `dw = csv.DictWriter(fou, delimiter='\t', fieldnames=dr.fieldnames)`. That way, if your fields change you do not need to adjust the DictWriter. – Spencer Rathbun Feb 07 '12 at 20:20
  • Oh and don't forget the *very important* `restval` and `extrasaction` [from the docs](http://docs.python.org/library/csv.html#csv.DictWriter). Nasty, brittle hacks are necessary if you don't use those. – Spencer Rathbun Feb 07 '12 at 20:47
30

A few options:

(1) Laboriously make an identity-mapping (i.e. do-nothing) dict out of your fieldnames so that csv.DictWriter can convert it back to a list and pass it to a csv.writer instance.

(2) The documentation mentions "the underlying writer instance" ... so just use it (example at the end).

dw.writer.writerow(dw.fieldnames)

(3) Avoid the csv.DictWriter overhead and do it yourself with csv.writer.

Writing data:

w.writerow([d[k] for k in fieldnames])

or

w.writerow([d.get(k, restval) for k in fieldnames])
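
Put together, option (3) might look something like the following. This is only a sketch: the field names, the `restval` filler, the sample rows and the `io.StringIO` buffer are illustrative assumptions, not taken from a real data set.

```python
import csv
import io

fieldnames = ['foo', 'bar', 'zot']
restval = 'Huh?'  # filler used when a dict lacks a key
rows = [{'foo': 'oof'}, {'foo': 1, 'bar': 2, 'zot': 3}]  # made-up data

buf = io.StringIO(newline='')  # stand-in for a file opened with newline=''
w = csv.writer(buf)
w.writerow(fieldnames)  # the header is just another list
for d in rows:
    w.writerow([d.get(k, restval) for k in fieldnames])

text = buf.getvalue()
```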

Instead of the extrasaction "functionality", I'd prefer to code it myself; that way you can report ALL "extras" with the keys and values, not just the first extra key. What is a real nuisance with DictWriter is that if you've verified the keys yourself as each dict was being built, you need to remember to use extrasaction='ignore' otherwise it's going to SLOWLY (fieldnames is a list) repeat the check:

wrong_fields = [k for k in rowdict if k not in self.fieldnames]
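
A hand-rolled check that reports every extra key along with its value could be sketched like this. The helper name `find_extras`, the field names and the sample row are hypothetical; the set is an assumption made here to keep the membership test cheap.

```python
import csv
import io

fieldnames = ['foo', 'bar']
allowed = set(fieldnames)  # set lookup instead of scanning the list per key


def find_extras(rowdict):
    """Return all unexpected keys with their values (hypothetical helper)."""
    return {k: v for k, v in rowdict.items() if k not in allowed}


row = {'foo': 1, 'bar': 2, 'baz': 3, 'qux': 4}  # made-up row with two extras
extras = find_extras(row)

buf = io.StringIO(newline='')
# The keys were already checked above, so ask DictWriter not to check again
dw = csv.DictWriter(buf, fieldnames, extrasaction='ignore')
dw.writerow(row)
written = buf.getvalue()
```

With the default `extrasaction='raise'`, the same `writerow` call would raise a `ValueError` instead of silently dropping the extra keys.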

============

>>> f = open('csvtest.csv', 'wb')
>>> import csv
>>> fns = 'foo bar zot'.split()
>>> dw = csv.DictWriter(f, fns, restval='Huh?')
# dw.writefieldnames(fns) -- no such animal
>>> dw.writerow(fns) # no such luck, it can't imagine what to do with a list
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\python26\lib\csv.py", line 144, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
  File "C:\python26\lib\csv.py", line 141, in _dict_to_list
    return [rowdict.get(key, self.restval) for key in self.fieldnames]
AttributeError: 'list' object has no attribute 'get'
>>> dir(dw)
['__doc__', '__init__', '__module__', '_dict_to_list', 'extrasaction', 'fieldnames', 'restval', 'writer', 'writerow', 'writerows']
# eureka
>>> dw.writer.writerow(dw.fieldnames)
>>> dw.writerow({'foo':'oof'})
>>> f.close()
>>> open('csvtest.csv', 'rb').read()
'foo,bar,zot\r\noof,Huh?,Huh?\r\n'
>>>
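
The session above is from Python 2.6. A rough Python 3 equivalent is sketched below, assuming an in-memory `io.StringIO` buffer instead of `csvtest.csv`. It relies on the `writer` attribute that CPython's `DictWriter` sets on itself, which is best treated as an implementation detail rather than a documented API.

```python
import csv
import io

f = io.StringIO(newline='')  # stand-in for open('csvtest.csv', 'w', newline='')
fns = 'foo bar zot'.split()
dw = csv.DictWriter(f, fns, restval='Huh?')

# Write the header through the underlying csv.writer
dw.writer.writerow(dw.fieldnames)
dw.writerow({'foo': 'oof'})

contents = f.getvalue()
```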
John Machin
  • Currently in Python 3.6, the `extrasaction` functionality seems to be implemented better. It's now `wrong_fields = rowdict.keys() - self.fieldnames`, so it's effectively a `set` operation. – martineau Apr 29 '17 at 18:19
  • I'm voting this answer up for the 'avoid DictWriter' comment - I haven't seen any advantage to using it, and seems quicker to structure your data and use csv.writer – neophytte Nov 07 '19 at 02:21
8

Another way to do this is to write the following line before writing any rows to your output:

output.writerow(dict(zip(dr.fieldnames, dr.fieldnames)))

zip pairs each field name with itself, and dict() turns those pairs into a dictionary that maps every field name to itself, which DictWriter then writes as the header row.
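
As a small illustration of what that line produces, here is a sketch with made-up field names and an `io.StringIO` buffer standing in for the output file. Note that `DictWriter` still needs its `fieldnames` argument.

```python
import csv
import io

fieldnames = ['foo', 'bar', 'zot']  # stand-in for dr.fieldnames

# Identity mapping: every field name maps to itself
header = dict(zip(fieldnames, fieldnames))

buf = io.StringIO(newline='')
output = csv.DictWriter(buf, fieldnames=fieldnames, delimiter='\t')
output.writerow(header)
header_line = buf.getvalue()
```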

Raphael Pr