23

I have a bunch of CSV files with the same columns but in a different order. We are trying to upload them with SQL*Plus, but we need the columns in a fixed order.

Example

required order: A B C D E F

csv file: A C D E B (sometimes a column is not in the csv because it is not available)

Is this achievable with Python? We are using Access + macros to do it, but it is too time-consuming.

PS. Sorry if anyone gets upset by my English skills.

Joshua
JJ1603

5 Answers

32

You can use the csv module to read, reorder, and then write your file.

Sample File:

$ cat file.csv
A,B,C,D,E
a1,b1,c1,d1,e1
a2,b2,c2,d2,e2

Code

import csv

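# 'w' overwrites the output; 'a' would append a second header on every re-run.
# newline='' is what the Python 3 csv docs recommend for files used with the csv module.
with open('file.csv', 'r', newline='') as infile, open('reordered.csv', 'w', newline='') as outfile: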
    # output dict needs a list for new column ordering
    fieldnames = ['A', 'C', 'D', 'E', 'B']
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    # reorder the header first
    writer.writeheader()
    for row in csv.DictReader(infile):
        # writes the reordered rows to the new file
        writer.writerow(row)

output

$ cat reordered.csv
A,C,D,E,B
a1,c1,d1,e1,b1
a2,c2,d2,e2,b2
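
The question also mentions that a column is sometimes missing from the input. A minimal sketch of how the same approach might cover that case, assuming Python 3 and assuming a missing column should simply become an empty field: `restval` and `extrasaction` are standard `csv.DictWriter` parameters, while `reorder_csv` and `REQUIRED_ORDER` are hypothetical names used only for illustration.

```python
import csv
import io

# Required order taken from the question; REQUIRED_ORDER is a hypothetical name.
REQUIRED_ORDER = ['A', 'B', 'C', 'D', 'E', 'F']

def reorder_csv(infile, outfile, fieldnames=REQUIRED_ORDER):
    """Copy rows from infile to outfile with columns in `fieldnames` order."""
    # restval='' writes an empty field for columns missing from the input;
    # extrasaction='ignore' drops input columns that are not in `fieldnames`.
    writer = csv.DictWriter(outfile, fieldnames=fieldnames,
                            restval='', extrasaction='ignore',
                            lineterminator='\n')
    writer.writeheader()
    for row in csv.DictReader(infile):
        writer.writerow(row)

# In-memory demo: column F is absent from the input.
source = io.StringIO("A,C,D,E,B\na1,c1,d1,e1,b1\na2,c2,d2,e2,b2\n")
result = io.StringIO()
reorder_csv(source, result)
print(result.getvalue())
```

With real files, the two `io.StringIO` objects would be replaced by file objects opened with `newline=''`.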
Josh J
  • Really nice use of `DictReader`/`DictWriter`. – Peter Wood Oct 07 '15 at 21:13
  • Beautiful solution. However, how can I make it work on input files that are delimited by a semicolon? – user1192748 Jun 22 '18 at 12:33
  • @user1192748 the docs for csv.reader [python2](https://docs.python.org/2/library/csv.html#csv.reader) [python3](https://docs.python.org/3.6/library/csv.html#csv.reader) define `delimiter=` and `quotechar=` kwargs to specify the delimiter and quoting character. – Josh J Jun 22 '18 at 18:44
  • Keep in mind that `DictWriter` in Python 2 doesn't support unicode characters, as mentioned at https://docs.python.org/2/library/csv.html – Navidot Jul 10 '18 at 14:28
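
For the semicolon question in the comments, a short sketch of what passing `delimiter=` might look like, assuming Python 3 and assuming the output should also be semicolon-delimited. `DictReader` and `DictWriter` forward extra keyword arguments to the underlying reader and writer; the sample data here is made up for the demo.

```python
import csv
import io

# Made-up semicolon-delimited sample, held in memory for the demo.
source = io.StringIO("A;B;C\na1;b1;c1\na2;b2;c2\n")
result = io.StringIO()

# delimiter=';' is passed through to the underlying csv.reader / csv.writer.
reader = csv.DictReader(source, delimiter=';')
writer = csv.DictWriter(result, fieldnames=['A', 'C', 'B'],
                        delimiter=';', lineterminator='\n')
writer.writeheader()
for row in reader:
    writer.writerow(row)

print(result.getvalue())
```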
9

One way to tackle this problem is to use the pandas library, which can be installed with pip. You load the CSV file into a pandas DataFrame, reorder the columns, and save the result back to a CSV file. For example, if your sample.csv looks like this:

A,C,B,E,D
a1,c1,b1,e1,d1
a2,c2,b2,e2,d2

Here is a snippet to solve the problem.

import pandas as pd
df = pd.read_csv('/path/to/sample.csv')
df_reorder = df[['A', 'B', 'C', 'D', 'E']] # rearrange column here
df_reorder.to_csv('/path/to/sample_reorder.csv', index=False)
titipata
3
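# Use different names for input and output: opening the same file with "w"
# would truncate it before it is read.
with open("<filename>.csv", "r") as csv_in, open("<output filename>.csv", "w") as csv_out:
    for line in csv_in:
        # drop the trailing newline, then split the line at commas
        field_list = line.rstrip('\n').split(',')
        # rejoin with commas in the new order; join() takes a single iterable
        output_line = ','.join([field_list[0],
                                field_list[2],
                                field_list[3],
                                field_list[4],
                                field_list[1]])
        csv_out.write(output_line + '\n')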
Prune
1

You can use something similar to this to change the order, replacing ';' with ',' in your case. Because you said you need to process multiple .csv files, you could use the glob module to get a list of your files.

import glob
for file_name in glob.glob('<Insert-your-file-filter-here>*.csv'):
    # Do the work here
    pass
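
As a rough sketch of what "the work" inside that loop could be, the following combines `glob` with the csv module's `DictReader`/`DictWriter`, assuming Python 3. The function name `reorder_all`, the constant `REQUIRED_ORDER`, the separate output directory, and the temporary-directory demo are all assumptions made for illustration, not part of the answer above.

```python
import csv
import glob
import os
import tempfile

# Required order taken from the question; the name is hypothetical.
REQUIRED_ORDER = ['A', 'B', 'C', 'D', 'E', 'F']

def reorder_all(pattern, out_dir):
    """Write a reordered copy of every CSV matching `pattern` into `out_dir`."""
    written = []
    for file_name in sorted(glob.glob(pattern)):
        out_name = os.path.join(out_dir, os.path.basename(file_name))
        with open(file_name, newline='') as infile, \
             open(out_name, 'w', newline='') as outfile:
            # Missing columns become empty fields; unknown columns are dropped.
            writer = csv.DictWriter(outfile, fieldnames=REQUIRED_ORDER,
                                    restval='', extrasaction='ignore')
            writer.writeheader()
            writer.writerows(csv.DictReader(infile))
        written.append(out_name)
    return written

# Demo in temporary directories so that no real files are touched.
with tempfile.TemporaryDirectory() as in_dir, \
     tempfile.TemporaryDirectory() as out_dir:
    with open(os.path.join(in_dir, 'sample.csv'), 'w', newline='') as f:
        f.write("A,C,D,E,B\r\na1,c1,d1,e1,b1\r\n")
    outputs = reorder_all(os.path.join(in_dir, '*.csv'), out_dir)
    with open(outputs[0], newline='') as f:
        demo_text = f.read()

print(demo_text)
```

Writing to a separate directory keeps the original files intact, which seems safer when the result is going to be loaded into a database.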
Community
1

The csv module allows you to read CSV files with their values associated with their column names. This in turn allows you to arbitrarily rearrange columns, without having to explicitly permute lists.

import csv

for row in csv.DictReader(open("foo.csv")):
  print(row["b"], row["a"])

2 1
22 21

Given the file foo.csv:

a,b,d,e,f
1,2,3,4,5
21,22,23,24,25
MisterMiyagi