Loop through multiple csv files, copying only certain columns to new files

Question

I have a number of .csv files in a folder (1.csv, 2.csv, 3.csv, etc.) and I need to loop over them all. The output should be a corresponding NEW file for each existing one, but each should only contain 2 columns.

Here is a sample of the csv files:

004,444.444.444.444,448,11:16 PDT,11-24-15
004,444.444.444.444,107,09:55 PDT,11-25-15
004,444.444.444.444,235,09:45 PDT,11-26-15
004,444.444.444.444,241,11:00 PDT,11-27-15

And here is how I would like the output to look:

448,11-24-15
107,11-25-15
235,11-26-15
241,11-27-15

Here is my working attempt at achieving this with Python:

import csv
import os
import glob

path = '/csvs/'
for infile in glob.glob( os.path.join(path, '*csv') ):


    inputfile = open(infile, 'r') 
    output = os.rename(inputfile + ".out", 'w')

#Extracts the important columns from the .csv into a new file
with open(infile, 'r') as source:
    readr = csv.reader(source)
    with open(output,"w") as result:
        writr = csv.writer(result)
        for r in readr:
            writr.writerow((r[4], r[2]))

Using only the second half of this code, I am able to get the desired output by specifying the input files in the code. However, this Python script will be a small part of a much larger bash script that will be (hopefully) fully automated.

How can I adjust the input of this script to loop over each file and create a new one with just the 2 specified columns?

Please let me know if there is anything I need to clarify.

This file is sitting in the same folder as the files I am looping through. — smoothjabz, Jul 30 '15 at 04:18

score 0 · Answer 1 · edited May 23 '17 at 12:14

You can use the pandas library. It offers a lot of functionality for working with CSV files: read_csv reads a CSV file and returns a DataFrame object. See this link for an example of how to write a CSV file from a pandas DataFrame. There are also plenty of tutorials available online.
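
As a rough sketch of the pandas approach described above (this is not the answerer's own code): the function name extract_columns, the '.out' suffix, and the default column positions 2 and 4 are assumptions based on the question's sample data, which has no header row.

```python
import glob
import os

import pandas as pd


def extract_columns(folder, keep=(2, 4)):
    """For every *.csv in folder, write <name>.csv.out with only the `keep` columns."""
    written = []
    for infile in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        # header=None: the sample files have no header row, so columns are labelled 0..4.
        # dtype=str: keep values such as "004" exactly as they appear in the file.
        df = pd.read_csv(infile, header=None, dtype=str)
        outfile = infile + ".out"
        # Select columns by label in the requested order, then write
        # without a header row or an index column.
        df[list(keep)].to_csv(outfile, header=False, index=False)
        written.append(outfile)
    return written
```

Calling extract_columns('csvs') would then process every .csv file in that folder and return the list of files it wrote.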

answered Jul 30 '15 at 04:17 by Mangu Singh Rajpurohit · edited May 23 '17 at 12:14 by Community

score 0 · Accepted Answer · answered Jul 30 '15 at 04:19

inputfile is a file object that you opened, but then you do -

os.rename(inputfile + ".out", 'w')

This does not work: you are trying to concatenate an open file object and a string with the + operator, which raises a TypeError. os.rename also expects two paths (a source and a destination), not a mode such as 'w'. I am not even sure why you need that line, or the line inputfile = open(infile, 'r'), since you open the file again in the with statement.

Another issue -

You specify your path as path = '/csvs/'. It is highly unlikely that you have a 'csvs' directory directly under the root directory. You probably meant a directory relative to where the script runs, so you should use a relative path.
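
Since the script is meant to be called from a larger bash script, note that a relative path is resolved against the current working directory, which may not be the script's own folder. One possible way around that, sketched here with a hypothetical helper named folder_next_to (not part of the original code), is to build the path from the script's location:

```python
import os


def folder_next_to(script_file, subdir=""):
    """Return the absolute path of `subdir` inside the directory containing `script_file`."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    # normpath drops the trailing separator left behind when subdir is empty
    return os.path.normpath(os.path.join(script_dir, subdir))
```

Inside the script this would be called as folder_next_to(__file__) if the CSV files sit beside the script, or folder_next_to(__file__, 'csvs') for a subfolder.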

You can just do -

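import csv
import glob
import os

path = 'csvs/'  # relative to the directory the script is run from
for infile in glob.glob(os.path.join(path, '*.csv')):
    output = infile + '.out'  # e.g. csvs/1.csv -> csvs/1.csv.out
    # newline='' is what the csv module expects on Python 3
    with open(infile, 'r', newline='') as source:
        readr = csv.reader(source)
        with open(output, 'w', newline='') as result:
            writr = csv.writer(result)
            for r in readr:
                # Writes column 4 (date) first, then column 2, as in the question's code;
                # use (r[2], r[4]) to match the order in the sample output.
                writr.writerow((r[4], r[2]))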
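
For use from a larger script, the same loop could be wrapped in a function. This is only a sketch under a few assumptions: the name copy_columns, its parameters, and the '.out' suffix are illustrative, and the default column order (2, then 4) is chosen to match the sample output in the question rather than the order used in the code above.

```python
import csv
import glob
import os


def copy_columns(folder, columns=(2, 4), suffix=".out"):
    """Write <name>.csv<suffix> for each *.csv in folder, keeping only `columns`."""
    outputs = []
    for infile in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        outfile = infile + suffix
        # newline="" lets the csv module handle line endings itself (Python 3).
        with open(infile, "r", newline="") as source, \
                open(outfile, "w", newline="") as result:
            # csv.writer defaults to "\r\n"; "\n" is friendlier to Unix tools.
            writer = csv.writer(result, lineterminator="\n")
            for row in csv.reader(source):
                if not row:
                    continue  # skip blank lines
                writer.writerow([row[i] for i in columns])
        outputs.append(outfile)
    return outputs
```

The folder can then be supplied by the caller, for example by passing sys.argv[1] to copy_columns, so that the surrounding bash script decides which directory to process.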
