3

I am new to python and am trying to reformat a CSV file, it is not delimited so all the data is in the first column. The file is very large say 1 gb to big for excel. The file format is as follows, note the time sequence in the old file is reversed newest to oldest, want to reverse it in new file with oldest to newest.

old file is
   tagname1 tagname2 value1(t)  value2(t) timestamp(t)
   tagname1 tagname2 value1(t-1)  value2(t-1) timestamp(t-1)
   tagname1 tagname2 value1(t-2)  value2(t-2) timestamp(t-2)

desired file format
   timestamp(t-2)    tagname1       tagname2   value1(t-2)    value2(t-2)
   timestamp(t-1)    tagname1       tagname2   value1(t-1)    value2(t-1)
   timestamp(t)      tagname1       tagname2   value1(t)    value2(t)

Thanks

huffytcs
  • 31
  • 1
  • 1
    You are saying all data is in the first column, but the format you give has 5 columns? – Feodoran Jul 07 '20 at 18:37
  • Can you load the whole file into memory (it might need more than the file size)? This would make things easier ... – Feodoran Jul 07 '20 at 18:44

2 Answers2

0

You can do it with awk. This command rearranges columns and sort by the first column and writes the result to new.csv

awk -F' ' '{print $5" "$1" "$2" "$3" "$4}' OFS=, "old.csv" | sort > new.csv

For a python solution, you can use for rearranging columns Python - re-ordering columns in a csv but sort having limited memory would be much simpler with sort.

Yann
  • 1,827
  • 1
  • 9
  • 23
0

You can use well-known pandas library to import and modify your .csv file. It is very popular and used very often. Here is some code:

import pandas as pd

data = pd.read_csv("file.csv", sep=" ", header=None) # 1
data = data.iloc[::-1] # 2
data = data[[4, 0, 1, 2, 3]] # 3

data.to_csv("out.csv", sep=" ", header=None, index=None) # 4
  1. Read .csv file.
    • Your file doesn't have any column names, so we pass header=None.
    • Use spaces as a separation character: sep=" ".
  2. Reverse array, the last rows will be first.
  3. Change the column order.
  4. Export your data to another .csv file.
    • Specify separation char: sep=" ".
    • We don't want any special row indices or headers: header=None, index=None.
Szymon Bednorz
  • 396
  • 3
  • 7
  • 22