I read csv with more than 300 columns. Many columns have similar names.
Shortened example from the csv:
index d c a b b a a a
0 data data data data data data data data
Python automatically adds numbers at the end of columns names (if they are similar) as a suffix to ensure each column has a unique name.
Example:
index d c a b b.1 a.1 a.2 a.3
0 data data data data data data data data
My assignment is to sort the columns alphabetically and to add zeros (leading zeros) to the suffix.
Desired output:
index a a.01 a.02 a.03 b b.01 c d
0 data data data data data data data data
My code below (taken from stackoverflow) can sort the column. But I have no idea how to make suffix with leading zeros?
import pandas as pd
df= pd.read_csv(r"C:\Users\.....\file.csv", skipinitialspace=False, sep=';', header= 0, index_col = 'DateTime', low_memory=False)
df = df.sort_index(axis=1) # sort the column
pd.set_option('display.max_columns', None) # view all columns without truncated
Any ideas?
Edited question
After sorting the columns, I want,
a.01 a02 a12
rather than
a.01 a.012 a.02