-1

I am currently doing a project where i have to make some transformations and cleaning from an excel file that has the same table in each sheet only that each sheet is representing a different month (different values in the tables).

Therefore the code that will make the transformations is the same only for a different sheet.

df_at_jan  = pd.read_excel("C:/Users/Spiros/Desktop/Reporting.xlsx",
                      sheet_name='Jan 2018')
df_at_feb  = pd.read_excel("C:/Users/Spiros/Desktop/Reporting.xlsx",
                          sheet_name='Feb 2018')

df_at_jan.drop([0,1,2],axis=0)
df_at_jan.columns = df_at_jan.iloc[3]

df_at_feb.drop([0,1,2],axis=0)
df_at_feb.columns = df_at_feb.iloc[3]

Of course i have to do this for all the months and i was wondering how i can do this with a for loop in order not to re-create the code for every different month.

I am very new to Python so any help will be much appreciated.

Thanks a lot!

2 Answers2

0

1) create function

def read_one_month(sheet_name):
   df = pd.read_excel("C:/Users/Spiros/Desktop/Reporting.xlsx",
                      sheet_name=sheet_name)
   df.drop([0,1,2],axis=0, inplace = True)
   return df

2) define array of sheet names and run a loop:

df = pd.DataFrame(None)
for sheet_name in ['Jan 2018','Feb 2018']:
   df = pd.concat([df, read_one_month(sheet_name)], axis = 0, ignore_index = True)

Dennis Lyubyvy
  • 560
  • 4
  • 8
0

I can't test because you don't give a data example, but the according to this and this answers, you can try:

xl = pd.ExcelFile('C:/Users/Spiros/Desktop/Reporting.xlsx')

sheets = xl.sheet_names

df_array = []

for sheet in sheets:
  df_temp = pd.read_excel(xl , sheet)
  df_temp = df_temp.drop([0,1,2],axis=0)
  df_temp.columns = df_temp.iloc[3]

  df_array.append(df_temp)

The data of each month will be in a position of the array.

If you need all the data became a single DataFrame, do:

df = pd.concat(df_array, ignore_index = True)
Terry
  • 2,331
  • 2
  • 10
  • 22