0

I am reading an excel file and I was wondering how I could create a loop that reads specific rows based on a pattern. Let's say for example I want to read the first three rows from my excel sheet (which correspond to 0,1,2) and then read rows 10,11,12 (which correspond to 9,10,11) and so on. The total number of rows is 156.

Hypothesis:
import numpy as np 
import pandas as pd 
df = pd.read_excel("My Excel.xlsx")
a = df.iloc[[0,1,2]]
b = [x + 1 for x in a] 



My code as it is right now.

import numpy as np 
import pandas as pd 
df = pd.read_excel("My Excel.xlsx")
A1 = df.iloc[[0,1,2]].mean() #This is my data transformation
A11 = 1 /(numpy.log10(A2))  # This is also 
A2 = df.iloc[[9,10,11]].mean()
A22 = 1/(numpy.log10(A2)).mean()
....
.... Doing this procedure for other teams of data (3 rows each team)
A17 = df.iloc[[146,147,148]].mean()
A177 = 1/(numpy.log10(A17)) 

So my question is how can I create a loop choosing specific rows out of my Excel file and applying my transformations to it without doing each one at a time. What if I had 1000 data; It would be a disaster.

Thank you for your time. With respect!

Jack21
  • 39
  • 7
  • What are you trying to do? Loops in python are the last resort.... – gtomer Feb 17 '21 at 21:58
  • I have updated my question, I hope that I was more clear this time. Thank you again. – Jack21 Feb 18 '21 at 08:21
  • You select [0,1,2] and then [9,10,11], why you jump from 2 to 9, what is the logic to automate that? – Rafael Valero Feb 18 '21 at 08:28
  • So for you is ok, to read in the pandas dataframe, right? – Rafael Valero Feb 18 '21 at 08:28
  • Rafael Valero The data represent NIR spectra and each group (0,1,2 and 9,10,11 and so on) represents specific and different measurements. For example, 0,1,2 and the rest stands for lipids. Rows 3,4,5 stands for protein and so on. Yes, I am not dealing a problem with reading pandas dataframe. With respect. – Jack21 Feb 18 '21 at 09:11
  • To sum up I have 3 groups one for lipids one for protein and one for sugars. The rows for each group of data are in the following form: lipids (row 1 = 0, row 2 =1, row 3 = 2, row 10 =9, row 11 = 10 till 156), protein (row 4 = 3, row 5=4, row 6=5 till 156), sugars ( row 7= 6, row 8=7, row 9=8). – Jack21 Feb 18 '21 at 09:55

1 Answers1

1
  1. I think perhaps the easier is to upload all the data into a pandas dataframe and work there using the index of the dataframe.
  2. Perhaps this is also interesting. As in here https://stackoverflow.com/a/43257810/7127519
import xlrd
workbook = xlrd.open_workbook('All Train data.xlsx')
worksheet = workbook.sheet_by_index(0)
worksheet.cell(0, 0).value

I was using this for a very large file and worked for me. So I think is not reading all the sheet, but I am not 100% sure, on that.

Rafael Valero
  • 2,067
  • 13
  • 25
  • That's what I did but still cannot make the row selection automatic. I have to choose the rows by typing code for each one of them. I have updated my code so check it out if you don't mind. Thank you. – Jack21 Feb 18 '21 at 08:24