Hello, I am trying to read two Excel files into one DataFrame, but I get this error:

AttributeError: 'dict' object has no attribute 'parse'

My objective is to use pandas to merge these two xlsx files into a single DataFrame. How do I do this? Any help is appreciated. Here is my code:

# import modules
from IPython.display import display
import pandas as pd
import numpy as np
pd.set_option("display.max_rows", 999)
pd.set_option('max_colwidth',100)
%matplotlib inline

# filenames
file_names = ["data/OrderReport.xlsx", "data/OrderReport2.xlsx"]

reading_files = [(pd.read_excel(f, sheetname=None, parse_cols=None))for f in file_names]

frames = [x.parse(x.sheet_names[0], header=None,index_col=None) for x in reading_files]
glibdud
Deepak M
  • Can't speak to the greater goal, but for your particular error, check out the [`pandas.read_excel()` documentation](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html). Passing `sheetname=None` causes it to return a **dict** of dataframes. Therefore `reading_files` is a list of dicts, and `x` will be a dict, which has no `parse()` method. – glibdud Jan 25 '17 at 18:11
  • I created another question with code that actually works. My only problem there was that I was not getting the column names back. If you could look at that one instead, I would appreciate the help: http://stackoverflow.com/questions/41841757/not-getting-back-the-column-names-after-reading-into-an-xlsx-file/41842231?noredirect=1#comment70871824_41842231 @glibdud – Deepak M Jan 26 '17 at 02:43
  • Please don't change the nature of the question after you've gotten answers. If you have a new question, post it as a new question. I've rolled back to the previous version. You can look at the [revision history](http://stackoverflow.com/posts/41858471/revisions) if you want to use any of your added information. – glibdud Jan 27 '17 at 03:27

1 Answer

With the "new" `read_excel` function, passing `sheetname=None` returns a dict of DataFrames, so there is no need to call `parse` (there is no `ExcelFile` involved). Previously you had to create an `ExcelFile` and then parse each sheet. See here.

Therefore `reading_files` is a list of dicts of DataFrames. It's unclear how you want to merge these into a single DataFrame (there are lots of choices!).
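
As one possible choice, here is a minimal sketch that takes the first sheet of each workbook and stacks the rows with `pd.concat`. The helper name `combine_sheets`, the column names, and the in-memory dicts are hypothetical stand-ins for what `read_excel` returns; it also assumes both files share the same columns, which the question does not confirm.

```python
import pandas as pd


def combine_sheets(workbooks, first_sheet_only=True):
    """Stack DataFrames taken from a list of {sheet_name: DataFrame} dicts.

    Hypothetical helper: each dict stands for one workbook as returned by
    pd.read_excel(path, sheet_name=None).
    """
    frames = []
    for sheets in workbooks:
        dfs = list(sheets.values())  # dicts keep insertion order
        if first_sheet_only:
            dfs = dfs[:1]            # mirrors sheet_names[0] in the question
        frames.extend(dfs)
    # ignore_index=True renumbers the rows 0..n-1 in the result
    return pd.concat(frames, ignore_index=True)


# In-memory stand-ins for two workbooks (made-up columns and values).
workbook_a = {"Sheet1": pd.DataFrame({"order_id": [1, 2], "amount": [10.0, 20.0]})}
workbook_b = {
    "Sheet1": pd.DataFrame({"order_id": [3], "amount": [30.0]}),
    "Notes": pd.DataFrame({"order_id": [99], "amount": [0.0]}),
}

combined = combine_sheets([workbook_a, workbook_b])
print(combined)
```

With real files, the list of dicts would come from something like `[pd.read_excel(f, sheet_name=None) for f in file_names]`; the keyword is spelled `sheetname` in the older pandas version used in the question. If the files should instead be joined on a key column, `pd.merge` would be the tool to reach for rather than `pd.concat`.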

Andy Hayden
  • Thanks for the reply. I made a few adjustments to the code above and got the DataFrame to work, but I'm having an issue with the column names. I would appreciate it if you could take a look above! @Andy Hayden – Deepak M Jan 26 '17 at 21:22