I hacked together some code that I thought would print all column names that are common to all CSV files in a folder. I'm using an inner join, but it's acting like an outer join. There must be a quick fix for this, right?
import glob
import pandas as pd

files = glob.glob(r'C:\my_files\*.csv')

def get_merged(files, **kwargs):
    df = pd.read_csv(files[0], **kwargs)
    for f in files[1:]:
        df = df.merge(pd.read_csv(f, **kwargs), how='inner')
    return df

print(get_merged(files))
So, if I have 4 files with these columns:
cola colb colc cold cole
And I have 1 file with these columns:
cola colc cole
I would like to see this:
cola colc cole
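
For reference, `how='inner'` in `merge` decides which rows survive the join, not which columns. The merged frame still carries every column from both sides, which is probably why the result looks like an outer join. If only the shared column names are needed, one possible approach is to skip the merge and intersect the headers instead. This is a minimal sketch, not a definitive fix: `common_columns` is a hypothetical helper name, and it assumes each file has a header row.

```python
import pandas as pd


def common_columns(files, **kwargs):
    """Return the column names present in every CSV, in the first file's order.

    `common_columns` is a hypothetical helper, not part of pandas.
    """
    # nrows=0 parses only the header row, so no data rows are loaded.
    headers = [pd.read_csv(f, nrows=0, **kwargs).columns for f in files]
    if not headers:
        return []
    # Keep only the names that appear in every header.
    shared = set(headers[0]).intersection(*headers[1:])
    # Sets are unordered, so rebuild the list in the first file's column order.
    return [c for c in headers[0] if c in shared]
```

With the five files described above, `print(common_columns(files))` should list `cola`, `colc` and `cole`. If the shared data is wanted too, each file could be read with `usecols=common_columns(files)` before combining.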