0

I have two lists where I want to left join one.

This is how I generate the first list:

# NB Oracle connection
first_list = pd.read_sql('SELECT firstname, lpad(identifier, 4, 0) from table1', con=Connection)

Second list:

second_list = pd_read_excel('excelfile.xlsx', converters={'identifier': str})

A little bit simplified: The first list looks like this:

   name  identifier
0  Firstname     1201
1  Secondname       0
2  Thirdname     1855
3  Fourthname    3333
4  Fifthname     1414

The Second list looks like this:

   identifier    BIC          BANK
0  0             UNKNOWN      Unknown
1  1201          DNBANKNOKK   DNB Bank ASA 
2  1855          EIDSNO21     Eidsberg Sparebank
3  3333          RYGSNO21     Sparebank1 Østfold Akershus
4  1414          LOSKNO21     Sparebank 1 Lom og Sjåk

I want a final list to look like this:

   name          identifier  BIC          BANK
0  Firstname     1201        DNBANKNOKK   DNB Bank ASA
1  Secondname       0        UNKNOWN      Unknown
2  Thirdname     1855        EIDSNO21     Eidsberg Sparebank
3  Fourthname    3333        RYGSNO21     Sparebank1 Østfold Akershus
4  Fifthname     1414        LOSKNO21     Sparebank 1 Lom og Sjåk

But, When I try to genearte the final list:

final_list = first_list.join(second_list, on='identifier', how='left')

I get the following error: ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

I get the same error when I try this:

first_list['identifier'] = first_list['identifier'].astype(str)
second_list['identifier'] = second_list['identifier'].astype(str)
final_list = first_list.join(second_list, on='identifier', how='left')

Running dtypes returns both list identifiers as object.

Lock-Aze
  • 28
  • 1
  • 6

2 Answers2

2

Use pandas.DataFrame.merge with how='left':

final_list = pd.merge(first_list, second_list, on='identifier', how='left')

print(final_list)
         name  identifier         BIC                         BANK
0   Firstname        1201  DNBANKNOKK                 DNB Bank ASA
1  Secondname           0     UNKNOWN                      Unknown
2   Thirdname        1855    EIDSNO21           Eidsberg Sparebank
3  Fourthname        3333    RYGSNO21  Sparebank1 Østfold Akershus
4   Fifthname        1414    LOSKNO21      Sparebank 1 Lom og Sjåk

A good an extensive explenation about merging in pandas can be found here:
Pandas Merging 101

Erfan
  • 31,924
  • 5
  • 41
  • 51
  • Thanks, using merge with left join solved the problem :) I guess merge is "better" than join. I'll keep that in mind. – Lock-Aze Mar 21 '19 at 11:49
0

try like below use merge

import pandas as pd

final_list = pd.merge(first_list, second_list, how='outer',left_on=['identifier'], right_on=['identifier'])
Zaynul Abadin Tuhin
  • 28,879
  • 5
  • 20
  • 49