
I currently have a pandas dataframe. The concatenation of the 1st and 2nd columns results in the 3rd column.

I've tried the df.fillna("") method to cope with the NaN values. However, I need to get rid of the NaNs in the concatenated column, while the above method only removes them from the existing columns.

import pandas as pd
import numpy as np
data = [[], ['arthur','shelby',''], ['michael','','']]
df = pd.DataFrame(data, columns = ['Name', 'LastName','FullName'])
df['FullName'] = df['Name'].map(str) + ' ' + (df['LastName'].map(str))
df1 = df.fillna("")
print(df1)

The resulting FullName column contains NaN NaN for the 1st row. However, I'm expecting it to be " " " ".

Is there any way to achieve this?

Pv66
  • See here https://stackoverflow.com/questions/26837998/pandas-replace-nan-with-blank-empty-string – po.pe Sep 06 '19 at 12:17

2 Answers


You need to use inplace in this way:

import pandas as pd
import numpy as np
data = [[], ['arthur','shelby',''], ['michael','','']]
df = pd.DataFrame(data, columns = ['Name', 'LastName','FullName'])
df['FullName'] = df['Name'].map(str) + ' ' + (df['LastName'].map(str))
df.fillna("",inplace=True)
print(df)

But the real problem is that you are inserting an empty row in data, so what you have is not NaN but None. You should do this (remove the first empty list from your data):

import pandas as pd
import numpy as np
data = [['arthur','shelby',''], ['michael','','']]
df = pd.DataFrame(data, columns = ['Name', 'LastName','FullName'])
df['FullName'] = df['Name'].map(str) + ' ' + (df['LastName'].map(str))
df.fillna("",inplace=True)
print(df)
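As a side check on the None versus NaN point, here is a minimal sketch. The Series `s` and its values are made up for illustration, and the column is forced to object dtype so that both kinds of missing value are kept as they are. It suggests that pandas treats both as missing, and that the difference only shows once they are converted to strings:

```python
import numpy as np
import pandas as pd

# Illustrative data: one None and one NaN in an object column.
s = pd.Series([None, np.nan, "arthur"], dtype=object)

missing_mask = s.isna()   # both None and NaN are flagged as missing
filled = s.fillna("")     # both are replaced by the empty string
as_text = s.map(str)      # str() turns them into the text "None" and "nan"

print(missing_mask.tolist())
print(filled.tolist())
print(as_text.tolist())
```

So fillna handles either kind of missing value, as long as it runs before the values are turned into text.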

EDIT: I found your problem. You need to call fillna before any other manipulation, because otherwise you are concatenating two NaNs that have already been converted to strings:

import pandas as pd
import numpy as np
data = [[], ['arthur','shelby',''], ['michael','','']]
df = pd.DataFrame(data, columns = ['Name', 'LastName','FullName'])
df.fillna("",inplace=True)
df['FullName'] = df['Name'].map(str) + ' ' + (df['LastName'].map(str))
print(df)
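To see why the order matters, here is a small sketch. It assumes explicit np.nan values stand in for the blank cells, and the names `too_late`, `filled` and `full_name` are only illustrative. Once map(str) has run, a missing value has become the ordinary text "nan", which fillna no longer recognises as missing:

```python
import numpy as np
import pandas as pd

# Illustrative data: explicit NaN values stand in for the blank cells.
df = pd.DataFrame(
    {"Name": [np.nan, "arthur", "michael"],
     "LastName": [np.nan, "shelby", np.nan]},
    dtype=object,
)

# Converting first: each NaN becomes the literal string "nan",
# so the fillna at the end has nothing left to replace.
too_late = (df["Name"].map(str) + " " + df["LastName"].map(str)).fillna("")

# Filling first: only real strings are concatenated.
filled = df.fillna("")
full_name = filled["Name"] + " " + filled["LastName"]

print(too_late.tolist())   # ['nan nan', 'arthur shelby', 'michael nan']
print(full_name.tolist())  # [' ', 'arthur shelby', 'michael ']
```

Note that with fillna first, map(str) is no longer needed for columns that only hold strings; it is kept in the answer's code to stay close to the question.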
nacho
  • Can you suggest something for NaNs as well? In the actual problem I'm solving, the 1st row looks like this: 1st column: "", 2nd column: "", result: nan nan – Pv66 Sep 06 '19 at 12:38
  • To summarize: I've created the dataframe from an Excel file. The 1st and 2nd columns of the 1st row are blank, and the concatenation of these 2 columns gives nan nan. That's fine, I just want to replace the nan nan with " " " " in the result column. – Pv66 Sep 06 '19 at 12:47
  • df.fillna("", inplace=True) should replace all the NaNs in your df with ''. You need inplace=True so that it modifies the DataFrame itself. – nacho Sep 06 '19 at 12:48
  • If you don't want to read that first row from Excel (because it is blank), you can use read_excel(skiprows=[1]) (in case row number 1 is the one you want to skip). – nacho Sep 06 '19 at 12:51
  • I've tried this: df = pd.DataFrame(df, columns ={ 'A', 'B', 'C', 'D'}) df['D'] = df['B'].map(str) + ' ' + (df['C'].map(str)) df.fillna(" ", inplace = True), but I'm still getting nan wherever even one of the column entries is blank in the Excel file. For instance, in the first row both 'B' and 'C' are blank, so I get nan nan. In the 2nd row 'B' has 'somevalue' and 'C' is blank, so the result I get is somevalue nan – Pv66 Sep 06 '19 at 12:52
  • No I can't skip any rows, since column 'A' has necessary information. – Pv66 Sep 06 '19 at 12:57
  • I made an EDIT to my answer – nacho Sep 06 '19 at 14:24
  • Thanks nacho, the trick was to replace the NaN's before performing the concatenation operation. Worked like a charm. – Pv66 Sep 09 '19 at 05:44
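For the Excel case discussed in these comments, another option that might be worth trying is Series.str.cat, which accepts a na_rep argument for missing values. This is only a sketch under assumptions: the frame below is a made-up stand-in for the sheet (columns A to D as named in the comments, blanks represented as NaN), and the trailing strip() is a design choice, not something the thread asked for.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel sheet: column A must be kept,
# while B and C contain blank cells that were read as NaN.
df = pd.DataFrame(
    {"A": ["id1", "id2", "id3"],
     "B": [np.nan, "somevalue", "x"],
     "C": [np.nan, np.nan, "y"]}
)

# na_rep="" substitutes an empty string for missing values while joining;
# strip() then drops the separator left over next to a blank.
df["D"] = df["B"].str.cat(df["C"], sep=" ", na_rep="").str.strip()

print(df["D"].tolist())  # ['', 'somevalue', 'x y']
```

No rows are skipped, so column A stays intact. Leave out the strip() call if the separating space should be kept for blank rows.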

Here are two ways of doing that.

1) Use fillna (see the pandas documentation for DataFrame.fillna).

df.fillna('',inplace=True)

2) Alternatively, you can use the replace method (with numpy imported as np, as in the question).

df = df.replace(np.nan, '')
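As a quick sanity check, the sketch below applies both calls to a small made-up frame (the data and the names `via_fillna` and `via_replace` are illustrative) and compares the results. Both need to run before any string concatenation, since neither touches text such as "nan" that has already been produced.

```python
import numpy as np
import pandas as pd

# Illustrative data with a fully blank first row.
df = pd.DataFrame(
    {"Name": [np.nan, "arthur"], "LastName": [np.nan, "shelby"]},
    dtype=object,
)

via_fillna = df.fillna("")            # option 1
via_replace = df.replace(np.nan, "")  # option 2

print(via_fillna.values.tolist())
print(via_replace.values.tolist())
```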

I hope it helps.

Nijat Mursali