Pandas Replace NaN with blank/empty string

Question

I have a Pandas Dataframe as shown below:

    1    2       3
 0  a  NaN    read
 1  b    l  unread
 2  c  NaN    read

I want to replace the NaN values with an empty string so that it looks like this:

    1    2       3
 0  a   ""    read
 1  b    l  unread
 2  c   ""    read

fantabolous · Answer 1 · 2020-06-05T08:05:55.050

444

df = df.fillna('')

or just

df.fillna('', inplace=True)

This fills NA values (e.g. NaN) with ''.

If you want to fill a single column, you can use:

df.column1 = df.column1.fillna('')

One can use df['column1'] instead of df.column1.
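As a rough illustration of the two forms above, here is a minimal sketch that rebuilds the frame from the question (the column labels "1", "2", "3" are taken from it; the variable names `filled` and `one_col` are only for this example) and fills either the whole frame or a single column:

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question; NaN marks the missing cells.
df = pd.DataFrame({
    "1": ["a", "b", "c"],
    "2": [np.nan, "l", np.nan],
    "3": ["read", "unread", "read"],
})

# fillna returns a new DataFrame; the original keeps its NaNs
# unless the result is assigned back.
filled = df.fillna("")

# Fill a single column only, assigning the result back to that column.
one_col = df.copy()
one_col["2"] = one_col["2"].fillna("")

print(filled)
```

Bracket access (`df["2"]`) is used here because these column labels are not valid attribute names.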

edited Jun 05 '20 at 08:05

answered Feb 08 '15 at 05:44

fantabolous


10

@Mithril - `df[['column1','column2']] = df[['column1','column2']].fillna('')` – elPastor Oct 12 '17 at 01:29
1

This is giving me `SettingWithCopyWarning` – jss367 Nov 11 '20 at 22:44
2

@jss367 That's not due to this code, but rather because you've earlier created a partial view of a larger df. Very good answer here https://stackoverflow.com/a/53954986/3427777 – fantabolous Jan 26 '21 at 11:54

score 341 · Accepted Answer · edited Mar 08 '17 at 14:05


import numpy as np
df1 = df.replace(np.nan, '', regex=True)

This might help. It will replace all NaNs with an empty string.
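A small sketch of the same idea on a frame with a numeric column, assuming made-up column names `name` and `score`. It leaves out `regex=True`, which should not be needed when the target is NaN rather than a pattern, and it also shows the `float('nan')` spelling mentioned in the comments below:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one text column and one float column, each with a NaN.
df = pd.DataFrame({
    "name": ["a", np.nan],
    "score": [1.5, np.nan],
})

# replace returns a new DataFrame with every NaN swapped for ''.
out = df.replace(np.nan, "")

# float('nan') is also treated as a missing value, so no numpy import
# is strictly required for this call.
out2 = df.replace(float("nan"), "")

print(out)
```

Note that a numeric column can no longer stay purely numeric once it holds empty strings, so this is best done just before display or export.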

edited Mar 08 '17 at 14:05

Ninjakannon


answered Nov 10 '14 at 06:40

nEO


1

what library does `np.nan` come from? I can't use it – CaffeineConnoisseur Aug 05 '16 at 22:33
10

@CaffeineConnoisseur: `import numpy as np`. – John Zwinck Aug 08 '16 at 21:56
40

@CaffeineConnoisseur - or just `pd.np.nan` if you don't want to `import numpy` as well. – elPastor Oct 12 '17 at 01:27
1

This also allows the Dict to be saved as a string in the row of a .csv and then subsequently read back into a DataFrame using the `pd.DataFrame.from_dict(eval(_string_))` – yeliabsalohcin Aug 07 '18 at 11:02
6

Also useful to mention the `... inplace=True` option. – smci May 24 '19 at 23:02
1

@CaffeineConnoisseur,@elPastor - `pandas 1.0.3` warns of `pandas.np` deprecation in future versions. It was nice having it! – Gathide May 05 '20 at 13:11
You can use `float('nan')` instead of `np.nan`. – Acumenus May 15 '20 at 00:01
You can also use `pd.NA` instead of `pd.np.nan` since 1.0.0 – lucidyan Mar 10 '21 at 15:58

score 119 · Answer 3 · edited Sep 16 '19 at 08:40


If you are reading the dataframe from a file (say CSV or Excel), then use:

pd.read_csv(path, na_filter=False)
pd.read_excel(path, na_filter=False)

This reads empty fields as empty strings '' instead of NaN.

If you already have the dataframe:

df = df.replace(np.nan, '', regex=True)
df = df.fillna('')
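To see what `na_filter=False` changes at read time, here is a minimal sketch that parses an in-memory CSV instead of a real file (the `csv_text` contents and column names are invented for the example):

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk; the "tag" field is empty in two rows.
csv_text = "id,tag,status\na,,read\nb,l,unread\nc,,read\n"

# Default parsing turns the empty fields into NaN.
default = pd.read_csv(io.StringIO(csv_text))

# With na_filter=False, no missing-value detection is done,
# so the empty fields come back as empty strings.
raw = pd.read_csv(io.StringIO(csv_text), na_filter=False)

print(default["tag"].isna().tolist())
print(raw["tag"].tolist())
```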

edited Sep 16 '19 at 08:40

Mel


answered Jul 19 '17 at 15:16

Natesh bhat


na_filter is not available on read_excel() http://pandas.pydata.org/pandas-docs/stable/search.html?q=na_filter&check_keywords=yes&area=default – Marjorie Roswell Jul 31 '17 at 02:39
i have used it in my application . It does exist but for some reason , they haven't given this argument in the docs . It works nice for me though without errors. – Natesh bhat Aug 01 '17 at 06:40
It works, i'm using it in parse `xl.parse('sheet_name', na_filter=False)` – Dmitrii Nov 22 '17 at 17:33

score 10 · Answer 4 · edited May 24 '19 at 23:29


Use a formatter if you only want the output to render nicely when printed. The formatters argument of df.to_string() lets you define custom string formatting per column, without needlessly modifying your DataFrame or wasting memory:

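import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['a', 'b', 'c'],
    'B': [np.nan, 1, np.nan],
    'C': ['read', 'unread', 'read']})
# Render NaN in column B as blank at print time; df itself is unchanged.
print(df.to_string(
    formatters={'B': lambda x: '' if pd.isnull(x) else '{:.0f}'.format(x)}))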

To get:

   A B       C
0  a      read
1  b 1  unread
2  c      read

edited May 24 '19 at 23:29

smci


answered Jun 21 '18 at 22:41

Steve Schulist


4

`print df.fillna('')` by itself (without doing `df = df.fillna('')`) doesn't modify the original either. Is there a speed or other advantage to using `to_string`? – fantabolous Nov 27 '18 at 03:10
Fair enough, `df.fillna('')` it is! – Steve Schulist Nov 28 '18 at 15:35
@shadowtalker: Not necessarily, it would only be the correct answer if the OP wanted to keep the df in one format (e.g. more computationally-efficient, or saving memory on unnecessary/empty/duplicate strings), yet render it visually in a more pleasing way. Without knowing more about the use-case, we can't say for sure. – smci May 24 '19 at 23:05

Vineesh TP · Answer 5 · 2021-04-30T06:21:37.340

3

Try this, adding inplace=True:

import numpy as np
df.replace(np.nan, '', inplace=True)

edited Apr 30 '21 at 06:21

answered Aug 23 '19 at 12:27

Vineesh TP


This is not an empty string, `''` and `' '` are not equivalent, While the first is treated as `False`, the value used above will be treated as `True`. – suvayu Apr 28 '21 at 09:26

score 2 · Answer 6 · answered Jun 28 '19 at 09:29


Using keep_default_na=False should help you:

df = pd.read_csv(filename, keep_default_na=False)
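A short sketch of how this behaves, again reading from an in-memory string rather than a real `filename` (the data and the choice of "NA" as a custom marker are assumptions for illustration). It also shows that `keep_default_na=False` can be combined with `na_values` if some markers should still count as missing:

```python
import io

import pandas as pd

# Stand-in for the file: one empty field and one literal "NA".
csv_text = "id,tag\na,\nb,NA\nc,l\n"

# No default NaN markers: both the empty field and "NA" stay as text.
kept = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

# Only the markers listed in na_values are treated as missing,
# so "NA" becomes NaN while the empty field stays ''.
custom = pd.read_csv(
    io.StringIO(csv_text), keep_default_na=False, na_values=["NA"]
)

print(kept["tag"].tolist())
print(custom["tag"].isna().tolist())
```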

answered Jun 28 '19 at 09:29

Bendy Latortue


score 0 · Answer 7 · edited May 17 '19 at 11:11


If you are converting a DataFrame to JSON, NaN will cause an error, so the best solution in this use case is to replace NaN with None.
Here is how:

df1 = df.where((pd.notnull(df)), None)
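One caveat worth sketching: on a float column, recent pandas versions tend to turn the `None` straight back into NaN, so the variant below casts to `object` first. That cast is an addition of this example, not part of the answer above, and the column names are made up:

```python
import json

import numpy as np
import pandas as pd

# Hypothetical frame with a missing value in a float column.
df = pd.DataFrame({"a": ["x", "y"], "b": [1.0, np.nan]})

# Cast to object so the columns can actually hold None,
# then keep the original value wherever it is not missing.
clean = df.astype(object).where(df.notna(), None)

# Plain Python records; None serializes to JSON null.
records = clean.to_dict(orient="records")
payload = json.dumps(records)

print(payload)
```

If the goal is only a JSON string, `df.to_json()` writes missing values as null on its own, so this conversion mainly matters when handing records to the standard `json` module or another serializer.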

edited May 17 '19 at 11:11

taras


answered Mar 15 '18 at 20:48

Dinesh Khetarpal


score 0 · Answer 8 · answered Jul 04 '19 at 04:07

I tried this on a single column of string values containing NaN.

To replace the NaN values with an empty string:

df.columnname.replace(np.nan,'',regex = True)

To replace the NaN values with some other value:

df.columnname.replace(np.nan,'value',regex = True)

Both calls return a new Series, so assign the result back to the column if you want to keep it.

I also tried df.iloc, but it needs the positional index of the column, so you have to look it up in the table first. The method above saves that step.
