I know there are a lot of posts about this, and I have read through many of them, but I can't get anything to work.

                            O         H         L         C   MLR_CQG
DateTime                                                             
2011-09-19 22:00:00   -0.3606   -0.3605   -0.3611   -0.3611 -0.361171
2011-09-19 23:00:00   -0.3611   -0.3611   -0.3614   -0.3614 -0.361202
2011-09-20 00:00:00   -0.3614   -0.3602   -0.3614   -0.3613 -0.361344
2011-09-20 01:00:00                                         -0.361571
2011-09-20 02:00:00   -0.3613   -0.3613   -0.3614   -0.3613 -0.361752
2011-09-20 03:00:00   -0.3613   -0.3601   -0.3613   -0.3607 -0.361806
2011-09-20 04:00:00   -0.3611   -0.3607   -0.3614   -0.3608 -0.361845
2011-09-20 05:00:00   -0.3607   -0.3596   -0.3614   -0.3596 -0.361806
2011-09-20 06:00:00   -0.3603   -0.3594   -0.3604   -0.3601 -0.361796
2011-09-20 07:00:00   -0.3601   -0.3595   -0.3601   -0.3597 -0.361725
2011-09-20 08:00:00   -0.3599   -0.3599   -0.3613   -0.3607 -0.361753
2011-09-20 09:00:00   -0.3609   -0.3603   -0.3641    -0.363 -0.362013
2011-09-20 10:00:00   -0.3631   -0.3617   -0.3636   -0.3634 -0.362320
2011-09-20 11:00:00   -0.3635    -0.363   -0.3647   -0.3643 -0.362667
2011-09-20 12:00:00   -0.3642   -0.3639   -0.3651   -0.3647 -0.362993
2011-09-20 13:00:00   -0.3644   -0.3644   -0.3654    -0.365 -0.363288
2011-09-20 14:00:00   -0.3651   -0.3636   -0.3652   -0.3645 -0.363558
2011-09-20 14:30:00   -0.3646   -0.3636   -0.3649   -0.3641 -0.363725
2011-09-20 15:30:00   -0.3637   -0.3627   -0.3644   -0.3643 -0.363879
2011-09-20 16:30:00   -0.3637   -0.3629   -0.3638   -0.3629 -0.363929
2011-09-20 18:00:00                                         -0.363892
2011-09-20 19:00:00   -0.3636   -0.3627   -0.3637   -0.3627 -0.363812

I'd like to replace each empty cell with the value from the row above it. I've tried the line of code below, but it doesn't seem to replace the empty cells. This is also part of a process that will run thousands of times, so if possible I'd like to skip the operation unless the frame actually contains empty cells.

rbs_test.replace(r'^\s*$', np.nan, regex=True).fillna(method='ffill')
novawaly
  • the `replace` method returns a new DataFrame, it's not modifying the `rbs_test` frame in-place. Can you try: `rbs_test = rbs_test.replace(r'^\s*$', np.nan, regex=True)`? – David Zemens Dec 04 '18 at 16:31
  • Alternatively, `rbs_test.replace(r'^\s*$', np.nan, regex=True, inplace=True)` – wbadart Dec 04 '18 at 16:34
  • @DavidZemens Boom! That did the trick. I didn't know that it wasn't replacing my df. Two follow-up questions, if it's not too much trouble: is there a way to check whether there are any empty cells before doing this operation, and what does each part of r'^\s*$' mean? – novawaly Dec 04 '18 at 16:38
  • IDK regex well enough to answer the second question, but maybe the official regex site will inform you on what the special characters do. If you know these are always going to be zero-length strings, though, I think there's probably no need for regex, you can just `.replace('', np.nan)` (though if your frame also contains mixed data/strings, that might not work). – David Zemens Dec 04 '18 at 16:45
  • For your other query, have a look at the [any](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.any.html) or [all](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.all.html) methods, maybe? – David Zemens Dec 04 '18 at 16:48
  • @DavidZemens for some reason, when I try to just use replace('', np.nan), I get this error: – novawaly Dec 04 '18 at 16:54
  • TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int' – novawaly Dec 04 '18 at 16:54
  • keep using the regex then, if it ain't broken, don't fix it :) – David Zemens Dec 04 '18 at 16:58
  • Fair enough. (I'm new, so I'm trying to soak up as much info as I can!) Thank you for the help – novawaly Dec 04 '18 at 17:01
  • Based on limited regex knowledge and a quick Google search, the `\s` means *any whitespace character* (tabs, spaces, line feeds, etc.). The caret means *must start with*, the `$` means *must end with*, and the asterisk specifies that *any number of matching characters* (including none) is your pattern. So, it's going to replace any value that is empty or consists entirely of whitespace. – David Zemens Dec 04 '18 at 17:08
  • Super helpful. Thank you once again. Blessed be the fruit! – novawaly Dec 04 '18 at 17:13
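
Pulling the comments above together, here is a minimal sketch rather than a definitive answer. It assumes the blanks are empty or whitespace-only strings sitting in otherwise numeric columns, as in the frame shown in the question. The names `BLANK_PATTERN` and `fill_blanks_forward` are hypothetical, the small `rbs_test` frame is made-up sample data modelled on the question, and `.ffill()` is used in place of `fillna(method='ffill')`, which newer pandas versions deprecate.

```python
import numpy as np
import pandas as pd

# ^ anchors the start, \s* allows any run of whitespace (including none),
# $ anchors the end: the pattern matches empty or whitespace-only strings.
BLANK_PATTERN = r'^\s*$'


def fill_blanks_forward(df: pd.DataFrame) -> pd.DataFrame:
    """Replace blank string cells with the value from the row above.

    Hypothetical helper: returns the input frame untouched when no
    blank cells are found, so the replace/ffill work is skipped.
    """
    # A purely numeric column cannot hold '' or ' ', so only the
    # non-numeric columns need to be scanned.
    candidates = [
        col for col in df.columns
        if not pd.api.types.is_numeric_dtype(df[col])
    ]
    has_blank = any(
        df[col].astype(str).str.match(BLANK_PATTERN).any()
        for col in candidates
    )
    if not has_blank:
        return df

    # replace() returns a NEW frame, so the result has to be kept.
    cleaned = df.replace(BLANK_PATTERN, np.nan, regex=True)
    # Let pandas turn the now string-free columns back into numbers,
    # then copy each missing value down from the row above.
    return cleaned.infer_objects().ffill()


# Made-up sample data with the same shape of problem as the question.
rbs_test = pd.DataFrame(
    {
        'O': [-0.3614, '', -0.3613],
        'C': [-0.3613, ' ', -0.3613],
        'MLR_CQG': [-0.361344, -0.361571, -0.361752],
    },
    index=pd.to_datetime(
        ['2011-09-20 00:00', '2011-09-20 01:00', '2011-09-20 02:00']
    ),
)

# Assign the result back; the original frame is not modified in place.
filled = fill_blanks_forward(rbs_test)
print(filled)
```

The scan adds its own cost, so whether the early return saves time over thousands of runs is worth measuring on real data. A blank in the very first row has nothing above it and stays missing after the forward fill.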

0 Answers