3

I'm programming with Python 2.7.6 using numpy. I have this division between two numpy matrixes V/np.dot(W,H). Sometimes happens that the denominator has some cell values equal to 0, so i get a Runtime error. I would like to implement a safe division in a efficient way. How can i write a code that performs the Matrix division and for the elements where the denominator is equal to 0 puts 0 in the output Matrix?

TooTone
  • 6,741
  • 3
  • 29
  • 57
user3528716
  • 129
  • 2
  • 9
  • 1
    Calculate the denominator first, do whatever check you need to do and handle the different cases using an if? – Bas Swinckels Apr 13 '14 at 10:47
  • For example: suppose `V=[[1 2 3][4 5 6]]` and `np.dot(W,H)=[[1 0 3][0 5 6]]`. I would like to have as result `[[1 0 1][0 1 1]]`. I don't get how to do with seterr. – user3528716 Apr 13 '14 at 14:18

3 Answers3

2

Numpy actually allows you to set what you'd like to do in the case of a divide by zero error - see seterr. I believe this is a global flag, though - I'm not aware of a more localized solution - if it's an issue I suppose you can just set seterr before and after your safe division.

Daryl
  • 397
  • 1
  • 12
  • `seterr` set to `ignore` will set the division by zero elements to inf. It would be nice if you could add a snippet that would change these value to zero as per OP's question. – liori Apr 13 '14 at 14:08
  • 1
    @Daryl seterr is global, but `errstate` is a context manager that lets you set the same things for a patch of code. – Adrian Ratnapala Apr 13 '14 at 17:46
1

Simply search for elements in the denominator that are zero and replace them with infinity.

D = np.dot(W,H)
D[D==0] = float('inf')
result = V / D

This approach is slower than a plain result = V / D without checking for zeros using D[D==0] = float('inf'), but it gets better with increasing matrix size. With a 30x30 matrix it takes three times as long, and with a 250x250 matrix it takes twice as long, and as n increases further it approaches 1.8 times as long. And it seems to be about 10% faster than changing the way that floating point exceptions are handled as per Daryl's answer and Adrian's answer.

One thing to bear in mind is that with floating point numbers and lack of precision you may have elements in the denominator that should be zero but aren't quite, and it's easy to incorporate that as follows

epsilon = 1e-8
D[np.abs(D)<epsilon] = float('inf')
Community
  • 1
  • 1
TooTone
  • 6,741
  • 3
  • 29
  • 57
  • Is `np.place(D, D==0, float('inf'))` faster than `D[D==0] = float('inf')`? – Adrian Ratnapala Apr 13 '14 at 17:44
  • @Adrian Ratnapala Good point! There's not a lot in it but if anything the `D[D==0]` way is a little faster. It's certainly simpler and I will change my answer: thankyou. – TooTone Apr 13 '14 at 17:51
  • I am just wondering why `.place` exists. I suppose it might be the "real" method, and the operator overload is just syntactic sugar which calls `.place` whenever the index is a boolean array. – Adrian Ratnapala Apr 13 '14 at 18:02
  • @AdrianRatnapala it's allows you to assign a sequence of values across to a non-contiguous set of elements in the array (cycling through the values if necessary). It's more general than setting a single value-- I shouldn't have used it for this. – TooTone Apr 13 '14 at 18:07
  • I thought you could do that with binary-array subscripts too. At least you can with MATLAB, I might be making unwarranted assumptions about numpy. – Adrian Ratnapala Apr 14 '14 at 07:35
1

Though you say "matrix", I assume you really want arrays since you want element-wise division. I would just do the division inside a context manager that suppresses the div0 errors. Then I would fix up the result.

# Assume V and D are arrays of the same shape
with np.errstate(divide='ignore'): 
    # division errors suppressed only within this block
    quot = V / D
    quot[D == 0] = 0

My gut tells me this is fast because it mostly keeps data in its original shape. But I have never compared it with alternative approaches.

Adrian Ratnapala
  • 5,069
  • 1
  • 25
  • 38
  • this is a nice approach because it solves the global `seterr` problem. Nevertheless in my tests this approach (and also using `seterr` globally) are a little over 10% slower than the approach in my answer (including replacing `np.place` with `D[D==0]=` as you suggested). My guess is that the floating point exceptions still have to be caught and discarded within the hardware. – TooTone Apr 13 '14 at 18:11
  • @TooTone I suspect you are correct about why Setting D is faster. The more important difference is that my solution keeps D unchanged, while yours corrects it to be consistent with the final result. Either approach might be better depending on the application. – Adrian Ratnapala Apr 14 '14 at 07:34
  • that's a good point. In this case it's ok to change `D` because it's a temporary calculated by `dot`. But in general it won't be. – TooTone Apr 14 '14 at 11:32