
I'm currently iterating through a very large data set, ~85 GB (~600M lines), and simply using Newton-Raphson to compute a new parameter. As it stands my code is extremely slow; any tips on how to speed it up? The methods from BSCallClass & BSPutClass are closed-form, so there's nothing really to speed up there. Thanks.

from numpy import nan

class NewtonRaphson:

    def __init__(self, theObject):
        self.theObject = theObject

    def solve(self, Target, Start, Tolerance, maxiter=500):
        # Newton step: x_{n+1} = x_n + (Target - Price(x_n)) / Vega(x_n)
        y = self.theObject.Price(Start)
        x = Start
        i = 0
        while abs(y - Target) > Tolerance:
            i += 1
            d = self.theObject.Vega(x)
            x += (Target - y) / d
            y = self.theObject.Price(x)
            if i > maxiter:
                x = nan
                break
        return x


def main():
    # 'a' is the pandas DataFrame holding the rows to process.
    for row in a.iterrows():
        print row[1]["X.1"]
        T = (row[1]["X.7"] - row[1]["X.8"]).days
        Spot = row[1]["X.2"]
        Strike = row[1]["X.9"]
        MktPrice = abs(row[1]["X.10"] - row[1]["X.11"]) / 2
        CPflag = row[1]["X.6"]

        if CPflag == 'call':
            option = BSCallClass(0, 0, T, Spot, Strike)
        elif CPflag == 'put':
            option = BSPutClass(0, 0, T, Spot, Strike)

        a["X.15"][row[0]] = NewtonRaphson(option).solve(MktPrice, .05, .0001)

EDIT:

For those curious, I ended up speeding this entire process up significantly by following the scipy suggestion below, as well as by using the multiprocessing module.
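The multiprocessing side of that can be as simple as splitting the DataFrame into chunks and mapping them over a worker pool. A rough sketch of the idea, not the actual code used; `solve_row` is a hypothetical per-row helper that does what the body of `main()` above does for one row, and the chunk/worker counts are arbitrary:

import numpy as np
from multiprocessing import Pool

def solve_chunk(chunk):
    # Run the per-row solve (e.g. a scipy-based root finder) over one
    # DataFrame chunk and return the results as a plain list.
    return [solve_row(row) for _, row in chunk.iterrows()]

if __name__ == '__main__':
    chunks = np.array_split(a, 8)            # 'a' is the DataFrame from the question
    pool = Pool(processes=8)                 # 8 workers is an arbitrary choice
    results = pool.map(solve_chunk, chunks)  # one chunk per task
    a["X.15"] = np.concatenate(results)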

ast4
1 Answer


Don't code your own Newton-Raphson method in Python. You'll get better performance using one of the root finders in scipy.optimize such as brentq or newton. (Presumably, if you have pandas, you'd also install scipy.)
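For example, the hand-rolled `solve` above could be swapped for a call to `brentq` on the shifted function `Price(x) - Target`. A minimal sketch, assuming the same `option` objects as in the question; the search bracket `[lo, hi]` is an illustrative assumption, not taken from the question:

import scipy.optimize as optimize

def solve_scipy(option, target, lo=1e-6, hi=5.0, tol=1e-4):
    # Find x such that option.Price(x) == target by locating the root
    # of Price(x) - target inside the bracket [lo, hi].
    return optimize.brentq(lambda x: option.Price(x) - target, lo, hi, xtol=tol)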


Back of the envelope calculation:

Making 600M calls to brentq should be manageable on standard hardware:

import scipy.optimize as optimize
def f(x):
    return x**2 - 2

In [28]: %timeit optimize.brentq(f, 0, 10)
100000 loops, best of 3: 4.86 us per loop

So if each call to optimize.brentq takes about 4.86 microseconds, 600 million calls will take roughly 4.86 µs * 600M ≈ 2900 seconds, i.e. a bit under an hour.


newton may be slower, but still manageable:

def f(x):
    return x**2 - 2
def fprime(x):
    return 2*x

In [40]: %timeit optimize.newton(f, 10, fprime)
100000 loops, best of 3: 8.22 us per loop
unutbu
  • So AFAIK, you're unable to set the target w/ scipy's newton function. The results were less than satisfactory using brentq (I whipped something up last night using it). – ast4 Jan 18 '13 at 22:35
  • The `Target` is just a constant, so instead of trying to find the root for `f(x) = 0`, you'd define `g(x) = f(x) - Target` and apply `newton` to `g`. – unutbu Jan 18 '13 at 22:40
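In code, that shift is a one-liner; a sketch reusing the `option`, `MktPrice`, and `Vega` names from the question (the starting guess of 0.05 matches the `Start` value used there):

import scipy.optimize as optimize

# g(x) = Price(x) - MktPrice has its root where Price(x) == MktPrice,
# and since the shift is constant, Vega is still g's derivative.
x = optimize.newton(lambda x: option.Price(x) - MktPrice, 0.05,
                    fprime=option.Vega, tol=1e-4, maxiter=500)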