
I'm currently iterating through a very large data set, ~85 GB (~600M lines), and simply using Newton-Raphson to compute a new parameter. As it stands my code is extremely slow; any tips on how to speed it up? The methods from BSCallClass & BSPutClass are closed-form, so there's nothing really to speed up there. Thanks.

from numpy import nan

class NewtonRaphson:

    def __init__(self, theObject):
        self.theObject = theObject

    def solve(self, Target, Start, Tolerance, maxiter=500):
        # Newton step: x_{n+1} = x_n + (Target - Price(x_n)) / Vega(x_n)
        y = self.theObject.Price(Start)
        x = Start
        i = 0
        while abs(y - Target) > Tolerance:
            i += 1
            d = self.theObject.Vega(x)
            x += (Target - y) / d
            y = self.theObject.Price(x)
            if i > maxiter:
                x = nan
                break
        return x


def main():
    # 'a' is the pandas DataFrame holding the rows to process.
    for row in a.iterrows():
        print row[1]["X.1"]
        T = (row[1]["X.7"] - row[1]["X.8"]).days
        Spot = row[1]["X.2"]
        Strike = row[1]["X.9"]
        MktPrice = abs(row[1]["X.10"] - row[1]["X.11"]) / 2
        CPflag = row[1]["X.6"]

        if CPflag == 'call':
            option = BSCallClass(0, 0, T, Spot, Strike)
        elif CPflag == 'put':
            option = BSPutClass(0, 0, T, Spot, Strike)

        a["X.15"][row[0]] = NewtonRaphson(option).solve(MktPrice, .05, .0001)

EDIT:

For those curious, I ended up speeding this entire process up significantly by following the scipy suggestion below, as well as by using the multiprocessing module.
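The multiprocessing side of that can be as simple as splitting the DataFrame into chunks and mapping them over a worker pool. A rough sketch of the idea, not the actual code used; `solve_row` is a hypothetical per-row helper that does what the body of `main()` above does for one row, and the chunk/worker counts are arbitrary:

import numpy as np
from multiprocessing import Pool

def solve_chunk(chunk):
    # Run the per-row solve (e.g. a scipy-based root finder) over one
    # DataFrame chunk and return the results as a plain list.
    return [solve_row(row) for _, row in chunk.iterrows()]

if __name__ == '__main__':
    chunks = np.array_split(a, 8)            # 'a' is the DataFrame from the question
    pool = Pool(processes=8)                 # 8 workers is an arbitrary choice
    results = pool.map(solve_chunk, chunks)  # one chunk per task
    a["X.15"] = np.concatenate(results)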

ast4
1 Answer


Don't code your own Newton-Raphson method in Python. You'll get better performance using one of the root finders in scipy.optimize such as brentq or newton. (Presumably, if you have pandas, you'd also install scipy.)
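For example, the hand-rolled `solve` above could be swapped for a call to `brentq` on the shifted function `Price(x) - Target`. A minimal sketch, assuming the same `option` objects as in the question; the search bracket `[lo, hi]` is an illustrative assumption, not taken from the question:

import scipy.optimize as optimize

def solve_scipy(option, target, lo=1e-6, hi=5.0, tol=1e-4):
    # Find x such that option.Price(x) == target by locating the root
    # of Price(x) - target inside the bracket [lo, hi].
    return optimize.brentq(lambda x: option.Price(x) - target, lo, hi, xtol=tol)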


Back of the envelope calculation:

Making 600M calls to brentq should be manageable on standard hardware:

import scipy.optimize as optimize
def f(x):
    return x**2 - 2

In [28]: %timeit optimize.brentq(f, 0, 10)
100000 loops, best of 3: 4.86 us per loop

So if each call to optimize.brentq takes about 4.86 microseconds, 600 million calls will take roughly 4.86 µs * 600M ≈ 2900 seconds, i.e. a bit under an hour.


newton may be slower, but still manageable:

def f(x):
    return x**2 - 2
def fprime(x):
    return 2*x

In [40]: %timeit optimize.newton(f, 10, fprime)
100000 loops, best of 3: 8.22 us per loop
unutbu
  • So AFAIK, you're unable to set the target w/ scipy's newton function. The results were less than satisfactory using brentq (I whipped something up last night using it). – ast4 Jan 18 '13 at 22:35
  • The `Target` is just a constant, so instead of trying to find the root for `f(x) = 0`, you'd define `g(x) = f(x) - Target` and apply `newton` to `g`. – unutbu Jan 18 '13 at 22:40
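In code, that shift is a one-liner; a sketch reusing the `option`, `MktPrice`, and `Vega` names from the question (the starting guess of 0.05 matches the `Start` value used there):

import scipy.optimize as optimize

# g(x) = Price(x) - MktPrice has its root where Price(x) == MktPrice,
# and since the shift is constant, Vega is still g's derivative.
x = optimize.newton(lambda x: option.Price(x) - MktPrice, 0.05,
                    fprime=option.Vega, tol=1e-4, maxiter=500)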