5

I want to make my code run faster for more iterations and runs. Right now my code is too slow, but I don't know what to change to speed it up. I began by coding a kinetic Monte Carlo simulation, then editing it to become a Brownian motion simulation. My current code can't handle 10,000 runs with 10,000 iteration each, which is needed.

import numpy as np
import matplotlib.pyplot as plt
import time
%matplotlib inline

runs = int(input("Enter number of runs: "))
N = int(input("Enter number of iterations per simulation: "))

y = 0
R = 10*1  # R is the rate value
t0 = time.time()
for y in range(runs):  # Run the simulation 'runs' times
    T = np.array([0])
    dt = 0
    x = 0.5  # sets values 
    X = np.array([x])
    t = 0
    i = 0

    while t < N:  # N is the number of iterations per run
        i = i + 1  # i is number of iterations so far
        z = np.random.uniform(-1, 1, 1)  # sets z to be a random number between -1 to 1 size 1

        if z > (1/3):  # if conditions for z for alpha and gamma, beta 
            x = x + 1  # z[]=alpha state then + 1
        elif z < (-1/3):
            x = x-1  # z[]=gamma state then - 1
        elif z < (1/3) and z > (-1/3):
            x = x  # z=beta state then + 0

        X = np.append(X, x)  # adds new X value to original X array
        X[i] += X[i-1] * 0.01 * np.random.normal(0, 1, 1) * np.sqrt(dt)  # for Brownian motion with sigma as 0.01
        Z = np.random.uniform(0, 1)  # sets Z to be a random number between 0 and 1
        dt = 1/R * np.log(1/Z)  # formula for dt; R is the rate value
        t = t + dt  # ITERATED TIME
        T = np.append(T, t)
        plt.plot(T, X, lw='0.5', alpha=0.5)

t1 = time.time()
print("final %.10f seconds " % (t1-t0))
dedObed
  • 1,117
  • 9
  • 17
  • 2
    Please edit your code (indentations mostly) to make sure we interpret it correctly – Tacratis Mar 15 '19 at 21:45
  • 1
    Note: it’d help if you switched to descriptive variable names. Anyway, does `plt.plot` need to be in the inner loop? Seems like it could go after. Also, `T` doesn’t look like it needs to be an `np.array`, so keep `T = []` and use `T.append(t)`. Another thing is it looks like NumPy is being used on single elements for the most part which is not what it’s fast at; you may as well use the standard library. – Ry- Mar 15 '19 at 21:45
  • 4
    Strive to vectorize your code. Instead of e.g. 10,000 runs of single-valued simulations (your `z` is a single number), have a vector of 10,000 simulations. – John Coleman Mar 15 '19 at 21:46
  • @Ry it is in the inner loop to plot multiple graphs on to the same plot, as I will need to show that mean of the plots as we increase the runs the mean should approach 0. – pythonnewbie22 Mar 15 '19 at 21:51
  • I will try to edit it to be more descriptive. – pythonnewbie22 Mar 15 '19 at 21:51
  • 1
    @pythonnewbie22: It looks like you’re plotting the same plot over itself with one new item each time though. Seems like it should be in the outer loop, after the inner loop. – Ry- Mar 15 '19 at 21:52
  • @Ry I will make the changes and see how it performs. – pythonnewbie22 Mar 15 '19 at 21:56
  • @Ry I tried to make the changes you suggested it gave me an error. – pythonnewbie22 Mar 15 '19 at 22:33
  • Welcome to Stackoverflow! I've made vast changes to the code layout, removing non-measured parts and commented code, putting some extra whitespace and so on. However I've tried my best to keep the original spirit, please check that I've not screwed the semantics. While editing, I've noticed that your inner loop is driven... weird: You loop while `t` is less than the fixed `N`. However, the only update to `t` I see is `t += 1/R * np.log(1/np.random.uniform(0, 1))`, which consistently decreases it. Does the loop ever finish for you? – dedObed Mar 15 '19 at 22:34
  • @pythonnewbie22: Which error? What did the code look like after the change? – Ry- Mar 15 '19 at 22:39
  • @dedObed: R’s positive, `log(1 / x)` for `x < 1`’s positive, isn’t it? – Ry- Mar 15 '19 at 22:40
  • @Ry: My bad, I've somehow missed the division in the log. – dedObed Mar 15 '19 at 22:42
  • @dedObed I choose R to a big value like 10^4 it becomes a extremely slow, if I use a value like 10 it quicker and it would finished. For the t part initially used that t=dt+t, do you recommend to change it back to that? – pythonnewbie22 Mar 15 '19 at 22:42
  • @Ry- Error quote "TypeError: append() missing 1 required positional argument: 'values'" I do not understand what it means for the context of my code. – pythonnewbie22 Mar 15 '19 at 22:44
  • @pythonnewbie22: Don't worry, I didn't change in the code, only condensed it here in the comment. Other comments: (1) you set `R = 10` and more importantly (2) you run the inner loop for nondeterministic number steps. If you want to make those given `N` steps, you should loop as `while i < N:`. Which would be more pythonic to write as `for i in range(N):`, dropping the `i=0` and `i=i+1` lines altogether. – dedObed Mar 15 '19 at 22:45
  • @pythonnewbie22: `T.append(t)`, not `T.append()`. (My best guess, because you didn’t show the changes.) – Ry- Mar 15 '19 at 22:46
  • @Ry- I did those changes and deleted the T.array bit it still gives me that. – pythonnewbie22 Mar 15 '19 at 22:58
  • show code and error – Ry- Mar 15 '19 at 23:04
  • @dedObed I did the change of while i – pythonnewbie22 Mar 15 '19 at 23:04
  • @pythonnewbie22. Great! (1) Just note that it was not an optimization, rather a change of behaviour: You should see at the end that now `X` contains `N+1` elements, while before, the number was random. (2) The change to `for i in range(N):` on the other hand won't change anything, just make your code a better expression of your intent :-) – dedObed Mar 15 '19 at 23:07

1 Answers1

1

Here is an excellent example of a fast-running Brownian Motion Monte Carlo Simulation which is less computationally-expensive.

I've done the same thing as you in the past and made each step of each iteration take place within nested loops. Perhaps its the cost of context switching, running through different libraries, or simply running out of memory, but running each step of the code in each iteration definitely brought about slower performance and higher memory use.

In the above example the author creates arrays first and then iterates through them accordingly with a single for loop. All the random numbers are generated and placed into an array at the same time. Then all the Brownian Motion returns are calculated at the same time. etc. (Think of an assembly line - utilizing resources very well on each step and attaining economies of scale.) Critically also, take note that the plt function is run only one time (not within the loop) and only after all the iterations are complete.

This method should allow for a much higher number of iterations on much smaller hardware.

MonteCarloSims
  • 1,568
  • 1
  • 5
  • 17