4

This is not homework. I am interested in setting up a simulation of a coin toss in R. I would like to run the simulation for a week. Is there a function in R that will allow me to start and stop the simulation over a time period such as a week? If all goes well, I may want to increase the length of the simulation period.

For example:

x <- rbinom(10, 1, 1/2)

So to clarify, instead of 10 in the code above, how do I keep the simulation going for a week (number of trials in a week versus set number of trials)? Thanks.

Frank Zafka
  • 5
    Why on earth would you want to do this? Surely the outcome depends on the machine speed of your computer? – Andrie Jun 22 '12 at 14:12
  • Does it really matter why? I would like to know if it's possible. I understand it's a mundane request. This will help me part-way to doing what I am hoping to do. – Frank Zafka Jun 22 '12 at 14:13
  • 1
    It does, if you want me to help you. – Andrie Jun 22 '12 at 14:14
  • Would you have to take into account the coin? Some UK coins are weighted ever so slightly heavier on the 'heads' side – atmd Jun 22 '12 at 14:14
  • I want to do this, because I am interested in whether it is possible. And I have a spare Raspberry Pi sitting next to me, waiting to try it out. – Frank Zafka Jun 22 '12 at 14:16
  • 3
    Put it in a while loop that checks the system time. – joran Jun 22 '12 at 14:16
  • atmd > no, the type of coin is not of interest. – Frank Zafka Jun 22 '12 at 14:17
  • 1
    Running this type of simulation is a standard procedure in paranormal research, where you try to force a random generator to go wild by thinking hard. Last time I tried, an implementation on Windows could be manipulated, but Linux is stable (Dirk will love this :-) – Dieter Menne Jun 22 '12 at 15:51
  • @DieterMenne Well you've read my mind anyway! wink. – Frank Zafka Jun 22 '12 at 16:04
  • 3
    @RSoul: an issue nobody has really mentioned to you, I think, is that running a week's worth of "coin flips" is a *LOT* of coin flips. Since these coin flips are produced by a pseudo-random number generator, depending on which generator is used and how it is used, it is possible that during the week you loop through the entire period of the generator. In other words, it is possible that your simulation uses up all the pseudo-randomness of your computer and starts reusing it all over again. – Jérémie Jul 15 '12 at 00:16
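
To put the generator-period concern in numbers, here is a rough sketch. It assumes R's default generator (Mersenne-Twister, whose period is 2^19937 - 1) and a purely hypothetical rate of one billion flips per second; the rate and the variable names are illustrative assumptions, not figures from the thread.

```r
# Which generator is active? R's default is "Mersenne-Twister".
rng_kind <- RNGkind()[1]

# Hypothetical flip rate (an assumption, far above the rates reported in this thread).
flips_per_second <- 1e9
seconds_per_week <- 7 * 24 * 60 * 60   # 604800

# Work in log2 so the huge period never has to be stored as a number.
log2_flips_per_week <- log2(flips_per_second * seconds_per_week)
log2_mt_period      <- 19937           # the period is 2^19937 - 1

cat("generator:", rng_kind, "\n")
cat("log2(flips in a week):", round(log2_flips_per_week, 1), "\n")
cat("log2(period):", log2_mt_period, "\n")
```

Under these assumptions a week of flips uses roughly 2^49 draws, so with Mersenne-Twister the period is unlikely to be the limiting factor. A generator with a much shorter period would be a different matter.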

3 Answers

11

Here is code that will continue to run for three seconds, then stop and print the totals.

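start <- Sys.time()
duration <- 3 # number of seconds
heads <- 0
tails <- 0

# Keep tossing until `duration` seconds have elapsed
while(Sys.time() <= start + duration){
  s <- sample(0:1, 1)   # one toss: 1 = heads, 0 = tails
  if(s == 1) heads <- heads+1 else tails <- tails+1
  cat(s)                # print the toss that was just counted
}
cat("\nheads: ", heads, "\n")
cat("tails: ", tails, "\n")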

The results:

001100111000011010000010110111111001011110100110001101101010 ...
heads:  12713
tails:  12836

Note of warning:

At the speed of my machine, I bet that you will hit a numeric limit long before the end of the week. In other words, you may reach the maximum value your machine can store as an integer, double, float or whatever type you are using, and from that point on your results can no longer be trusted.

So you may have to build in some error checking or rollover mechanism to protect you from this.


For an accelerated illustration of what will happen, try the following:

x <- 1e300
while(is.finite(x)){
  x <- x+x
  cat(x, "\n")
}

R deals with the floating point overflow gracefully, and returns Inf.

So, whatever data you had in the simulation is now lost. It's not possible to analyse infinity to any sensible degree.

Keep this in mind when you design your simulation.
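
For the counters themselves, the practical limits in R can be checked directly. The sketch below is illustrative rather than part of the original answer: it shows that integer arithmetic overflows to NA (with a warning) instead of crashing, and that a double counter, which is what `heads <- 0` creates, counts exactly up to 2^53. The variable names are made up for the example.

```r
# Largest value an R integer can hold (2^31 - 1).
int_max <- .Machine$integer.max

# Integer overflow yields NA with a warning, not a crash.
overflowed <- suppressWarnings(int_max + 1L)

# A double counter keeps exact whole numbers up to 2^53.
big <- 2^53
still_exact    <- (big - 1) + 1 == big   # TRUE: increments below 2^53 are exact
increment_lost <- big + 1 == big         # TRUE: adding 1 at 2^53 is silently lost

cat("integer max:", int_max, "\n")
cat("int_max + 1L:", overflowed, "\n")
cat("exact below 2^53:", still_exact, "\n")
cat("increment lost at 2^53:", increment_lost, "\n")
```

At the roughly 8,500 iterations per second implied by the three-second run above, a week gives on the order of 5e9 tosses: beyond the integer limit, but well within the exact range of a double.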

Andrie
  • Thanks @Andrie. That does seem to be what I am after. I realise it was an odd request, but I like learning these aspects of R. Just curious. Shall I ask another question about running out of memory etc, or do you have any ideas? If I run this for a week, is it going to be analysable? – Frank Zafka Jun 22 '12 at 14:28
  • 3
    This is not about memory management. It's about how computers store values. Read up about integers and floating point values in any programmer's manual, e.g. at http://en.wikipedia.org/wiki/Floating_point. In my code, I store only two values. This will never run out of memory. But you may still get a floating point error (or whatever R's equivalent of that is.) – Andrie Jun 22 '12 at 14:33
  • The machine doing the simulating is on a network with access to terabyte storage. I assume it is just a case of shifting data around the network, so as not to overwhelm the system. My actual project is only tangentially related, but all ideas go into the cooking pot. – Frank Zafka Jun 22 '12 at 14:36
  • 1
    The problem is that the vector of values gets too long; I think the maximum length is 2^31-1. What you could do is save and reset the vector once in a while. – Paul Hiemstra Jun 22 '12 at 14:44
  • Okay. I did understand that this was going to require some planning. I guess I will go away and plan how to keep the process going for the week. – Frank Zafka Jun 22 '12 at 14:45
  • 1
    An option could be to not store the results, but only the summary statistics, i.e. the number of heads or tails. – Paul Hiemstra Jun 22 '12 at 14:47
  • 1
    @RSoul If you dump your result to a file from time to time, you're in the clear. – Roman Luštrik Jun 22 '12 at 14:49
  • No. I definitely want all the data. TBH, I was actually doing dice roll data (1/6), so I want face data for each trial. I will probably just try and write each trial to a text file, no? – Frank Zafka Jun 22 '12 at 14:50
  • 1
    @RSoul - I think you could add some logic in your while loop to check the overall length of the vector. When it gets to a certain point, write it to a file, reset vector, and start over again. With the speed of modern computers, this could be a very large set of data to analyze at the end...but I imagine you'll figure that out. – Chase Jun 22 '12 at 14:51
  • 1
    The point I'm trying to make is that even your summary data may reach `Inf` in a short amount of time, depending on the speed of your computer. See my modified example, that reaches `Inf` in about 16 steps. – Andrie Jun 22 '12 at 14:51
  • @Andrie, but if I write each coin flip trial to a text file, that'll be okay? At least the data recording side of it. – Frank Zafka Jun 22 '12 at 14:53
  • Perhaps. But you're more than likely to run into this problem: http://stackoverflow.com/q/11139315/602276 – Andrie Jun 22 '12 at 15:31
  • Well I'm already running into problems, but isn't that half the fun? – Frank Zafka Jun 22 '12 at 16:05
  • @RSoul That's more than half the fun. Good luck! – Andrie Jun 22 '12 at 16:27
  • @RSoul You might want to look at the package `gmp` to avoid the limits of `integer` arithmetic. – James Jul 09 '12 at 11:31
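
The suggestions in the comments above (fill a vector, write it out, reset, repeat) might be combined as in this sketch. The function name `run_flips`, the chunk size and the output file are hypothetical choices, not something from the thread, and the duration is kept short so the example finishes quickly.

```r
# Hypothetical helper: simulate tosses until `duration` seconds have passed,
# appending each chunk to `path` so memory use stays bounded.
run_flips <- function(path, duration = 1, chunk_size = 10000) {
  con <- file(path, open = "a")   # append mode; one connection for the whole run
  on.exit(close(con))             # flush and close when the function exits
  deadline <- Sys.time() + duration
  total <- 0                      # a double, so it does not overflow like an integer
  while (Sys.time() < deadline) {
    chunk <- rbinom(chunk_size, 1, 1/2)    # one chunk of tosses (1 = heads)
    writeLines(as.character(chunk), con)   # one toss per line
    total <- total + chunk_size
  }
  total
}

out_file  <- tempfile(fileext = ".txt")
n_written <- run_flips(out_file, duration = 1, chunk_size = 1000)
cat("tosses written:", n_written, "\n")
```

For a week-long run, `duration = 7 * 24 * 60 * 60` would be the obvious setting. The resulting file could become very large, so switching to a new file every so often may be worth considering.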
3

While the current time is earlier than a timestamp one week from now, append rbinom(1, 1, 1/2) to x:

week_later <- strptime("2012-06-22 16:45:00", "%Y-%m-%d %H:%M:%S")
x <- rbinom(1, 1, 1/2)  # initialise x with the first toss
# Keep appending tosses until the deadline has passed
while(as.numeric(Sys.time()) < as.numeric(week_later)){
  x <- append(x, rbinom(1, 1, 1/2))
}
jonbros
  • 1
    Yes, but this will surely exceed your computer's memory before a week is up. The OP really does need to provide the specific details of what they're simulating to get a complete answer. – joran Jun 22 '12 at 14:27
  • That seems to me to be 2 questions though. How to run for a week? How to deal with memory issues, running this calculation for a week? Damned if you combine the two questions, and damned if you don't. – Frank Zafka Jun 22 '12 at 14:30
  • 4
    Just append each value to a file. Since the only requirement is to keep the machine running for a week ... – Roland Jun 22 '12 at 14:43
  • Yeah, that was a thought I had. Will go and research it. – Frank Zafka Jun 22 '12 at 14:47
0

You may be interested in the fairly new package harvestr by Andrew Redd. It splits a task into pieces (the idea being that the pieces could be run in parallel). The part of the package that applies to your question is that it caches the results of pieces that have already been processed, so if the task is interrupted and restarted, the pieces that have finished will not be rerun, and it will pick up those that did not complete (pieces that were interrupted part way through will start again from the beginning of that piece).

This may let you start and stop the simulation as you request.
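
Independently of harvestr, whose API is not shown here, the same start-and-stop idea can be sketched in base R by saving progress after each piece. The function `run_pieces`, the piece layout and the checkpoint file are hypothetical illustrations, and this is not a description of how harvestr is implemented.

```r
# Hypothetical sketch: run `n_pieces` pieces of `piece_size` tosses each,
# saving a checkpoint after every piece so an interrupted run can resume.
run_pieces <- function(checkpoint, n_pieces = 5, piece_size = 1000) {
  # Resume from the checkpoint if one exists, otherwise start fresh.
  state <- if (file.exists(checkpoint)) {
    readRDS(checkpoint)
  } else {
    list(done = 0, heads = 0, tails = 0)
  }
  while (state$done < n_pieces) {
    tosses <- rbinom(piece_size, 1, 1/2)
    state$heads <- state$heads + sum(tosses)
    state$tails <- state$tails + sum(tosses == 0)
    state$done  <- state$done + 1
    saveRDS(state, checkpoint)   # a piece only counts once it has been saved
  }
  state
}

ckpt    <- tempfile(fileext = ".rds")
first   <- run_pieces(ckpt, n_pieces = 2)   # "stop" after two pieces
resumed <- run_pieces(ckpt, n_pieces = 5)   # restart: only pieces 3 to 5 are run
cat("pieces done:", resumed$done,
    " total tosses:", resumed$heads + resumed$tails, "\n")
```

A piece that is interrupted before `saveRDS` runs is simply redone on restart, which mirrors the behaviour described above. Note that this sketch does not save the random number generator state, so a resumed run is not a bit-for-bit continuation of the original one.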

Greg Snow