
I want to create random pairs of numbers within 2 ranges.

So for example, if I want 3 random pairs of integers where 10 < n1 < 20 and 30 < n2 < 50, then an acceptable output would be [[11,35],[15,31],[15,42]] but not [[11,35],[11,35],[12,39]], because the same pair appears twice.

I would like an efficient (both computationally and memory-wise) algorithm to do this. The language doesn't really matter because I can adapt it later (although Python would be preferred).

So far the best idea I have had is to create a dictionary keyed by every possible n1 value, where each value is the list of n2 values that have already been used with that n1. Then I can pick a random n1 and look for an n2 that isn't yet in that n1's list.

This isn't very space-efficient though, and I'm hoping for something better. It also seems computationally inefficient to repeatedly search for an n2 that hasn't been used yet for a given n1.

I could also do the opposite and have the dictionary populated with all the numbers not used and just pop a random number off the list. But this would use much more space.
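For reference, the first idea above can be sketched roughly as follows. This is only an illustration: the function name `sample_pairs_with_dict` is made up, the dictionary is filled lazily rather than pre-populated, and a whole pair is redrawn on a collision (instead of searching for a free n2), which keeps the result uniform but gets slow once most pairs are taken.

```python
import random


def sample_pairs_with_dict(n1_range, n2_range, k):
    """Pick k distinct (n1, n2) pairs, tracking used n2 values per n1.

    Hypothetical helper for illustration; n1_range and n2_range are
    assumed to be range objects (or other sequences) of allowed values.
    """
    total = len(n1_range) * len(n2_range)
    if k > total:
        raise ValueError("k exceeds the number of distinct pairs")

    used = {}   # n1 -> set of n2 values already paired with that n1
    pairs = []
    while len(pairs) < k:
        n1 = random.choice(n1_range)
        n2 = random.choice(n2_range)
        seen = used.setdefault(n1, set())
        if n2 in seen:
            continue  # pair already produced, draw again
        seen.add(n2)
        pairs.append((n1, n2))
    return pairs


# 10 < n1 < 20 and 30 < n2 < 50, so the inclusive integer bounds are 11..19 and 31..49
pairs = sample_pairs_with_dict(range(11, 20), range(31, 50), k=3)
```

Memory grows with the number of pairs drawn so far, and the retry loop is the cost I'm worried about when k approaches the total number of pairs.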

Is there any efficient way to do this? Is this a common problem?

Edit: It would be good if this could easily be expanded to more dimensions (so sets of N numbers). But this isn't really needed yet.

KNejad
  • Do the generated pairs conform to a uniform distribution? – GZ0 Sep 22 '19 at 14:12
  • "Should they" or "Do they" in the one I already created? Well they should but I haven't really checked that in the one I created yet since it's not working very well due to the complexity issues mentioned – KNejad Sep 22 '19 at 14:14
  • Yeah, I mean the "Should they", i.e. the expected results. – GZ0 Sep 22 '19 at 14:16
  • The infamous question of the non-repeating random numbers. I always wonder if there is a real reason for not wanting repetition or if it is just the false belief that repeating outcomes is not random. – conditionalMethod Sep 22 '19 at 16:28
  • In this case I'm starting to think it's okay-ish to have repeating values. The idea was that I want to train a neural network and I was generating some input data for it. I didn't want repetition so that when I run the validation data, the network wouldn't be fed values it has already seen – KNejad Sep 22 '19 at 18:25

1 Answer


An integer pair (x, y) in [min_x, min_x + s) × [min_y, min_y + t) can be mapped to a unique integer m in the 1D range [min_x * t, (min_x + s) * t) by calculating m = x * t + y - min_y. Here s and t are the number of possible x and y values, respectively. The inverse mapping from m back to (x, y) is (m // t, min_y + m % t) in Python.
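A small sketch of the mapping and its inverse may make this concrete. The helper names `encode` and `decode` are made up for illustration, and the bounds are taken from the question's example (10 < n1 < 20, 30 < n2 < 50).

```python
def encode(x, y, min_y, t):
    """Map the pair (x, y) to a single integer m = x * t + (y - min_y)."""
    return x * t + y - min_y


def decode(m, min_y, t):
    """Recover (x, y) from m; the inverse of encode."""
    return m // t, min_y + m % t


# Inclusive lower bounds and exclusive upper bounds for the question's example
min_x, max_x = 11, 20
min_y, max_y = 31, 50
t = max_y - min_y  # 19 possible y values

first = encode(11, 31, min_y, t)   # 11 * 19 + 0 = 209 = min_x * t
last = encode(19, 49, min_y, t)    # 19 * 19 + 18 = 379 = max_x * t - 1

# Every pair round-trips, and the codes cover the 1D range exactly once
codes = [encode(x, y, min_y, t)
         for x in range(min_x, max_x)
         for y in range(min_y, max_y)]
round_trip_ok = all(decode(encode(x, y, min_y, t), min_y, t) == (x, y)
                    for x in range(min_x, max_x)
                    for y in range(min_y, max_y))
```

Because each pair corresponds to exactly one m, sampling distinct values of m is the same as sampling distinct pairs.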

The problem therefore reduces to choosing multiple values from [min_x * t, (min_x + s) * t) without replacement (i.e. no duplicates in the returned sequence). This can be done by calling the random.sample function in Python. According to the documentation, passing a range object to it is fast and space efficient when sampling from a large population. So the entire problem can be solved in Python as follows:

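from random import sample

# Example bounds for 10 < n1 < 20 and 30 < n2 < 50 (integers).
# max_x and max_y are exclusive while min_x and min_y are inclusive
min_x, max_x = 11, 20
min_y, max_y = 31, 50
t = max_y - min_y  # number of possible y values
# Sample distinct codes m, then decode each one back into an (x, y) pair
sampled_pairs = [(m // t, min_y + m % t) for m in sample(range(min_x * t, max_x * t), k=3)]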
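The question also asks about extending this to sets of N numbers. One possible generalization, sketched here under the assumption that each dimension is given as a Python range, is to treat the sampled integer as a mixed-radix number and peel off one index per dimension. The function name `sample_tuples` is hypothetical, and `math.prod` requires Python 3.8 or later.

```python
from math import prod
from random import sample


def sample_tuples(ranges, k):
    """Return k distinct tuples, one value drawn from each range in `ranges`.

    Hypothetical helper: each sampled integer m in [0, total) is decoded
    as a mixed-radix number whose digits index into the given ranges.
    """
    sizes = [len(r) for r in ranges]
    total = prod(sizes)  # number of distinct tuples available
    result = []
    for m in sample(range(total), k):
        point = []
        # Peel off indices starting from the last dimension
        for r, size in zip(reversed(ranges), reversed(sizes)):
            m, idx = divmod(m, size)
            point.append(r[idx])
        result.append(tuple(reversed(point)))
    return result


# Three dimensions: 10 < n1 < 20, 30 < n2 < 50, and an extra 0 <= n3 < 4
triples = sample_tuples([range(11, 20), range(31, 50), range(0, 4)], k=10)
```

Note that in CPython `len()` fails with an OverflowError for a range longer than `sys.maxsize`, so this sketch assumes the total number of tuples stays below that limit.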
GZ0
  • I was thinking of this (it builds on my solution which I deleted). But I was worried that `range(min_x * t, max_x *t)` will use a lot of space. I guess I was wrong. I'll read that function at some point and see if I can figure out how it works. – KNejad Sep 22 '19 at 18:38
  • Just to point out though: After reading the source code for `sample` it seems that if the sample size is large enough it will create a list of the input range. In that scenario it would be just as efficient as the methods I've proposed. In the event that the sample size I want is low then it will be more efficient though – KNejad Sep 22 '19 at 18:57
  • @KNejad In Python 3 `range(...)` returns a range object, which is basically a special kind of object that only stores `start`, `stop`, and `step` fields, so it is memory efficient (see [this post](https://stackoverflow.com/questions/30081275/why-is-1000000000000000-in-range1000000000000001-so-fast-in-python-3)). As for the `sample` function, the implementation copies the input into a list only when the set it would otherwise need for tracking the selected samples is estimated to be at least as large as the input itself. – GZ0 Sep 22 '19 at 20:38
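The point about `range` objects can be checked with a quick experiment. This is a sketch that assumes CPython on a 64-bit build; exact byte counts vary by version, so none are shown.

```python
import sys
from random import sample

big = range(10**12)        # a trillion values, never materialized
small = range(10)

size_big = sys.getsizeof(big)      # size of the range object itself
size_small = sys.getsizeof(small)  # expected to be similarly tiny

# Indexing and membership tests are computed arithmetically, not by iterating
value = big[123_456_789]
contains_last = (10**12 - 1) in big

# Sampling a few values from the huge range does not build a trillion-element list
picks = sample(big, k=5)
```

With a small k relative to the population, `sample` tracks the chosen indices in a set, so memory use scales with k rather than with the size of the range.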