4

I was experimenting with the $RANDOM variable in a Unix shell and noticed something peculiar. I ran the following command, which reads $RANDOM in a loop 100k times and then pipes the output to "uniq" to find the duplicates.

$ for i in {1..100000}; do echo $RANDOM; done | uniq -d

I ran the command above 7 times, and the same two numbers (4455 and 4117) were repeated all 7 times. The screenshot below shows the command line output.

kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117

See: https://i.stack.imgur.com/5bpEe.png

I also opened another terminal window and repeated the process. In the second terminal, the numbers were different, but they repeated in a similar fashion. This makes me wonder about the entropy of the $RANDOM variable, and how it is seeded.

My guess is that it is re-seeded whenever bash is invoked, but I was wondering if anyone has any info about why the same values are repeated when I repeat the command in a single terminal window.

mattst
jmg
  • I tried it on my Ubuntu linux subsystem for Windows and I don't get that result. Do you have the same result in another terminal, when you reboot your machine, ...? – Dominique Aug 23 '20 at 08:33
  • 4
    What platform are you using? `The screenshot` Please do not post text in images. Please post the text as text in your question instead. Do the numbers differ in a new shell? Anyway, I can't reproduce this on bash5.0.17 on archlinux5.8.1. – KamilCuk Aug 23 '20 at 08:38
  • Have a look at https://tldp.org/LDP/abs/html/randomvar.html – kvantour Aug 23 '20 at 08:39
  • 1
    @jmg You tagged both zsh and bash. So are you testing the code in bash __or__ zsh? Or can you reproduce the issue on both bash and zsh? – KamilCuk Aug 23 '20 at 08:57
  • 1
    I also suggest you consider that [Kali is so different from other Linux distributions](https://stackoverflow.com/questions/tagged/kali-linux). Being a special-purpose Linux distribution for penetration testing and security auditing, Kali Linux may have special random number generators for this exact reason. – Léa Gris Aug 23 '20 at 09:28
  • From `man bash`: `RANDOM: Each time this parameter is referenced, a random integer between 0 and 32767 is generated. The sequence of random numbers may be initialized by assigning a value to RANDOM.` – Cyrus Aug 23 '20 at 11:28
  • @Dominique - you may need to increase iterations to 100k or 1million. – jmg Aug 24 '20 at 03:36
  • @KamilCuk - Yes, the repeated numbers change in a new shell. I actually mentioned that in the original post. It seems that the $RANDOM variable is re-seeded when you spawn a new shell instance. You may also need to increase the number of iterations to 100k or 1million. – jmg Aug 24 '20 at 03:37
  • 1
    @KamilCuk - I actually performed the test in zsh only. Interestingly, it doesn't happen for me in bash. That has me a little curious about what's different between bash and zsh. – jmg Aug 24 '20 at 03:40
  • @LéaGris - Yes, Kali is purpose-built, but my understanding is that it is a standard Debian kernel with a collection of packages for pen-testing. So, it should function like any regular Debian-based distro. I'm running it in a VM, but I don't really think that's relevant here. – jmg Aug 24 '20 at 03:59
  • Another piece of anecdotal information: When I run the code in a script (instead of directly in the prompt), the numbers aren't repeated. – jmg Aug 24 '20 at 04:22
  • @jmg : Whenever an interactive zsh is spawned, I let it execute (in my startup files) a `RANDOM=$(($(od -vAn -N2 -tu2 < /dev/urandom)))` to re-seed it from _urandom_. Try whether this helps. Don't know about bash. – user1934428 Aug 24 '20 at 08:20
  • @jmg Two things to mention. Firstly, in your posted code there are 100k iterations, not 10k. Secondly, `uniq -d` does not find all 'duplicates'; it only finds duplicated lines that are ADJACENT to each other. To find all duplicated lines (ignoring their position), the input to `uniq` must first be sorted, e.g. `... | sort | uniq -d`. – mattst Aug 24 '20 at 13:30
  • @mattst - yeah, I noticed that I missed a zero in the post, but I can't edit for some reason. Thanks for the correction about uniq - you're correct about that. Still very strange behavior though. It appears that it is a bug (a couple of answers down) – jmg Oct 05 '20 at 01:09
  • @jmg I've edited your post so that it now says `...which reads $RANDOM in a loop 100k times...` in the 1st paragraph, i.e. 10k changed to 100k. You need 2,000 reputation for *full* editing privileges, i.e. anybody's question or answer, but I don't know why you can't edit your own post. – mattst Oct 05 '20 at 16:51
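
The point made in the comments about `uniq -d` can be checked on a small input. This is a minimal sketch; `sample` is a hypothetical helper that only exists to produce test data.

```shell
# Sample input: 2 appears twice in a row; 1 appears twice but never on adjacent lines.
sample() { printf '%s\n' 1 2 2 3 1; }

sample | uniq -d          # reports only values repeated on adjacent lines
sample | sort | uniq -d   # reports every value that occurs more than once
```

So the pipeline in the question only reports values that $RANDOM produced twice in a row, not every value that occurred more than once.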

2 Answers

4

Pseudorandom number generators are not perfect. The Lehmer random number generator is used in bash sources with the "standard" constants:

x(n+1) = 16807 * x(n) mod (2**31 - 1)

Moreover, bash limits the output to 15 bits:

#  define BASH_RAND_MAX 32767
...
return ((unsigned int)(rseed & BASH_RAND_MAX));
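
The recurrence and the 15-bit mask quoted above can be sketched in plain shell arithmetic. This is an illustration only: it assumes the simple form of the recurrence as written here and a shell with 64-bit arithmetic. `lehmer_next` and `low15` are hypothetical helper names, not functions from the bash sources, and the seed is arbitrary.

```shell
# Next generator state: x(n+1) = 16807 * x(n) mod (2**31 - 1).
lehmer_next() {
    echo $(( (16807 * $1) % 2147483647 ))
}

# Keep only the low 15 bits, mirroring `rseed & BASH_RAND_MAX`.
low15() {
    echo $(( $1 & 32767 ))
}

x=1    # arbitrary example seed (an assumption, not bash's real seed)
n=0
while [ "$n" -lt 5 ]; do
    x=$(lehmer_next "$x")
    echo "state=$x output=$(low15 "$x")"
    n=$((n + 1))
done
```

Because only 15 of the 31 state bits are shown, two different consecutive states can produce the same visible output.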

With the seed your shell currently has, it just so happens that 4455 appears twice in a row, and so does 4117, somewhere in the 100000 consecutive outputs. Nothing surprising there really. You could work out a generator state that produces such a repeat, knowing that:

# We know that lower 15 bits of previous number are equal to 4455
x(n) mod 32768 = 4455
# We know that lower 15 bits of next number are equal to 4455
x(n+1) mod 32768 = 4455
# We know the relation between next and previous number
x(n+1) = 16807 * x(n) mod (2**31 - 1)
# You could find x(n)
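
One way to carry out that search is brute force over every state whose low 15 bits match. This is a rough sketch that assumes the plain recurrence and mask quoted in this answer; `find_repeat_states` is a hypothetical name. It may print several states or none, and since the asker later reported using zsh, it illustrates the reasoning rather than the numbers actually observed.

```shell
M=2147483647    # 2**31 - 1

# Print every state x whose low 15 bits equal $1 and whose
# successor 16807 * x mod M also has low 15 bits equal to $1.
find_repeat_states() {
    target=$1
    x=$target
    while [ "$x" -lt "$M" ]; do
        next=$(( (16807 * x) % M ))
        if [ $(( next & 32767 )) -eq "$target" ]; then
            echo "$x"
        fi
        x=$(( x + 32768 ))    # next candidate with the same low 15 bits
    done
}

find_repeat_states 4455
```

The loop visits 65536 candidates, so it finishes quickly even in a shell.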

Why are the same $RANDOM numbers repeated?

Because the pseudorandom generator used in the bash sources, with the seed currently in your shell, happens to produce the same number twice in a row.

KamilCuk
  • Yeah, that's in line with what I expected. Thanks for explaining :) – jmg Aug 24 '20 at 03:31
  • I performed this test in zsh, but it doesn't happen on the same machine in bash. Any idea what might be different between zsh and bash? – jmg Aug 24 '20 at 03:41
  • kudos for figuring out the seed :) – jmg Aug 24 '20 at 03:51
  • `Any idea what might be different between zsh and bash?` Everything, these are different programs.. Most probably the generator in zsh is different - I guessed you are using bash, because you tagged it. Inspect zsh sources and find out what generator is used there. – KamilCuk Aug 24 '20 at 06:45
3

This is due to a zsh bug / "behaviour" for RANDOM in subshells. This bug doesn't appear in bash.

echo $RANDOM # changes at every run
echo `echo $RANDOM` # always returns the same value until the first line is run again

This happens because each RANDOM value is derived from the generator's previous state, and a value drawn in a subshell does not update that state in the main shell.

In man zshparam:

RANDOM <S>
A  pseudo-random  integer  from 0 to 32767, newly generated each
time this parameter is referenced.  The random number  generator
can be seeded by assigning a numeric value to RANDOM.

The   values   of   RANDOM   form   an  intentionally-repeatable
pseudo-random sequence; subshells  that  reference  RANDOM  will
result  in  identical  pseudo-random  values unless the value of
RANDOM is referenced or seeded in the parent  shell  in  between
subshell invocations.

It gets even stranger, because piping into uniq makes the loop run in a subshell:

for i in {1..10}; do echo $RANDOM; done # changes at every run 
for i in {1..10}; do echo $RANDOM; done | uniq # always the same 10 numbers
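
Following the man page text quoted above, referencing RANDOM in the parent shell before each pipeline should make the batches differ. This is a sketch under that assumption; `random_batch` is a hypothetical helper name, and the code is harmless in bash as well.

```shell
# Print $1 values of RANDOM through a pipeline.
random_batch() {
    : "$RANDOM"    # reference RANDOM in the parent shell so its state advances
    i=0
    while [ "$i" -lt "$1" ]; do
        echo "$RANDOM"
        i=$((i + 1))
    done | uniq
}

random_batch 10   # first batch
random_batch 10   # per the man page, this batch should now differ from the first
```

Call the function directly rather than inside `$(...)`, since a command substitution is itself a subshell and the reference to RANDOM would then not reach the parent.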

Source: Debian bug report 828180

lolesque
  • The $RANDOM variable isn't really intended to be used for anything security-related anyway, so I guess it's not a huge deal. – jmg Oct 05 '20 at 01:18