
I am trying to generate N sets of independent random numbers. I have simple code that shows the problem for 3 sets of 10 random numbers. I notice that even though I use tf.set_random_seed to set the seed, the results of the loop iterations do not look alike. Any help or comments are greatly appreciated.

(py3p6) bash-3.2$ cat test.py 
import tensorflow as tf
for i in range(3):
  tf.set_random_seed(1234)
  generate = tf.random_uniform((10,), 0, 10)
  with tf.Session() as sess:
    b = sess.run(generate)
    print(b)

This is the output of the code:

# output :
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]
[8.559105  3.2390785 6.447526  8.316823  1.6297233 1.4103293 2.647568
 2.954973  6.5975866 7.494894 ]
[2.0277488 6.6134906 0.7579422 4.6359386 6.97507   3.3192968 2.866236
 2.2205782 6.7940736 7.2391043]

I want something like

[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]

Update 1: The reason I put the seed initializer within the for loop is that I want to set the seeds differently (think of it as different MCMC runs, for instance). This is my code, which does the job, but I am not sure if it is efficient. Basically, I generate a couple of random seeds between 0 and 2^32-1 and change the seed in each run. Any help or comments to make it more memory/RAM efficient are greatly appreciated.

import numpy as np
import tensorflow as tf
global_seed = 42
N_chains = 5
np.random.seed(global_seed)
seeds = np.random.randint(0, 4294967295, size=N_chains)

for i in range(N_chains):
    tf.set_random_seed(seeds[i])
    .... some stuff ....
    kernel_initializer = tf.random_normal_initializer(seed=seeds[i])
    .... some stuff
    with tf.Session() as sess:
         .... some stuff .....
 .
 .
 .
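
As a possible alternative for deriving the per-chain seeds, here is a sketch assuming NumPy >= 1.17, which introduced `np.random.SeedSequence` (the names `global_seed` and `N_chains` mirror the code above):

```python
import numpy as np

global_seed = 42
N_chains = 5

# SeedSequence deterministically spawns statistically independent
# child seed sequences, avoiding the collisions that sampling with
# randint could in principle produce
children = np.random.SeedSequence(global_seed).spawn(N_chains)
seeds = [int(c.generate_state(1)[0]) for c in children]
```

Re-running with the same `global_seed` reproduces the same list of seeds, so each chain stays reproducible.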
Mehdi Rezaie
  • I could generate the same result by setting the seed in the operation as well: `tf.set_random_seed(1234)` and `generate = tf.random_uniform((10,), 0, 10, seed=1234)`. I wonder if there is any other way, without having to set the seed at the operation level. – Mehdi Rezaie Jul 09 '18 at 16:15

7 Answers


In tensorflow, a random operation relies on two different seeds: a global seed, set by tf.set_random_seed, and an operation seed, provided as an argument to the operation. You will find more details on how they relate in the docs.

You have a different seed for each random op because each random op maintains its own internal state for pseudo-random number generation. The reason each random generator maintains its own state is robustness to change: if they shared the same state, then adding a new random generator somewhere in your graph would change the values produced by all the other generators, defeating the purpose of using a seed.

Now, why do we have this dual system of global and per-op seeds? Strictly speaking, the global seed is not necessary; it is there for convenience. It allows you to set all random op seeds to different, deterministic (if unknown) values at once, without having to go through all of them exhaustively.

Now when a global seed is set but not the op seed, according to the docs,

The system deterministically picks an operation seed in conjunction with the graph-level seed so that it gets a unique random sequence.

To be more precise, the seed that is provided is the id of the last operation created in the current graph. Consequently, globally seeded random operations are extremely sensitive to changes in the graph, in particular to ops created before them.

For example,

import tensorflow as tf
tf.set_random_seed(1234)
generate = tf.random_uniform(())
with tf.Session() as sess:
  print(generate.eval())
  # 0.96046877

Now if we create a node before, the result changes:

import tensorflow as tf
tf.set_random_seed(1234)
tf.zeros(()) # new op added before 
generate = tf.random_uniform(())
with tf.Session() as sess:
  print(generate.eval())
  # 0.29252338

If a node is created after it, however, the op seed is not affected:

import tensorflow as tf
tf.set_random_seed(1234)
generate = tf.random_uniform(())
tf.zeros(()) # new op added after
with tf.Session() as sess:
  print(generate.eval())
  # 0.96046877

Obviously, as in your case, if you generate several operations, they will have different seeds:

import tensorflow as tf
tf.set_random_seed(1234)
gen1 = tf.random_uniform(())
gen2 = tf.random_uniform(())
with tf.Session() as sess:
  print(gen1.eval())
  print(gen2.eval())
  # 0.96046877
  # 0.85591054

As a curiosity, and to validate the fact that the seed is simply the last used id in the graph, you could align the seed of gen2 with that of gen1:

import tensorflow as tf
tf.set_random_seed(1234)
gen1 = tf.random_uniform(())
# 4 operations seems to be created after seed has been picked
seed = tf.get_default_graph()._last_id - 4
gen2 = tf.random_uniform((), seed=seed)
with tf.Session() as sess:
  print(gen1.eval())
  print(gen2.eval())
  # 0.96046877
  # 0.96046877

Obviously though, this should not pass code review.

P-Gn
  • Thanks so much for your thorough answer. I really appreciate it. However, in my work I need to run multiple chains with different random seeds (in the question I used the same seed just to keep things simple). I also want it to be reproducible, which is why I wanted to set the global seed too. Does that make it clear why I am trying to set the seeds within the for loop? – Mehdi Rezaie Jul 10 '18 at 05:37
  • `tf.random.set_seed(seed)` worked perfectly fine for me, didn't need to do anything with the session – citynorman Oct 16 '20 at 15:20

For TensorFlow 2.0, `tf.random.set_random_seed(seed)` changed to `tf.random.set_seed(seed)`.

See the TF docs.
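
For illustration, a minimal sketch (assuming TF >= 2.0 with eager execution) showing that resetting the global seed restarts the deterministic sequence of op seeds:

```python
import tensorflow as tf

tf.random.set_seed(1234)
a = tf.random.uniform((10,), 0, 10)

tf.random.set_seed(1234)  # reset the global seed
b = tf.random.uniform((10,), 0, 10)

# a and b hold identical values, since the op-seed sequence restarted
```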

maxstrobel
  • This worked perfectly fine for me. Also there was no need for `np.random.seed(1)` as suggested in other writeups – citynorman Oct 16 '20 at 15:19

Late to the party, but the random number generator has been overhauled (see https://github.com/tensorflow/community/pull/38 for a summary of the process), and the tf.random.experimental.Generator class now provides the desired functionality.

From TF 1.14 onwards (incl. TF 2.0), you can seed the generator and obtain the exact same random number regardless of session, platform, or even architecture.

import tensorflow as tf

rng = tf.random.experimental.Generator.from_seed(1234)
rng.uniform((), 5, 10, tf.int64)  # draw a random scalar (0-D tensor) between 5 and 10

See the documentation for details.

To address your particular question (I'm using TF 2.0):

for i in range(3):
  b = tf.random.uniform((10,), 0, 10, seed=1234)
  print(b)

gives

tf.Tensor(
[2.7339518  9.339194   5.2865124  8.912003   8.402512   0.53086996
 4.385383   4.8005686  2.2077608  2.1795273 ], shape=(10,), dtype=float32)
tf.Tensor(
[9.668942   3.4503186  7.4577675  2.9200733  1.8064988  6.1576104
 3.9958012  1.889689   3.8289428  0.36031008], shape=(10,), dtype=float32)
tf.Tensor(
[8.019657  4.895439  5.90925   2.418766  4.524292  7.901089  9.702316
 5.1606855 9.744821  2.4418736], shape=(10,), dtype=float32)

while this

for i in range(3):
  rng = tf.random.experimental.Generator.from_seed(1234)
  b = rng.uniform((10,), 0, 10)
  print(b)

gives what you want:

tf.Tensor(
[3.581475  1.132276  5.6670904 6.712369  3.2565057 1.7095459 8.468903
 6.2697005 1.0973608 2.7732193], shape=(10,), dtype=float32)
tf.Tensor(
[3.581475  1.132276  5.6670904 6.712369  3.2565057 1.7095459 8.468903
 6.2697005 1.0973608 2.7732193], shape=(10,), dtype=float32)
tf.Tensor(
[3.581475  1.132276  5.6670904 6.712369  3.2565057 1.7095459 8.468903
 6.2697005 1.0973608 2.7732193], shape=(10,), dtype=float32)
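
Since the question's ultimate goal is N chains that differ from each other but are reproducible across runs, here is a sketch (assuming TF >= 1.14; `base_seed` and `N_chains` are illustrative names) using one `Generator` per chain:

```python
import tensorflow as tf

N_chains = 3
base_seed = 42

# one independently seeded generator per chain: the chains differ from
# each other, but re-running the script recreates the exact same streams
rngs = [tf.random.experimental.Generator.from_seed(base_seed + i)
        for i in range(N_chains)]
draws = [rng.uniform((10,), 0, 10) for rng in rngs]
```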
John Doe

There is a related GitHub issue. But in your case, please refer to the documentation of tf.set_random_seed:

Sets the graph-level random seed.

You probably want to use the same graph and same operation to get the same random numbers in different sessions.

import tensorflow as tf

tf.set_random_seed(1234)
generate = tf.random_uniform((10,), 0, 10)
tf.get_default_graph().finalize() # something everybody tends to forget

for i in range(3):
    with tf.Session() as sess:
        b = sess.run(generate)
        print(b)

gives

[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]

In your case, you created different operations within the same graph.

Patwie
  • Thanks a lot for your answer! Yes, this works perfectly. However, what I want to do in the next step is to set the seeds *differently* within the for loop, in order to run different chains with different seeds. Does that make sense? – Mehdi Rezaie Jul 10 '18 at 05:30

You want several vectors of random numbers that differ from each other but come out the same every time you run the code. There are two types of seeds you can set when defining graph operations: the graph-level seed, set by tf.set_random_seed, and operation-level seeds, passed to the individual ops. To control each vector explicitly, pass a distinct operation-level seed to each op. You can use tf.InteractiveSession():

tf.set_random_seed(1234)

sess = tf.InteractiveSession()
print(sess.run(tf.random_uniform((10,), 0, 10, seed=1)))
print(sess.run(tf.random_uniform((10,), 0, 10, seed=2)))
print(sess.run(tf.random_uniform((10,), 0, 10, seed=3)))
print(sess.run(tf.random_uniform((10,), 0, 10, seed=4)))

You get 4 reproducible random vectors, each containing numbers from 0 to 10, and each different from the others.


You are getting different results within a run because there are three generate operations defined in the graph, not one. This is because you create the generate operation inside the for loop, which leads to three operations (Tensor("random_uniform:0"), Tensor("random_uniform_1:0"), Tensor("random_uniform_2:0")). Just do print(generate) inside the for loop and you will see the three different operations.

tf.set_random_seed sets the seed at the graph level, and TensorFlow then deterministically picks a seed for each operation in the graph. The three generate operations are therefore assigned the same three seeds on every run, which is why each run prints the same three (mutually different) vectors. Please take a look at this for more information on setting random seeds.

So, if you want the same results each time you run a session, you can do this:

tf.set_random_seed(1234)
generate = tf.random_uniform((10,), 0, 10)
for i in range(3):
    with tf.Session() as sess:
        b = sess.run(generate)
        print(b)

But why do you want to create n sessions? You should ideally create one session and then run it n times. Creating a new session for each run is unnecessary, and each new session has to place the variables and operations of the graph on the device (GPU or CPU) again.

Aravindh Kuppusamy
  • Thanks for your answer. I really appreciate it. The reason I am trying to set the random seeds within the for loop is that I want to change the seed for each loop (sorry! My ultimate goal is to do a task N times with the seeds set differently, and I want everything to be reproducible). So far I could make it reproducible by setting the global seed and the operation seed within the loop, but I am not sure if that is the efficient way. – Mehdi Rezaie Jul 10 '18 at 05:41

Adding this answer for reference: the reproducibility problem might not come directly from TensorFlow but from the underlying platform. See this issue on Keras.

In case you are running on an Nvidia GPU, there is a library from Nvidia that helps to get deterministic results: tensorflow-determinism

pip install tensorflow-determinism

and you use it like this:

import tensorflow as tf
import os
os.environ['TF_DETERMINISTIC_OPS'] = '1'

and it is still recommended to set these seeds as well:

import random
import numpy as np

SEED = 123
os.environ['PYTHONHASHSEED'] = str(SEED)
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)

For TensorFlow < 2.1, add the above and also this:

from tfdeterminism import patch
patch()
Begoodpy