
I want to get the names of the slots from the Momentum optimizer in TensorFlow using get_slot_names, as explained on the TensorFlow documentation page. I am using the following line in my code to get them:

slots=tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9,).get_slot_names()

I ran my graph, but when I print slots it returns only an empty list. Any help would be appreciated.

By the way, my network works fine at minimizing the loss and everything else. I also tried it with other optimizers, but they have the same problem.

I am using TF 1.3 on Ubuntu 14. Thanks,

2 Answers


Slot names are optimizer-specific. If you want to get the slot for each trainable variable, you can use the get_slot method with the correct slot_name. The slot name created (by default) for the momentum optimizer is momentum. Below is a simple example to illustrate the point.

import numpy as np
import tensorflow as tf

# toy data for a linear model y = 4x + 6
x_input = np.linspace(0, 30, 200)
y_input = 4 * x_input + 6

W = tf.Variable(0.0, name="weight")
b = tf.Variable(0.0, name="bias")
X = tf.placeholder(tf.float32, name='InputX')
Y = tf.placeholder(tf.float32, name='InputY')
Y_pred = X * W + b

loss = tf.reduce_mean(tf.square(Y_pred - Y))

# define the optimizer
optimizer = tf.train.MomentumOptimizer(learning_rate=0.001, momentum=.9)

# training op; this is what actually creates the slot variables
train_op = optimizer.minimize(loss)

# print the optimizer slot names
print(optimizer.get_slot_names())
# Result: ['momentum']

# print the slot created for each trainable variable, using the slot name from the result above
for v in tf.trainable_variables():
    print(optimizer.get_slot(v, 'momentum'))

# Results: <tf.Variable 'weight/Momentum:0' shape=() dtype=float32_ref>
#          <tf.Variable 'bias/Momentum:0' shape=() dtype=float32_ref>
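If you also want the accumulated values themselves (not just the slot variable objects), you can fetch the slot variables in a session after running a training step. A minimal sketch continuing the snippet above (feeding the full x_input/y_input arrays here is just for illustration):

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # one training step so the momentum accumulators receive a gradient update
    sess.run(train_op, feed_dict={X: x_input, Y: y_input})
    for v in tf.trainable_variables():
        slot_var = optimizer.get_slot(v, 'momentum')
        print(v.op.name, sess.run(slot_var))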
Ishant Mrinal
  • Thanks for the response. So, do you mean it can only return the momentum? Since it is mentioned in [https://www.tensorflow.org/versions/r1.3/api_docs/python/tf/train/MomentumOptimizer], I thought it could return the 'accumulation' variable in the momentum formulation too. – AliAsghar Mortazi Aug 28 '17 at 17:09
  • This is just a case of inconsistent naming in TF. The "momentum" slot variables are actually the "accumulation" from the formulation, while the "momentum" from the formulation is just a float hyperparameter, stored in opt._momentum. – lejlot Aug 28 '17 at 17:43
  • @AliAsgharMortazi As I have mentioned in this answer, you can use `get_slot` to get the accumulated variables and values. See my example. – Ishant Mrinal Aug 28 '17 at 20:00

The only problem in your code is that the slot variables are created during the minimize call (actually inside apply_gradients). And since you call get_slot_names() before that, the list is empty.

So instead of

slots=tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9,).get_slot_names() 

you should do

opt = tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9,)

train_op = opt.minimize(your_loss) # or anything else

slots = opt.get_slot_names() # this will work

The reason is really simple: many slots are variable-specific. For example, optimizers like Adam create one slot per optimized variable, and before .minimize is called the optimizer does not know which variables it will be optimizing.

In particular, MomentumOptimizer keeps an accumulated gradient per variable. Consequently, these cannot be created prior to calling minimize. The accumulated gradients are stored in the "momentum" slot (quite a bad choice of name, but that is where they live in TF).
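For comparison, here is a small self-contained sketch (assuming TF 1.x, as in the question; your_loss stands in for any scalar loss built from trainable variables) illustrating that the slot list is empty before minimize and populated afterwards:

import tensorflow as tf

# any scalar loss over at least one trainable variable
w = tf.Variable(1.0, name="w")
your_loss = tf.square(w - 2.0)

opt_momentum = tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9)
opt_adam = tf.train.AdamOptimizer(learning_rate=1e-3)

print(opt_momentum.get_slot_names())  # [] - no slots created yet
print(opt_adam.get_slot_names())      # []

# minimize (via apply_gradients) is what creates the per-variable slots
_ = opt_momentum.minimize(your_loss)
_ = opt_adam.minimize(your_loss)

print(opt_momentum.get_slot_names())  # ['momentum']
print(opt_adam.get_slot_names())      # ['m', 'v']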

lejlot
  • It worked, though it seems weird that such a thing is needed! Anyway, thanks, and I don't know why you got a downvote. Also, do you know if (or how) I can change the "accumulation" value that is passed to the minimizer? Assume I want to add a number (alpha) to the accumulation and then update the parameters. – AliAsghar Mortazi Aug 28 '17 at 20:13
  • Well, this is one of many hard design decisions the TF people had to make. In terms of modifying the optimizer - there is no way of achieving that without making your own optimizer. On the plus side, this should not be that hard: just copy MomentumOptimizer and adjust its apply_gradients routine to use your additional signal. This again comes from the TF assumption that the graph is append-only - you cannot modify a constructed one, and since apply_gradients already creates a graph of the whole update, you have to modify its logic to change it in any way. – lejlot Aug 28 '17 at 20:52
  • I hoped there was a simpler solution! Anyway, thanks. – AliAsghar Mortazi Aug 28 '17 at 21:10