This seems like a trivial question, but I've been unable to find the answer.
I have batched sequences of images of shape:
[batch_size, number_of_frames, frame_height, frame_width, number_of_channels]
and I would like to pass each frame through a few convolutional and pooling layers. However, TensorFlow's conv2d layer accepts 4D inputs of shape:
[batch_size, frame_height, frame_width, number_of_channels]
My first attempt was to use tf.map_fn over axis=1, but I discovered that this function does not propagate gradients.
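For reference, a minimal reconstruction of that first attempt (kernel and layer choices are illustrative, not my real model; shown in TF 2.x eager mode for brevity, though my code runs in graph mode). Since tf.map_fn maps over axis 0, each call to the mapped function receives one sequence of frames, which conv2d can then treat as a batch:

```python
import tensorflow as tf

# Hypothetical fixed kernel: [kernel_h, kernel_w, in_channels, out_channels]
kernel = tf.random.normal([3, 3, 3, 8])

def per_sequence(frames):
    # frames: [number_of_frames, frame_height, frame_width, number_of_channels]
    # conv2d treats the frames axis as its batch axis.
    conv = tf.nn.conv2d(frames, kernel, strides=1, padding="SAME")
    return tf.nn.max_pool2d(conv, ksize=2, strides=2, padding="SAME")

# [batch_size, number_of_frames, frame_height, frame_width, number_of_channels]
x = tf.random.normal([4, 5, 30, 30, 3])
y = tf.map_fn(per_sequence, x)  # maps over axis 0 (the batch axis)
print(y.shape)  # (4, 5, 15, 15, 8)
```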
My second attempt was to use tf.unstack over the frames dimension and then use tf.while_loop. However, my batch_size and number_of_frames are determined dynamically (i.e., both are None), and tf.unstack raises {ValueError} Cannot infer num from shape (?, ?, 30, 30, 3) if num is unspecified. I tried specifying num=tf.shape(self.observations)[1], but this raises {TypeError} Expected int for argument 'num' not <tf.Tensor 'A2C/infer/strided_slice:0' shape=() dtype=int32>.
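A minimal reproduction of the tf.unstack failure (the placeholder name here is illustrative; my actual tensor is self.observations):

```python
import tensorflow as tf

# TF 1.x-style graph mode with a fully dynamic batch and frame count.
tf.compat.v1.disable_eager_execution()
observations = tf.compat.v1.placeholder(tf.float32, [None, None, 30, 30, 3])

raised = False
try:
    # num cannot be inferred because the frames dimension is None
    tf.unstack(observations, axis=1)
except ValueError as err:
    raised = True
    print(err)  # Cannot infer num from shape
```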