To learn more about deep learning and computer vision, I'm working on a project that performs lane detection on roads. I'm using TFLearn as a wrapper around TensorFlow.
Background
The training inputs are images of roads (each image represented as a 50x50 pixel 2D array, where each element is a luminance value from 0.0 to 1.0).
The training outputs have the same shape (a 50x50 array) but represent the marked lane area: road pixels are 1 and non-road pixels are 0.
This is not a fixed-size image classification problem, but instead a problem of detecting road vs. non-road pixels from a picture.
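For concreteness, a single training pair can be sketched like this (synthetic NumPy arrays purely to show the shapes; my real pairs come from labeled road images):

```python
import numpy as np

# One synthetic input: a 50x50 grayscale image with luminance in [0.0, 1.0].
X_sample = np.random.rand(50, 50)

# The matching output: a 50x50 lane mask (0 = non-road, 1 = road).
Y_sample = np.zeros((50, 50))
Y_sample[20:30, :] = 1.0  # e.g. a horizontal band marked as road
```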
Problem
I haven't been able to shape my inputs/outputs in a way that TFLearn/TensorFlow accepts, and I'm not sure why. Here is my sample code:
# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).
# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.
network = input_data(shape=[None, 50, 50, 1])
network = conv_2d(network, 50, 50, activation='relu')
# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')
network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)
model = tflearn.DNN(network, tensorboard_verbose=1)
model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)
The error I receive is on the model.fit call:
ValueError: Cannot feed value of shape (1, 50, 50) for Tensor u'InputData/X:0', which has shape '(?, 50, 50, 1)'
I've tried reducing the sample input/output arrays to a 1D vector (with length 2500), but that leads to other errors.
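As far as I can tell, the mismatch is between my (N, 50, 50) arrays and the network's (?, 50, 50, 1) input placeholder. One thing I suspect might help (just a guess on my part) is adding an explicit trailing channel axis with NumPy:

```python
import numpy as np

# A stand-in for one batch of my training inputs: no channel axis.
X = np.random.rand(1, 50, 50)
print(X.shape)  # (1, 50, 50) -- the shape the error message reports

# Add a trailing channel axis to match the (?, 50, 50, 1) placeholder.
X_reshaped = X[..., np.newaxis]  # equivalently: np.expand_dims(X, axis=-1)
print(X_reshaped.shape)  # (1, 50, 50, 1)
```

Is this the right direction, or should the network definition change instead?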
I'm a bit lost on how to shape all this; any help would be greatly appreciated!