To learn more about deep learning and computer vision, I'm working on a project that performs lane detection on roads. I'm using TFLearn as a wrapper around TensorFlow.
Background
The training inputs are images of roads (each image represented as a 50x50 pixel 2D array, where each element is a luminance value from 0.0 to 1.0).
The training outputs have the same shape (a 50x50 array) but represent the marked lane area: road pixels are 1 and non-road pixels are 0.
This is not a fixed-size image classification problem, but instead a problem of detecting road vs. non-road pixels from a picture.
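For concreteness, a single training pair can be sketched like this (synthetic NumPy arrays purely to show the shapes; my real pairs come from labeled road images):

```python
import numpy as np

# One synthetic input: a 50x50 grayscale image with luminance in [0.0, 1.0].
X_sample = np.random.rand(50, 50)

# The matching output: a 50x50 lane mask (0 = non-road, 1 = road).
Y_sample = np.zeros((50, 50))
Y_sample[20:30, :] = 1.0  # e.g. a horizontal band marked as road
```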
Problem
I haven't been able to shape my inputs/outputs in a way that TFLearn/TensorFlow accepts, and I'm not sure why. Here is my sample code:
# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).
# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.
network = input_data(shape=[None, 50, 50, 1])
network = conv_2d(network, 50, 50, activation='relu')
# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')
network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)
model = tflearn.DNN(network, tensorboard_verbose=1)
model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)
The error I receive is on the model.fit call:
ValueError: Cannot feed value of shape (1, 50, 50) for Tensor u'InputData/X:0', which has shape '(?, 50, 50, 1)'
I've tried reducing the sample input/output arrays to a 1D vector (with length 2500), but that leads to other errors.
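As far as I can tell, the mismatch is between my (N, 50, 50) arrays and the network's (?, 50, 50, 1) input placeholder. One thing I suspect might help (just a guess on my part) is adding an explicit trailing channel axis with NumPy:

```python
import numpy as np

# A stand-in for one batch of my training inputs: no channel axis.
X = np.random.rand(1, 50, 50)
print(X.shape)  # (1, 50, 50) -- the shape the error message reports

# Add a trailing channel axis to match the (?, 50, 50, 1) placeholder.
X_reshaped = X[..., np.newaxis]  # equivalently: np.expand_dims(X, axis=-1)
print(X_reshaped.shape)  # (1, 50, 50, 1)
```

Is this the right direction, or should the network definition change instead?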
I'm a bit lost on how to shape all this; any help would be greatly appreciated!