I was trying to use the TensorFlow Object Detection API to train a model.
I was using the sample config for Faster R-CNN ResNet-101 (https://github.com/tensorflow/models/blob/master/object_detection/samples/configs/faster_rcnn_resnet101_voc07.config).
The following part of the config file was something I didn't quite understand:
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}
My questions were:
- What was the exact meaning of min_dimension and max_dimension? Did it mean the input image would be resized to 600x1024 or 1024x600?
- If I had images of different sizes, some of them considerably larger than 600x1024 (or 1024x600), could/should I increase the values of min_dimension and max_dimension?
The reason I asked was this post: TensorFlow Object Detection API Weird Behaviour
In that post, the author answered their own question:
Then I decided to crop the input image and provide that as an input. Just to see if the results improve and it did!
It turns out that the dimensions of the input image were much larger than the 600 x 1024 that is accepted by the model. So, it was scaling down these images to 600 x 1024 which meant that the cigarette boxes were losing their details :)
That post used the same config as I did, and I was not sure whether I could change these parameters if they were the default or recommended settings for this particular model, faster_rcnn_resnet101.
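To make the question concrete, here is a minimal sketch of how I understood keep_aspect_ratio_resizer to pick the output size. It rests on an assumption rather than on the library's actual code: the shorter side is scaled to min_dimension unless that would push the longer side past max_dimension, in which case the longer side is scaled to max_dimension instead. The function name `keep_aspect_ratio_size` is hypothetical and is not part of the Object Detection API.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Hypothetical helper: estimate the resized (height, width).

    Assumption: the aspect ratio is preserved, so the result is generally
    NOT exactly 600x1024; only one of the two limits is hit.
    """
    # First try to bring the shorter side to min_dimension.
    scale = min_dimension / min(height, width)
    # If the longer side would then exceed max_dimension, cap it instead.
    if max(height, width) * scale > max_dimension:
        scale = max_dimension / max(height, width)
    return round(height * scale), round(width * scale)


if __name__ == "__main__":
    print(keep_aspect_ratio_size(1200, 1600))  # (600, 800)
    print(keep_aspect_ratio_size(1000, 4000))  # (256, 1024)
```

If this reading is right, a large image is always shrunk until it fits within these limits, which would explain the loss of detail described in the quoted post.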