How to do object detection using CNN features in TensorFlow?

Question

I am trying to build an end-to-end unified model that detects (localizes) an object in an image. The object itself can be of many types, like "text in the wild", but the surrounding features of the object should determine where the region of interest is.

For example, detecting a human face without considering the features of the face itself, i.e. inferring that it lies some distance above the neck.

I'm expecting the output to be the coordinates of the object, or bounding boxes in the ImageNet format: [xmin, ymin, xmax, ymax]. I have a dataset of 500 images. Are there any examples of object detection in TensorFlow based on surrounding features, i.e. the feature maps from conv1 or conv2?

score 4 · Answer 1 · answered Jun 22 '16 at 22:35

There is a TensorFlow-based framework for object detection/localization that you can check out: https://github.com/Russell91/TensorBox

However, I am not sure that 500 images would be enough to successfully retrain the provided model(s).

Alex


score 2 · Answer 2 · answered Mar 24 '18 at 15:44

Object detection using deep learning is broadly divided into one-stage detectors (YOLO, SSD) and two-stage detectors such as Faster R-CNN. Google's repo [1] contains pre-trained models for various detection architectures.

You could pick a pre-trained model and then fine-tune it on your dataset. The two-stage model is modular, and you can choose among different feature extractors depending on whether speed or accuracy matters more to you.
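Before fine-tuning on your own data, the [xmin, ymin, xmax, ymax] pixel annotations from the question usually need converting to whatever box format the chosen model expects, and you will want a way to score predictions against ground truth. The sketch below is only illustrative: the helper names `normalize_box` and `iou` are hypothetical, and the normalized [ymin, xmin, ymax, xmax] ordering is an assumption that should be checked against the model you pick.

```python
from typing import Tuple

# A box in pixel coordinates, ordered as in the question: (xmin, ymin, xmax, ymax).
Box = Tuple[float, float, float, float]


def normalize_box(box: Box, width: int, height: int) -> Box:
    """Convert pixel (xmin, ymin, xmax, ymax) to normalized (ymin, xmin, ymax, xmax).

    The output ordering is an assumption; verify it against your model's docs.
    """
    xmin, ymin, xmax, ymax = box
    return (ymin / height, xmin / width, ymax / height, xmax / width)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    ground_truth = (50, 20, 150, 120)   # hypothetical annotation
    prediction = (60, 30, 160, 130)     # hypothetical model output
    print(normalize_box(ground_truth, width=200, height=200))
    print(iou(ground_truth, prediction))
```

With only 500 images, tracking IoU on a held-out subset is a simple way to see whether the fine-tuned model is actually localizing objects or just overfitting.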

[1] Google's object detection repository
