
I want to detect image contents: what I need to do is find whether an image is of a shirt or pants.

Img 1 (image)

Img 2 (image)

If I provide an image of a t-shirt, then by comparing the shape I need the result that the given image is of a t-shirt.

What I tried is Haar cascades, but it was not giving correct output, and the sample size required for this is too large.

usernan

4 Answers


The thing you are looking for is this; it will help you solve the problem:

https://github.com/bikz05/bag-of-words

user3250373

Assuming you only want to separate images which contain only the object of interest, you can use BOW (bag of visual words), where an image is represented as a set of features which are then classified with an SVM or any other classifier.
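To make the BOW step concrete, here is a minimal sketch of the encoding stage, assuming local descriptors (e.g. from SURF or ORB) have already been extracted and a codebook has already been learned with k-means; the arrays below are toy data, not real descriptors:

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Encode a set of local descriptors as a normalized
    bag-of-visual-words histogram over the codebook."""
    # Distance from every descriptor to every visual word
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)  # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()      # L1-normalize so image size doesn't matter

# Toy example: 4 descriptors, codebook of 3 visual words
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
desc = np.array([[0.1, 0.1], [0.9, 1.1], [1.0, 0.9], [5.2, 4.8]])
print(bow_histogram(desc, codebook))
```

The resulting fixed-length histograms are what you would feed to the SVM, one per training image.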

You can also use feature detector + descriptor + classifier (e.g. SURF + SVM). Although there are more robust and faster feature detectors nowadays...

To avoid a training process, you can even try template matching (per outline). One such algorithm is provided at: Fast template matching - CodeProject
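As a rough illustration of the idea (the CodeProject algorithm is far more sophisticated), template matching boils down to sliding the template over the image and scoring agreement at each position; the arrays here are toy examples:

```python
import numpy as np

def match_template(image, template):
    """Slide a binary template over a binary image and return the score
    map (fraction of matching pixels) plus the best-scoring position."""
    ih, iw = image.shape
    th, tw = template.shape
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            window = image[y:y+th, x:x+tw]
            scores[y, x] = (window == template).mean()  # pixel agreement
    best = np.unravel_index(scores.argmax(), scores.shape)
    return scores, best

# Toy example: find a 2x2 blob in a 4x4 image
img = np.zeros((4, 4), dtype=int)
img[1:3, 2:4] = 1
tmpl = np.ones((2, 2), dtype=int)
scores, best = match_template(img, tmpl)
print(best)  # (1, 2)
```

Outline-based variants score only the boundary pixels instead of the whole window, which makes them tolerant to texture and color differences inside the object.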

Haar cascades are used for object detection in images which contain other content as well, as they employ sliding-window detection and contain stages that trade off performance against robustness: they are fast, but some objects may be missed, because each stage is prone to misclassification error, which accumulates.
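For intuition, the sliding-window-plus-stages idea can be sketched as follows (the stage functions here are hypothetical placeholders; a real Haar cascade uses boosted Haar-feature stages, not simple mean tests):

```python
import numpy as np

def cascade_detect(image, window, stages, step=4):
    """Slide a window over the image; each stage may reject the window
    early, which is what makes cascades fast but prone to misses."""
    wh, ww = window
    hits = []
    for y in range(0, image.shape[0] - wh + 1, step):
        for x in range(0, image.shape[1] - ww + 1, step):
            patch = image[y:y+wh, x:x+ww]
            # A window survives only if *every* stage accepts it, so each
            # stage's miss rate compounds across the cascade.
            if all(stage(patch) for stage in stages):
                hits.append((y, x))
    return hits

# Hypothetical stages: a cheap mean test first, a stricter test second
stages = [lambda p: p.mean() > 0.5,
          lambda p: p.min() > 0]
img = np.zeros((16, 16))
img[4:12, 4:12] = 1.0
print(cascade_detect(img, (8, 8), stages))  # [(4, 4)]
```

This also shows why segmented, single-object images don't need a cascade at all: there is only one "window" worth classifying.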

dajuric
  • Hi dajuric: will template matching do here? Because it works only for the same image, no? See http://docs.opencv.org/2.4/doc/tutorials/imgproc/histograms/template_matching/template_matching.html – usernan Apr 29 '16 at 11:32
  • Can you provide me any examples for SURF + SVM? – usernan Apr 29 '16 at 11:33
  • It is a different type of template matching where you can make multiple instance templates. Please see the article (the link). – dajuric Apr 30 '16 at 01:19

If your images contain the object already segmented, as your examples show, you can create a binary image where you indicate object vs background pixels.

After that, assuming the objects are not generally rotated or twisted, you can use simple features to make the classification. For example, for the case above, just count the percentage of scan lines where there are 2 runs of foreground pixels. For a shirt this should be a low value, and for pants it should be high.

Obviously, if the given example images are not representative of the problem you're actually trying to solve, this wouldn't work.

EDIT: Some example MATLAB code:

function ratio=TwoRunFeature(I)
    % Threshold: object pixels are darker than the white background,
    % then dilate to close small gaps in the silhouette
    g=rgb2gray(I);
    b=imdilate(g<255,ones(5));
    % Count foreground runs per scan line via 0->1 transitions
    % (more robust than filtering the logical image, where negative
    % filter responses would be clipped)
    d=diff([false(size(b,1),1) b],1,2);
    runs=sum(d==1,2);
    % Ratio of 2-run lines (two legs) to 1-run lines (solid silhouette)
    ratio=sum(runs==2) / sum(runs==1);
end

function TestImage(name)
    I=imread(name);
    fprintf('%s: %f\n',name,TwoRunFeature(I));
end

TestImage('pants.jpg');
TestImage('shirt.jpg');

Prints:

pants.jpg: 1.947977
shirt.jpg: 0.068627

Pants will give high numbers and shirts low. Just threshold anywhere you want and you're done.

Photon

I will assume that these two images are from the database you have. From my experience, applying features (local descriptors) to such images will create a kind of artificial feature because of the segmentation or the uniform background color. The second important point in your case is that these images may have different colors or textures, and most of the detected features will come from regions within the object. These regions are not important and have nothing to do with the classification. But having the segmented image should make the problem much easier. The best and easiest solution for your case:

  1. Convert the image to grayscale, then to binary by thresholding.
  2. Invert the image, so the background is black and the object is white.
  3. Fill the holes: any background-colored region inside the object would otherwise remain as a hole.
  4. Now detect only the boundary, by I = dilated(IBinary) - IBinary.
  5. Sample the boundary: select only one non-zero pixel within each window of size 4x4.
  6. Use the shape context descriptor to describe your image.
  7. Using bag of visual words or sparse coding, re-represent the image.
  8. Max pooling to get a rich representation.
  9. Classify with an SVM.
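A sketch of steps 4 and 5 above (boundary extraction and sampling) in plain NumPy, assuming the binary mask is already inverted and hole-filled; the object and the 4x4 window size are illustrative:

```python
import numpy as np

def dilate3(b):
    """3x3 binary dilation via shifted copies (no toolbox needed)."""
    p = np.pad(b, 1)
    out = np.zeros_like(b)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= p[1+dy:1+dy+b.shape[0], 1+dx:1+dx+b.shape[1]]
    return out

def sampled_boundary(binary, win=4):
    """Boundary = dilated mask minus mask, then keep at most one
    boundary pixel per win x win window."""
    boundary = dilate3(binary) & ~binary
    pts = []
    for y in range(0, binary.shape[0], win):
        for x in range(0, binary.shape[1], win):
            ys, xs = np.nonzero(boundary[y:y+win, x:x+win])
            if len(ys):
                pts.append((y + ys[0], x + xs[0]))
    return pts

# Toy object: an 8x8 square in a 16x16 mask
b = np.zeros((16, 16), dtype=bool)
b[4:12, 4:12] = True
print(sampled_boundary(b))
```

The sampled points are what you would then describe with shape context before the BOW/sparse-coding step.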

Bashar Haddad