
I'm working on a project, part of which consists of integrating OpenCV's HOG people detector with a camera stream.

Currently the camera and the basic HOG detector are working (C++ detectMultiScale -> http://docs.opencv.org/modules/gpu/doc/object_detection.html), but not very well: the detections are very noisy and the algorithm isn't very accurate.

Why?

My camera image is 640 x 480 pixels.

The code snippet I'm using is:

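std::vector<cv::Rect> found, found_filtered;
cv::HOGDescriptor hog;  // default-constructed: 64x128 detection window
hog.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
// arguments: hit threshold 0, win_stride 8x8, padding 32x32, scale 1.05, group_threshold 2
hog.detectMultiScale(image, found, 0, cv::Size(8,8), cv::Size(32,32), 1.05, 2);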

Why doesn't it work properly? What do I need to do to improve the accuracy? Is a particular image size necessary?

PS: Do you know of a more precise and faster people detection algorithm implemented in C++?

Andrey Rubshtein
Ricardo
  • People in the image must be at least the size of the HOG descriptor (a little less, though), and as far as I know detectMultiScale additionally finds only bigger persons. You can't expect a near-100% rate for HOG detection of such general "objects", but in a stream you should detect each real person at least once. You can try to track those and verify/deny your previously detected persons. – Micka Oct 28 '14 at 11:49
  • Thank you for your answer, Micka! You say "at least the size of the HOG descriptor", but **what is this size?** Do you know of other possibilities, or does only "getDefaultPeopleDetector" exist? Thank you very much! – Ricardo Oct 28 '14 at 13:08
  • This website might help you: http://www.geocities.ws/talh_davidc/ – SomethingSomething Jan 07 '16 at 19:40

1 Answer


The default people detector has a window size of 64x128, which means that the people you want to detect have to be at least 64x128 pixels. For your camera resolution (640 x 480), that means a person has to take up quite some space in the frame before being properly detected.

Depending on your specific situation, you could try your hand at training your own HOG descriptor with a smaller window size. You could take a look at this answer and the referenced library if you want to do that.
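If training is not an option, the comments under this answer suggest enlarging the image before detection instead. As a rough sketch of the arithmetic, the helper below estimates the enlargement factor for a person of a given pixel height. The name upscaleFactorFor is hypothetical (it is not an OpenCV function), and it assumes the person should end up about as tall as the 128-pixel window, which is a simplification.

```cpp
#include <cassert>

// Hypothetical helper, not part of OpenCV.
// Assumption: a person should be roughly as tall as the detection window
// (128 px for the default people detector) to be found.
double upscaleFactorFor(int personHeightPx, int windowHeightPx = 128) {
    assert(personHeightPx > 0 && windowHeightPx > 0);
    if (personHeightPx >= windowHeightPx) {
        return 1.0;  // already tall enough, no enlargement needed
    }
    // Factor by which the image would have to be enlarged.
    return static_cast<double>(windowHeightPx) / personHeightPx;
}
```

For a person 64 pixels tall this gives a factor of 2.0. The factor would then be applied with something like cv::resize on the frame before calling detectMultiScale, at the cost of processing a larger image.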

For the parameters:

win_stride: Your input image is 640 x 480 and the default people detector has a window size of 64x128, so the HOG detection window (the 64x128 window) fits into the input image many times. win_stride tells HOG how far to move the detection window at each step. It works like this: HOG places the detection window at the top left of the input image and then moves it by win_stride each time.

Like this (small win_stride): [image]

or like this (large win_stride): [image]

A smaller win_stride should improve accuracy but decreases performance (speed), and vice versa.
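To get a feeling for that trade-off, here is a small sketch that counts how many window positions a given win_stride produces at a single scale. The function names are made up for illustration, and the count ignores padding and the additional scales that detectMultiScale processes, so treat it as a lower bound on the real work.

```cpp
#include <cassert>

// Number of sliding-window positions along one axis (hypothetical helper).
int positionsAlongAxis(int imageLen, int windowLen, int stride) {
    assert(stride > 0);
    if (imageLen < windowLen) {
        return 0;  // the window does not fit at all
    }
    return (imageLen - windowLen) / stride + 1;
}

// Total window positions at one scale, defaulting to the 64x128 window
// of the default people detector. Padding and other scales are ignored.
int windowPositions(int imgW, int imgH, int strideX, int strideY,
                    int winW = 64, int winH = 128) {
    return positionsAlongAxis(imgW, winW, strideX) *
           positionsAlongAxis(imgH, winH, strideY);
}
```

For the 640 x 480 image from the question, windowPositions(640, 480, 8, 8) gives 3285 windows, while windowPositions(640, 480, 16, 16) gives 851, so doubling the stride cuts the number of evaluated windows to roughly a quarter.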

padding: Padding adds a certain number of extra pixels on each side of the input image, so that the detection window can be placed slightly outside the input image. Thanks to this padding, HOG can detect people who are very close to the edge of the input image.
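The effect of padding can be illustrated along one axis. The sketch below checks whether a window centred on a person still lies within the image once the image is extended by the padding. windowFitsCentered is a hypothetical helper for illustration only; it says nothing about how OpenCV fills the padded border.

```cpp
#include <cassert>

// Hypothetical helper: true if a window of windowLen pixels centred on
// `center` lies within an image of imageLen pixels that has been extended
// by `pad` pixels on both sides (one axis only).
bool windowFitsCentered(int center, int windowLen, int imageLen, int pad) {
    assert(windowLen > 0 && imageLen > 0 && pad >= 0);
    int start = center - windowLen / 2;  // first pixel covered by the window
    int end = start + windowLen;         // one past the last covered pixel
    return start >= -pad && end <= imageLen + pad;
}
```

With a 64-pixel-wide window and a 640-pixel-wide image, a person centred at x = 20 cannot be centred in any window without padding, but can be with a padding of 32 as used in the question.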

group_threshold: group_threshold determines when overlapping detections are merged into a group. A low value (0) performs no result grouping; a higher value groups results only when enough similar detection windows, as set by the threshold, have been found. (In my own experience, I have never needed to change the default value.)

I hope this makes some sense to you. I've been working with HOG for the past few weeks and have read a lot of papers, but I lost some of the references, so I can't link you to the pages where this info comes from. I'm sorry.
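Besides group_threshold, noisy output is often cleaned up after detection. The snippet in the question declares found_filtered but never fills it; one common post-filter, sketched here under my own assumptions, drops every rectangle that lies completely inside another one. Box is a stand-in for cv::Rect so the example compiles without OpenCV, and dropNested is a hypothetical name. This is not what group_threshold does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Box { int x, y, w, h; };  // stand-in for cv::Rect

// True if `inner` lies completely inside `outer` (borders may touch).
bool contains(const Box& outer, const Box& inner) {
    assert(outer.w >= 0 && outer.h >= 0 && inner.w >= 0 && inner.h >= 0);
    return inner.x >= outer.x && inner.y >= outer.y &&
           inner.x + inner.w <= outer.x + outer.w &&
           inner.y + inner.h <= outer.y + outer.h;
}

// Keeps only boxes that are not nested inside another box.
// Of several identical boxes, only the first one is kept.
std::vector<Box> dropNested(const std::vector<Box>& found) {
    std::vector<Box> kept;
    for (std::size_t i = 0; i < found.size(); ++i) {
        bool nested = false;
        for (std::size_t j = 0; j < found.size() && !nested; ++j) {
            if (i == j || !contains(found[j], found[i])) continue;
            bool identical = contains(found[i], found[j]);
            nested = !identical || j < i;
        }
        if (!nested) kept.push_back(found[i]);
    }
    return kept;
}
```

In the question's code, the result of such a filter applied to found would be what ends up in found_filtered.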

hjpotter92
Timmynator0
  • You can resize (increase) your image to find smaller persons, too. – Micka Oct 28 '14 at 21:10
  • Thank you very much for your answer, Timmynator0! How do you know this? Do you know of any documentation about this algorithm and its parameters? For now I'm not interested in training my own HOG descriptor. Thank you very much! :) – Ricardo Oct 29 '14 at 09:35
  • @Ricardo I updated my answer to reflect your new question regarding the parameters. If this was helpful, please mark my answer as the correct answer. – Timmynator0 Oct 29 '14 at 11:47
  • @Micka Wouldn't an increase in image size mean worse performance? What would be a solution to capture smaller persons without losing performance and without training a model? Are there any parameters we can tune, or something similar? – BC1554 Apr 17 '20 at 18:24
  • @BC1554: The pretrained HOG window detects persons with a minimum height of about 100 pixels. This has to do with the trained HOG features and their window size. As far as I know, this can't be tuned by parameters. If you have persons with a height of, let's say, 56 pixels, they are still clearly visible to the human eye but can't be detected because of the trained features. If you increase the size by a factor of 2, the quality is still OK and the HOG window fits well, so that detection works. – Micka Apr 17 '20 at 19:24
  • @Micka Yes, I can increase the size of the picture by a factor of 2 and I would capture a person who was previously 56 pixels tall, but my performance would slow down and I don't want that. I run the HOG algorithm on video and I want to keep the quality. That is the thing! – BC1554 Apr 19 '20 at 05:11
  • The pretrained HOG person SVM classifier has learned what a person centered in a 128x64 window looks like. If you want something else, you need to train something else. Detecting bigger persons is solved in OpenCV by downscaling the image iteratively. Detecting smaller persons can be solved by manually upscaling the image (which is the tiny undocumented trick, visible from understanding the HOG window mechanism). That's the thing. – Micka Apr 19 '20 at 06:12
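To put the performance concern from these comments into numbers, the following sketch counts how many downscaling steps fit before the 64x128 window no longer fits into the shrinking image. pyramidLevels is a hypothetical helper, not an OpenCV function, and it is a simplification: it ignores any rounding or cap on the number of levels that the library may apply.

```cpp
#include <cassert>

// Hypothetical helper: counts the scales at which a winW x winH window
// still fits when the image is shrunk by scaleStep at every step.
// Simplified model, not a copy of OpenCV's internal loop.
int pyramidLevels(int imgW, int imgH, double scaleStep = 1.05,
                  int winW = 64, int winH = 128) {
    assert(scaleStep > 1.0);
    int levels = 0;
    for (double s = 1.0; imgW / s >= winW && imgH / s >= winH; s *= scaleStep) {
        ++levels;  // the window still fits at this scale
    }
    return levels;
}
```

Under these assumptions a 640 x 480 frame with a scale of 1.05 yields 28 levels, and the same frame upscaled by a factor of 2 (1280 x 960) yields 42 levels, on top of having four times as many pixels at the first level. That is the cost of the upscaling trick; a larger scale step or win_stride would be the knobs to trade some accuracy back for speed.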