I am trying to detect people in a camera feed using cv2.HOGDescriptor() with its default people classifier.
The detector more or less works, but I am having trouble understanding what values to assign to winStride, padding, scale and groupThreshold.
Currently, the camera feed's frame size is 1280 x 720. I resize each frame to 400 x 400 and then call detectMultiScale with the following parameters:
hogParams = {'winStride': (8, 8), 'padding': (32, 32), 'scale': 1.05, 'finalThreshold': 2}
Based on this answer, I understand what these parameters do and represent.
My question is: is there a way of mapping image size to these values? A mathematical equation? An estimation method? I am not necessarily asking for a concrete formula, or even a method that gives all of the values, just something better than trial and error or magic numbers.
Most of the references and tutorials I have found simply use magic numbers without explaining how they arrived at them.
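To make the question concrete, here is a rough sketch of the kind of estimate I have in mind. It only counts how many detection windows get evaluated for a given image size, winStride, padding and scale, so it says something about cost, not about accuracy. It assumes the default people detector's 64 x 128 window, that padding is added on both sides of the image, and that the image is shrunk by scale at each pyramid level until the window no longer fits. The function names are hypothetical helpers of my own, not part of OpenCV, and the real implementation may differ in its details.

```python
# Assumption: the default people detector uses a 64 x 128 (width x height) window.
WIN_W, WIN_H = 64, 128


def pyramid_sizes(img_w, img_h, scale, max_levels=64):
    """Image sizes at each pyramid level, while one window still fits."""
    sizes = []
    factor = 1.0
    while len(sizes) < max_levels:
        w, h = round(img_w / factor), round(img_h / factor)
        if w < WIN_W or h < WIN_H:
            break
        sizes.append((w, h))
        if scale <= 1.0:
            # A scale of 1 or less never shrinks the image: single level only.
            break
        factor *= scale
    return sizes


def windows_per_level(img_w, img_h, win_stride, padding):
    """Window positions on one level, with padding added on both sides."""
    padded_w = img_w + 2 * padding[0]
    padded_h = img_h + 2 * padding[1]
    nx = max((padded_w - WIN_W) // win_stride[0] + 1, 0)
    ny = max((padded_h - WIN_H) // win_stride[1] + 1, 0)
    return nx * ny


def total_windows(img_w, img_h, win_stride, padding, scale):
    """Estimated number of windows evaluated over the whole pyramid."""
    return sum(
        windows_per_level(w, h, win_stride, padding)
        for w, h in pyramid_sizes(img_w, img_h, scale)
    )


if __name__ == "__main__":
    # The settings from this question: 400 x 400 frame, stride 8, padding 32.
    for s in (1.05, 1.1, 1.2):
        levels = len(pyramid_sizes(400, 400, s))
        count = total_windows(400, 400, (8, 8), (32, 32), s)
        print(f"scale={s}: {levels} levels, about {count} windows")
```

Something like this at least shows how frame size, winStride and scale trade off against the amount of work done, but it does not tell me which values actually give good detections, which is what I am really after.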
PS: Here's a visual aid in case you're still not sure of my question