What is the best method for object detection in low-resolution moving video?

Question

I'm looking for the fastest and most efficient method of detecting an object in a moving video. Things to note about this video: it is very grainy and low resolution, and both the background and the foreground are moving simultaneously.

Note: I'm trying to detect a moving truck on a road in a moving video.

Methods I've tried:

Training a Haar Cascade - I've attempted to train the classifier to identify the object by cropping multiple images of the desired object. This produced either many false detections or no detections at all (the desired object was never detected). I used about 100 positive images and 4000 negatives.

SIFT and SURF Keypoints - When attempting to use either of these feature-based methods, I discovered that the object I wanted to detect was too low in resolution, so there were not enough features to match for an accurate detection. (The desired object was never detected.)

Template Matching - This is probably the best method I've tried. It's the most accurate, although also the hackiest. I can detect the object in one specific video using a template cropped from that video. However, accuracy is not guaranteed, because all that is known is the best match in each frame; nothing checks how well the template actually matches the frame at that location. Basically, it only works if the object is always in the video; otherwise it produces a false detection.

So those are the three main methods I've tried, and all have failed. What would work best is something like template matching but with scale and rotation invariance (which led me to try SIFT/SURF), but I have no idea how to modify the template matching function.

Does anyone have any suggestions how to best accomplish this task?

How is the truck oriented? Does its shape/orientation change? Does the camera change position? Is this a one-off video, or a system that needs to work in many different conditions? — endolith, Dec 01 '09 at 17:11
I agree with endolith, it is crucial you define the problem with more details. The choice of method will affect the robustness. — Ivan, Dec 02 '09 at 14:32
The truck is viewed from the side and moves horizontally. The shape of the vehicle does not change much, which is why the template matching works, but I still want my method to be robust. Basically the camera pans left and right, following a few different vehicles, with some other vehicles driving past in the background. Essentially, I want this to work in more than one situation (but mainly with video of similar quality). The least I want to accomplish is a detector of moving objects inside a moving video. — monky822, Dec 10 '09 at 00:58

score 5 · Answer 1 · answered Oct 19 '11 at 10:55

Apply optical flow to the image and then segment it based on the flow field. Background flow is very different from "object" flow (which mainly diverges or converges depending on whether the object is moving towards or away from you, with some lateral component as well).

Here's an oldish project which worked this way:

http://users.fmrib.ox.ac.uk/~steve/asset/index.html
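
As a rough illustration of the segmentation step only (not the method used by the project linked above), here is a minimal NumPy sketch. It assumes a dense flow field is already available as an (H, W, 2) array, for example from a routine such as OpenCV's `calcOpticalFlowFarneback`, and it assumes the background covers most of the frame so that the median flow approximates the camera motion. The function name `segment_by_flow` and the threshold value are made up for this example.

```python
import numpy as np

def segment_by_flow(flow, thresh=1.5):
    """Return a boolean mask of pixels whose motion differs from the background.

    flow: float array of shape (H, W, 2) with per-pixel (dx, dy) displacement.
    thresh: residual magnitude in pixels above which a pixel counts as "object"
            (an arbitrary value for this sketch).
    """
    # Assumption: most pixels belong to the background, so the per-component
    # median is a crude estimate of the camera-induced flow.
    background = np.median(flow.reshape(-1, 2), axis=0)
    residual = flow - background
    magnitude = np.linalg.norm(residual, axis=2)
    return magnitude > thresh

# Synthetic example: a pan gives the whole frame a flow of (-3, 0),
# while a 10x20 "truck" region moves with (+2, 0).
flow = np.zeros((48, 72, 2), dtype=np.float32)
flow[..., 0] = -3.0
flow[20:30, 30:50, 0] = 2.0

mask = segment_by_flow(flow)
ys, xs = np.nonzero(mask)
bbox = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
```

A constant background flow is a simplification that roughly fits a panning camera; with zoom or forward motion the background flow varies across the frame and a richer motion model would be needed.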

score 2 · Answer 2 · answered Dec 02 '09 at 14:58

This vehicle detection paper uses a Gabor filter bank for low-level detection and then uses the responses to build the feature space in which it trains an SVM classifier.

The technique seems to work well and is at least scale invariant. I am not sure about rotation though.
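
To make the idea concrete, here is a small NumPy sketch of extracting Gabor filter bank features from an image patch. It is not the paper's pipeline: the kernel size, the choice of sigma as half the wavelength, and the use of the mean and standard deviation of the response magnitude as features are all assumptions for illustration. The resulting vectors would then be fed to a classifier such as an SVM, which is not shown.

```python
import numpy as np

def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5):
    """Real (cosine-phase) Gabor kernel of shape (size, size); size should be odd."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    # Rotate coordinates so the carrier wave oscillates along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * xr / wavelength)
    return kernel - kernel.mean()  # zero mean, so flat brightness gives no response

def gabor_features(patch, thetas, wavelengths, size=15):
    """Mean and std of the response magnitude for every filter in the bank."""
    shape = (patch.shape[0] + size - 1, patch.shape[1] + size - 1)
    patch_f = np.fft.rfft2(patch, shape)  # zero-padded FFT for full convolution
    feats = []
    for wl in wavelengths:
        for th in thetas:
            k = gabor_kernel(size, sigma=0.5 * wl, theta=th, wavelength=wl)
            resp = np.abs(np.fft.irfft2(patch_f * np.fft.rfft2(k, shape), shape))
            feats += [resp.mean(), resp.std()]
    return np.array(feats)

# Example: a grating that varies along x responds far more strongly to the
# filter oriented at theta = 0 than to the one at theta = pi / 2.
x = np.arange(32)
patch = np.tile(np.cos(2 * np.pi * x / 8.0), (32, 1))
feats = gabor_features(patch, thetas=[0.0, np.pi / 2], wavelengths=[8.0])
```

Using several wavelengths in the bank is what gives this kind of descriptor some tolerance to scale changes; rotation tolerance depends on how the orientation channels are combined.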

score 1 · Answer 3 · answered Dec 03 '09 at 00:51

Not knowing your application, my initial impression is normalized cross-correlation, especially since I remember seeing a purely optical cross-correlator that had vehicle-tracking as the example application. (Tracking a vehicle as it passes using only optical components and an image of the side of the vehicle - I wish I could find the link.) This is similar (if not identical) to "template matching", which you say kind of works, but this won't work if the images are rotated, as you know.

However, there's a related method based on log-polar coordinates that will work regardless of rotation, scale, shear, and translation.

I imagine this would also make it possible to detect when the object has left the scene, since the maximum correlation will decrease.
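
A minimal sketch of that last point, using plain NumPy and a brute-force zero-mean normalized cross-correlation (no rotation or scale handling, and no log-polar transform). The function names and the 0.7 threshold are assumptions for illustration; the threshold would need tuning on real footage. OpenCV's `matchTemplate` with `TM_CCOEFF_NORMED` computes a comparable normalized score much faster.

```python
import numpy as np

def ncc_match(frame, template):
    """Brute-force zero-mean normalized cross-correlation.

    Returns (best_score, (x, y)), where the score lies in [-1, 1] and
    (x, y) is the top-left corner of the best-matching window.
    """
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t * t).sum())
    best_score, best_xy = -1.0, (0, 0)
    for y in range(frame.shape[0] - th + 1):
        for x in range(frame.shape[1] - tw + 1):
            w = frame[y:y + th, x:x + tw]
            w = w - w.mean()
            denom = t_norm * np.sqrt((w * w).sum())
            if denom == 0:
                continue  # flat window or template: correlation undefined
            score = float((w * t).sum() / denom)
            if score > best_score:
                best_score, best_xy = score, (x, y)
    return best_score, best_xy

def detect(frame, template, min_score=0.7):
    """Return the match position, or None if the best score is too low."""
    score, xy = ncc_match(frame, template)
    return xy if score >= min_score else None

# Example with synthetic grayscale data.
rng = np.random.default_rng(0)
template = rng.random((8, 12))
frame_without = rng.random((40, 60))
frame_with = frame_without.copy()
frame_with[10:18, 25:37] = template  # paste the object at x=25, y=10

found = detect(frame_with, template)
missing = detect(frame_without, template)
```

Because the score is normalized, a fixed threshold is meaningful across frames with different brightness and contrast, which is what lets a low maximum be read as "object not present" instead of always accepting the best match.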

score 0 · Answer 4 · answered Nov 27 '09 at 09:40

How low resolution are we talking? Could you also elaborate on the object? Is it a specific color? Does it have a pattern? The answers affect what you should be using.

Also, I might be reading your template matching statement wrong, but it sounds like you are overtraining it (by testing on the same video you extracted the object from??).

answered by UsAaR33

The resolution is 720x480, but the quality of the video is very poor. The video is very pixelated at this resolution. Regarding the template matching, I'm not training anything. I am just using a cropped object from the video and searching for it in each frame. – monky822 Dec 08 '09 at 17:06
Well you are training it, just on one set of data. The template will match darn well if the lighting and orientation of the object hardly changes. The moment that does, accuracy will really drop off. Again though, use all the cues you can - esp. color if it is there. – UsAaR33 Dec 10 '09 at 12:34

score 0 · Answer 5 · edited Oct 18 '11 at 22:18

A Haar Cascade is going to require significant training data on your part, and will be poor for any adjustments in orientation.

Your best bet might be to combine template matching with an algorithm similar to CamShift in OpenCV, along with a probabilistic model (you'll have to figure this one out) of whether the truck is still in the image.
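
One possible shape for such a probabilistic model is a two-state Bayes filter (present / absent) updated once per frame from the best match score. This is only a sketch of the presence model, not of the tracker itself, and every probability and the score threshold below are made-up values that would have to be estimated from real data.

```python
def update_presence(prior, score, p_stay=0.95, p_enter=0.05, score_hit=0.7):
    """One step of a two-state Bayes filter for "is the object in the frame?".

    prior: P(present) after the previous frame.
    score: best template-match score for the current frame.
    All default parameters are illustrative assumptions.
    """
    # Predict: between frames the object may leave or enter the scene.
    predicted = prior * p_stay + (1.0 - prior) * p_enter
    # Assumed observation model: a high score is likely when the object
    # is present and unlikely when it is absent.
    high = score >= score_hit
    like_present = 0.9 if high else 0.1
    like_absent = 0.2 if high else 0.8
    # Bayes update.
    num = like_present * predicted
    return num / (num + like_absent * (1.0 - predicted))

# Example: three good matches followed by three poor ones.
p = 0.5
history = []
for s in [0.9, 0.85, 0.8, 0.3, 0.2, 0.25]:
    p = update_presence(p, s)
    history.append(p)
```

With these numbers the belief rises above 0.9 after the three good matches and falls below 0.1 after the three poor ones, so a single bad frame does not immediately drop the track, while a run of bad frames does.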

answered Dec 02 '09 at 04:11 by BrainCore · edited Oct 18 '11 at 22:18 by NGLN
