Image downscaling algorithm

Question

Could you help me find the right algorithm for image resizing? I have an image of a number. The maximum size is 200x200, I need to get an image with size 15x15 or even less. The image is monochrome (black and white) and the result should be the same. That's the info about my task.

I've already tried one algorithm, here it is

// xscale, yscale - decrease/increase rate
for (int f = 0; f<=49; f++)
            {
                    for (int g = 0; g<=49; g++)//49+1 - final size
                    {
                            xpos = (int)f * xscale;
                            ypos = (int)g * yscale;
                            picture3[f][g]=picture4[xpos][ypos];
                    }
            }

But it won't work with the decrease of an image, which is my prior target. Could you help me find an algorithm, which could solve that problem (quality mustn't be perfect, the speed doesn't even matter). Some information about it would be perfect too considering the fact I'm a newbie. Of course, a short piece of c/c++ code (or a library) will be perfect too.

Edit: I've found an algorithm. Will it be suitable for compressing from 200 to 20?

Downscaling to that size and producing a monochrome image will lead to disappointing results no matter how you do it. — Mark Ransom, Mar 05 '12 at 17:15
picture3 is supposed to be the new shrunken image? If so, you're going about the reversal backwards. it should be `picture3[xpos][ypos] = picture4[f][g]`. If picture4 is the reduced one, then your assignment is backwards - picture4 should be on the left. — Marc B, Mar 05 '12 at 17:16
@Mark Ransom, unfortunately my neural network is very slow. Is there really no chance of getting not great, but at least recognizable pictures? — user1131662, Mar 05 '12 at 17:22

Mark Ransom · Answer 1 · 2013-03-26T20:02:56.570

The general approach is to filter the input to generate a smaller size, and threshold to convert to monochrome. The easiest filter to implement is a simple average, and it often produces OK results. The Sinc filter is theoretically the best but it's impractical to implement and has ringing artifacts which are often undesirable. Many other filters are available, such as Lanczos or Tent (which is the generalized form of Bilinear).

Here's a version of an average filter combined with thresholding. Assuming picture4 is the input with pixel values of 0 or 1, and the output is picture3 in the same format. I also assumed that x is the least significant dimension which is opposite to the usual mathematical notation, and opposite to the coordinates in your question.

int thumbwidth = 15;
int thumbheight = 15;
double xscale = (thumbwidth+0.0) / width;
double yscale = (thumbheight+0.0) / height;
double threshold = 0.5 / (xscale * yscale);
double yend = 0.0;
for (int f = 0; f < thumbheight; f++) // y on output
{
    double ystart = yend;
    yend = (f + 1) / yscale;
    if (yend >= height) yend = height - 0.000001;
    double xend = 0.0;
    for (int g = 0; g < thumbwidth; g++) // x on output
    {
        double xstart = xend;
        xend = (g + 1) / xscale;
        if (xend >= width) xend = width - 0.000001;
        double sum = 0.0;
        for (int y = (int)ystart; y <= (int)yend; ++y)
        {
            double yportion = 1.0;
            if (y == (int)ystart) yportion -= ystart - y;
            if (y == (int)yend) yportion -= y+1 - yend;
            for (int x = (int)xstart; x <= (int)xend; ++x)
            {
                double xportion = 1.0;
                if (x == (int)xstart) xportion -= xstart - x;
                if (x == (int)xend) xportion -= x+1 - xend;
                sum += picture4[y][x] * yportion * xportion;
            }
        }
        picture3[f][g] = (sum > threshold) ? 1 : 0;
    }
}

I've now tested this code. Here's the input 200x200 image, followed by a nearest-neighbor reduction to 15x15 (done in Paint Shop Pro), followed by the results of this code. I'll leave you to decide which is more faithful to the original; the difference would be much more obvious if the original had some fine detail.

original nearest neighbor average+threshold

I do the treshold before interpolation. So it's not right? Maybe I should describe the input picture better for you to decide which filter will be better? I'll look up some info on these filters, but hope you'll help with an implementation of an Average filter. — user1131662, Mar 05 '12 at 18:15
@user1131662, your question described the input image as monochrome which led me to believe that thresholding before resizing wasn't possible. — Mark Ransom, Mar 05 '12 at 19:24
Sorry seems I need to edit my question. The whole story is that I grab a picture. Then I treshold it using a medium treshold algorithm. Now I have a two-dimensional array with numbers 0(for white background dots) and 1(for black ones - the number itself). Then I would like to resize the picture and then - to recognize. That's all. — user1131662, Mar 05 '12 at 20:02
Thanks, I'll test this code as soon as possible. On the first glimpse I see some questions which are necessary to understand. Nevertheless thank you for this one very much. — user1131662, Mar 07 '12 at 16:09

strcat · Answer 2 · 2012-03-05T19:00:51.960

Since you're fine with using a library, you could look into the imagemagick C++ bindings.

You could also output the image in a simple format like a pbm, and then call the imagemagick command to resize it:

system("convert input.pbm -resize 10x10 -compress none output.pbm");

Sample output file (note: you don't need to use a new line for each row):

P1
20 20
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The output file:

P1
10 10
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 0 1 1 0 
0 0 0 0 1 0 0 1 1 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 1 1 0 1 1 0 0 0 0 0 0 1 1 1 1 
1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0

Could you give me some example of the command-line utility inside the source code please? — user1131662, Mar 05 '12 at 18:05
Nice, the result you've shown is okay for me. Hope the decrease from 200 won't be a huge difference. So I need to download this library. Then I include it in my project. Then I do the things you've shown, right? And could you tell me more about this strange file format? — user1131662, Mar 05 '12 at 20:11
It's the [netpbm format](https://en.wikipedia.org/wiki/Netpbm_format). It's just very simple and easy to deal with manually. If you use the imagemagick API, then you can just do it all in C++ without an intermediate image format (I haven't used it before though). — strcat, Mar 05 '12 at 20:22

score 3 · Answer 3 · answered Mar 06 '12 at 15:35

I've found an implementation of a bilinear interpolaton. C code.

Assuming that:

a - a primary array (which we need to stretch/compress) pointer.

oldw - primary width

oldh - primary height

b - a secondary array (which we get after compressing/stretching) pointer

neww - secondary width

newh - seconday height

#include <stdio.h>
#include <math.h>
#include <sys/types.h>

void resample(void *a, void *b, int oldw, int oldh, int neww,  int newh)
{
int i;
int j;
int l;
int c;
float t;
float u;
float tmp;
float d1, d2, d3, d4;
u_int p1, p2, p3, p4; /* nearby pixels */
u_char red, green, blue;

for (i = 0; i < newh; i++) {
    for (j = 0; j < neww; j++) {

        tmp = (float) (i) / (float) (newh - 1) * (oldh - 1);
        l = (int) floor(tmp);
        if (l < 0) {
            l = 0;
        } else {
            if (l >= oldh - 1) {
                l = oldh - 2;
            }
        }

        u = tmp - l;
        tmp = (float) (j) / (float) (neww - 1) * (oldw - 1);
        c = (int) floor(tmp);
        if (c < 0) {
            c = 0;
        } else {
            if (c >= oldw - 1) {
                c = oldw - 2;
            }
        }
        t = tmp - c;

        /* coefficients */
        d1 = (1 - t) * (1 - u);
        d2 = t * (1 - u);
        d3 = t * u;
        d4 = (1 - t) * u;

        /* nearby pixels: a[i][j] */
        p1 = *((u_int*)a + (l * oldw) + c);
        p2 = *((u_int*)a + (l * oldw) + c + 1);
        p3 = *((u_int*)a + ((l + 1)* oldw) + c + 1);
        p4 = *((u_int*)a + ((l + 1)* oldw) + c);

        /* color components */
        blue = (u_char)p1 * d1 + (u_char)p2 * d2 + (u_char)p3 * d3 + (u_char)p4 * d4;
        green = (u_char)(p1 >> 8) * d1 + (u_char)(p2 >> 8) * d2 + (u_char)(p3 >> 8) * d3 + (u_char)(p4 >> 8) * d4;
        red = (u_char)(p1 >> 16) * d1 + (u_char)(p2 >> 16) * d2 + (u_char)(p3 >> 16) * d3 + (u_char)(p4 >> 16) * d4;

        /* new pixel R G B  */
        *((u_int*)b + (i * neww) + j) = (red << 16) | (green << 8) | (blue);       
    }
}
}

Hope it will be useful for other users. But nevertheless I still doubth whether it will work in my situation (when not stratching, but compressing an array). Any ideas?

Interpolation (bilinear or any other) is the wrong answer. It might work for minor size adjustments, but at the scale you're talking about 200:15 it won't be any better than just tossing out the extra pixels. — Mark Ransom, Mar 06 '12 at 21:07
I second that. I'm working with images all the time, and I can tell you - there's nothing as bad as linear interpolation on images. Don't. — mousomer, Dec 14 '17 at 10:07

score 2 · Answer 4 · answered Mar 05 '12 at 17:16

2

I think, you need Interpolation. There are a lot of algorithms, for example you can use Bilinear interpolation

answered Mar 05 '12 at 17:16

hired777

465
3
9

10

Interpolation would be used when _increasing_ the size of the image, because you need to create extra pixels between pixels of known colors. To get a smaller image the process is called 'filtering'. One way to do it is compute which pixels in the original image are covered by a pixel in the new image, and set the new pixel's value to be the average of the covered pixels. – bames53 Mar 05 '12 at 17:27

score 2 · Answer 5 · answered Mar 05 '12 at 17:18

2

To properly downscale an image, you should divide your image up into square blocks of pixels and then use something like Bilinear Interpolation in order to find the right color of the pixel that should replace the NxN block of pixels you're doing the interpolation on.

Since I'm not so good at the math involved, I'm not going to try give you an example of how the code would like. Sorry :(

answered Mar 05 '12 at 17:18

Tony The Lion

57,181
57
223
390

Oh, again this Wiki link. Unfortunately my math is not great too. So I'd be very grateful for "non-mathematical" explanation. – user1131662 Mar 05 '12 at 17:33
@user1131662 Ignore the interpolation; just do the first part with the square blocks of pixels. – anatolyg Mar 05 '12 at 17:50
So is it suitable for compressing, not stretching? – user1131662 Mar 05 '12 at 18:56

score 2 · Answer 6 · answered Mar 05 '12 at 18:02

If you use Win32, then StretchBlt function possibly help.

The StretchBlt function copies a bitmap from a source rectangle into a destination rectangle, stretching or compressing the bitmap to fit the dimensions of the destination rectangle, if necessary. The system stretches or compresses the bitmap according to the stretching mode currently set in the destination device context.

score 0 · Answer 7 · answered Mar 05 '12 at 17:16

0

One approach to downsizing a 200x200 image to, say 100x100, would be to take every 2nd pixel along each row and column. I'll leave you to roll your own code for downsizing to a size which is not a divisor of the original size. And I provide no warranty as to the suitability of this approach for your problem.

answered Mar 05 '12 at 17:16

High Performance Mark

74,067
7
97
147

This approach will lead to significant aliasing distortion. It's also the equivalent to the code in the question. – Mark Ransom Mar 05 '12 at 17:26
1

You can delete it if you want to. – Mark Ransom Mar 05 '12 at 18:04
It's actually better that linear interpolation. It's the fastest route and isn't as bad as most people think. Yes, you're better off with fancy algorithms like Lanczos. But simple subsampling is OK. – mousomer Dec 14 '17 at 10:10

Image downscaling algorithm

7 Answers7

Linked