How can you find the cuboid with the greatest volume in a heightmap? (with low complexity)

Question

I need to find the cuboid with the greatest volume, contained within a 2D-heightmap. The heightmap is an array of size w*d where w is width, h is height and d is depth. In C, this would look along the lines of:

unsigned heightmap[w][d]; // all values are <= h

I already know that there is a naive algorithm which can solve this with O(w*d*h) complexity. However, I suspect that there is a more optimal method out there. It works as follows, in pythonic pseudocode:

resultRectangle = None
resultHeight = None
resultVolume = -1

# iterate over all heights
for loopHeight in range(0, h):
    # create a 2D bitmap from our heightmap where a 1 represents a height >= loopHeight
    bool bitmap[w][d]
    for x in range(0, w):
        for y in range(0, d):
            bitmap[x][y] = heightmap[x][y] >= loopHeight

    # obtain the greatest-volume cuboid at this particular height
    maxRectangle = maxRectangleInBitmap(bitmap)
    volume = maxRectangle.area() * loopHeight

    # compare it to our current maximum and replace it if we found a greater cuboid
    if volume > resultVolume:
        resultHeight = loopHeight
        resultVolume = volume
        resultRectangle = maxRectangle

resultCuboid = resultRectangle.withHeight(resultHeight)

Finding the greatest area of all 1 in a rectangle is a known problem with O(1) complexity per pixel or O(w*d) in our case. The total complexity of the naive approach is thus O(w*h*d).

So as I already stated, I was wondering if we can beat this complexity. Perhaps we can get it down to O(w*d * log(h)) by searching through heights more intelligently instead of "brute-forcing" all of them.

The answer to this question Find largest cuboid containing only 1's in an NxNxN binary array by Evgeny Kluev seems to take a similar approach, but it falsely(?) assumes that the volumes which we would find at these heights form a unimodal function. If this was the case, we could use Golden Section Search to choose heights more intelligently, but I don't think we can.

If the height is really high, you could iterate on `sqrt(h)`, which would give you (i) a lower bound and upper bound on the solution and (ii) a subset of height sections (each of size `sqrt(h)`) to check 'naively'. In the worst case, the complexity remains the same, but in practice, that would surely run slightly faster. — m.raynal, Feb 03 '20 at 15:45
@m.raynal I'm not entirely sure what you mean. Do you mean omitting the data above `sqrt(h)`? That may not result in the optimal solution. The best solution can be found at any given height, unless it follows a pattern that I am not yet aware of. It's possible that the heightmap is completely flat and has a height of 1, making this plane at `h=1` the optimal solution. But it might also be a 1x1 column, making the maximum volume the maximum height. So if there is a way to omit heights, it can't just be by skipping heights arbitrarily. — Jan Schultke, Feb 03 '20 at 15:52
What I mean is that the biggest cuboid found between 2 heights `h1 < h2` is for sure smaller than `w1*d1*h2` where `w1*d1` represents the biggest area of `1` at the height `h1`, and for sure at least as large as `w1*d1*h1`. That allows you to bound the optimal solution and to discard some potential heights (even if in the worst case you'll end up iterating on all possible heights). What I was meaning is that instead of iterating on all possible values for `h`, you can iterate as follows: `for h=0; h<=H; h += sqrt(H)` — m.raynal, Feb 03 '20 at 16:01
@m.raynal you're right about the upper bound, that would allow you to cancel early when iterating from top to bottom. However, the lower bound is not `w1*d1*h1`, because the bitmap is constructed from all columns which have a height `>= loopHeight`. Fewer columns meet a higher requirement, so it's possible to find a smaller volume when going up in height. Consider a shape with a very wide base and a 1x1 column that extends from it. When increasing the height, none of the base might meet the requirement and the greatest cuboid is just a part of this column, hence the volume is low. — Jan Schultke, Feb 03 '20 at 16:08
I don't follow. Could you please help me understand why not just iterate over the actual heights given in the height map in `O(n^2)` (i.e., iterate over `w*d`) and for each, check if `w*d*map[w][d]` is greater than a global maximum? — גלעד ברקן, Feb 03 '20 at 19:15
@גלעדברקן the cuboid might not actually be filled then. Consider a heightmap with only a single `1*1` column of height `10`, located at `(2, 3)`. By your logic, you just found a cuboid with a volume of `2*3*10`, but this is false, it's just a single column with a volume of `1*1*10`. — Jan Schultke, Feb 03 '20 at 19:19
This is like a 3d version of largest rectangle in histogram, which can be solved in O(n). So I wonder if there could be an O(n^2) solution for this problem. — maraca, Feb 03 '20 at 19:56
@maraca I seriously doubt it. Finding the largest rectangle of `1` in a 2D-bitarray already takes `O(n²)`. I would be surprised if extending this problem into another dimension would come at no cost. The best which I hope for is `O(n² * log n)`. — Jan Schultke, Feb 03 '20 at 20:41
I think you can solve in O(min(w, d) * w * d) which could be much better than O(w * d * h) because h can have w * d different values, so O(w * d * h) would actually be O(w^2 * d^2). — maraca, Feb 05 '20 at 12:05

maniek · Answer 1 · 2020-02-10T16:00:33.913

0

Here is an idea, with a significant assumption. pseudo-code:

P <- points from heightmap sorted by increasing height.
R <- set of rectangles. All maximal empty sub-rectangles for the current height.
R.add(Rectangle(0,0,W,H)
result = last_point_in(P).height()
foreach(p in P):
   RR <- rectangles from R that overlap P (can be found in O(size(RR)), possibly with some logarithmic factors)
   R = R - RR
   foreach(r in RR)
       result = max(result, r.area() * p.height())
       split up r, adding O(1) new rectangles to R.
return result

The assumption, which I have a gut feeling about, but can't prove, is that RR will be O(1) size on average.

Edit: to clarify the "splittting", if we split at point p:

AAAAADFFF
AAAAADFFF
AAAAADFFF
BBBBBpGGG
CCCCCEHHH
CCCCCEHHH

We generate new rectangles consisting of: ABC, CEH, FGH, ADF, and add them to R.

edited Feb 10 '20 at 16:00

answered Feb 10 '20 at 12:17

maniek

6,557
2
18
40

There are some things I don't understand. Is `R` already initialized with the sub-rectangles or does it start empty? Also what exactly is the size of `RR`, how do you find those rectangles? And how is the complexity lower in this case? You need to sort all `w*d` points so that's `O(w*d*log(w*d))`, which is good. What does `R = R - RR` mean? What do you mean by "split up r", along which axis? And the result is supposed to be a set of coordinates, not just the height, so how do you deal with that? – Jan Schultke Feb 10 '20 at 14:34
Starts empty. R = R-RR is the same as R.removeAll(RR). To "split up", imagine you have an empty rectangle, r, and a tall point, p. now, you have at most 4 empty (overlapping) rectangles tha you need to add to R. – maniek Feb 10 '20 at 15:53
Basically you loop over the height like in your code, but you dynamically generate something like your "bitmap", which is contained in R (R would have to be something like a quadtree efficiently find "rectangles from R that overlap P"). – maniek Feb 10 '20 at 15:56
If you're splitting rectangles in a quadtree pattern, how does this account for those which don't align with the tree-shape? For example, a rectangle which contains a little of `A`, `F`, `C` and `H` could be the base of the greatest cuboid. And if you're iterating over all heights, the complexity would still be `O(w*d*h)` which doesn't appear better than the naive algorithm. Or am I seeing this wrong? – Jan Schultke Feb 10 '20 at 16:19

score 0 · Answer 2 · answered Feb 11 '20 at 15:05

OK, another take. Most "meat" is in the go function. It uses the same "splitting" concept as in my other answer, but uses top-down dynamic programming with memoization. rmq2d implements 2D Range Minimum Query. for size 1000x1000 it takes about 30 seconds (while using 3GB of memory).

#include <iostream>
#include <vector>
#include <cassert>
#include <set>
#include <tuple>
#include <memory.h>
#include <limits.h>

using namespace std;

constexpr int ilog2(int x){
    return 31 - __builtin_clz(x);
}

const int MAX_DIM = 100;




template<class T>
struct rmq2d{
    struct point{
        int x,y;

        point():x(0),y(0){}
        point(int x,int y):x(x),y(y){}
    };
    typedef point array_t[MAX_DIM][ilog2(MAX_DIM)+1][MAX_DIM];

    int h, logh;
    int w, logw;
    vector<vector<T>> v;

    array_t *A;

    rmq2d(){A=nullptr;}

    rmq2d &operator=(const rmq2d &other){
        assert(sizeof(point)==8);
        if(this == &other) return *this;
        if(!A){
            A = new array_t[ilog2(MAX_DIM)+1];
        }
        v=other.v;
        h=other.h;
        logh = other.logh;
        w=other.w;
        logw=other.logw;
        memcpy(A, other.A, (ilog2(MAX_DIM)+1)*sizeof(array_t));
        return *this;
    }

    rmq2d(const rmq2d &other){
        A = nullptr;
        *this = other;
    }

    ~rmq2d(){
        delete[] A;
    }


    T query(point pos){
        return v[pos.y][pos.x];
    }

    rmq2d(vector<vector<T>> &v) : v(v){
        A = new array_t[ilog2(MAX_DIM)+1];
        h = (int)v.size();
        logh = ilog2(h) + 1;
        w = (int)v[0].size();
        logw = ilog2(w) + 1;

        for(int y=0; y<h; ++y){
            for(int x=0;x<w;x++) A[0][y][0][x] = {x, y};

            for(int jx=1; jx<logw; jx++){
                int sz = 1<<(jx-1);
                for(int x=0; x+sz < w; x++){
                    point i1 = A[0][y][jx-1][x];
                    point i2 = A[0][y][jx-1][x+sz];
                    if(query(i1) < query(i2)){
                        A[0][y][jx][x] = i1;
                    }else{
                        A[0][y][jx][x] = i2;
                    }
                }
            }
        }
        for(int jy=1; jy<logh; ++jy){
            int sz = 1<<(jy-1);
            for(int y=0; y+sz<h; ++y){
                for(int jx=0; jx<logw; ++jx){
                    for(int x=0; x<w; ++x){
                        point i1 = A[jy-1][y][jx][x];
                        point i2 = A[jy-1][y+sz][jx][x];
                        if(query(i1) < query(i2)){
                            A[jy][y][jx][x] = i1;
                        }else{
                            A[jy][y][jx][x] = i2;
                        }

                    }
                }
            }
        }
    }

    point pos_q(int x1, int x2, int y1, int y2){
        assert(A);
        int lenx = ilog2(x2 - x1);
        int leny = ilog2(y2 - y1);

        point idxs[] = {
                 A[leny][y1][lenx][x1],
                 A[leny][y2-(1<<leny)][lenx][x1],
                 A[leny][y1][lenx][x2-(1<<lenx)],
                 A[leny][y2-(1<<leny)][lenx][x2-(1<<lenx)]
                };
        point ret = idxs[0];
        for(int i=1; i<4; ++i){
            if(query(ret) > query(idxs[i])) ret = idxs[i];
        }
        return ret;

    }

    T val_q(int x1, int x2, int y1, int y2){
        point pos = pos_q(x1,x2,y1,y2);
        return v[pos.y][pos.x];
    }
};

rmq2d<long long> rmq;

set<tuple<int, int, int ,int>> cac;
vector<vector<long long>> v(MAX_DIM-5,vector<long long>(MAX_DIM-5,0));

long long ret = 0;
int nq = 0;

void go(int x1, int x2, int y1, int y2){
    if(x1 >= x2 || y1>=y2) return;

    if(!cac.insert(make_tuple(x1,y1,x2,y2)).second) return;
    ++nq;

    auto p = rmq.pos_q(x1, x2, y1, y2);
    long long cur = v[p.y][p.x]*(x2-x1)*(y2-y1);
    if(cur > ret){
        cout << x1 << "-" << x2 << ", " << y1 << "-" << y2 << " h=" << v[p.y][p.x] <<  " :" << cur << endl;
        ret = cur;
    }

    go(p.x+1, x2, y1, y2);
    go(x1, p.x, y1, y2);
    go(x1, x2, p.y+1, y2);
    go(x1, x2, y1, p.y);
}


int main(){
    int W = (int)v[0].size();
    int H=(int)v.size();
    for(int y=0; y<H;++y){
        for(int x=0; x<W; ++x){
            v[y][x] = rand()%10000;
        }
    }
    rmq = rmq2d<long long>(v);
    go(0,W, 0, H);
    cout << "nq:" << nq << endl;
}

To be honest it's a little hard to interpret it all, partially because the variables are abbreviations in a lot of cases. So `ret` is your return value which is the volume. When you set `long long cur = v[p.y][p.x]*(x2-x1)*(y2-y1);`, this assumes that the current rectangle and the height of the point enclose a cuboid which is filled. But how does the algorithm ensure that all the points in this rectangle have at least the height of `v[p.y][p.x]`? Also, you're still splitting everything once along the x-axis and once along the y-axis in `go()`, so how are all possible rectangles queried? — Jan Schultke, Feb 12 '20 at 07:52
To clarify the previous comment. When you query a point `p` in a heightmap for a height `h` and pick a width `w` and depth `d`; the resulting cuboid might have a volume of `w*h*d`. However, this is only the case if every single point `p + (i, j)` where `i < w, j < d` has a height of `h_ij >= h`. Otherwise it's not a completely filled cuboid. And I can't see how the algorithm accounts for that. — Jan Schultke, Feb 12 '20 at 07:58
@J.Schultke that's the magic of range minimum query - the rmq variable. `pos_q` finds, in constant time, the minimum value within the rectangle. — maniek, Feb 13 '20 at 09:23
Your algorithm doesn't work for me. One significant problem is that `lenx` is calculated with exclusive distance, not inclusive distance. It should be `x2 - x1 + 1`, because two points at the same location still make up a `1x1x1` cuboid and thus have volume. I have fixed this issue, but it leads to underflow problems. When compared to an implementation of a stupid algorithm, it is obviously much faster but also doesn't find the optimal results:. Here is my code, maybe the modifications I have made caused some problems: https://gist.github.com/Eisenwave/55b73bdac52f88135a12a468c6447569 — Jan Schultke, Feb 14 '20 at 14:50
I also have concerns regarding the quadtree-like pattern in which you split the rectangle. First of all, what is the meaning of splitting at the lowest point? Since the heightmap is all completely random noise, splitting at that point seems as good as splitting at any other point. Also, when you find the lowest point, it will be the minimum in sub-rectangles. So the next half will also use the same point since it's still the minimum and the next half, etc., recursively. If your global, unique minimum is at `(99,99)` in a `100x100` map, your algorithm only does one query. — Jan Schultke, Feb 14 '20 at 15:12
No, you messed something up with your modifications. lenx is supposed to be calcualted the way it is, `x2` or `y2` is always _outside_ of a rectangle in my code. I just checked, my code agrees with a dumb brute-force solution. I'm not into debugging your changes. — maniek, Feb 16 '20 at 22:55

How can you find the cuboid with the greatest volume in a heightmap? (with low complexity)

2 Answers2