
Given a binary matrix (values of 0 or 1), adjacent entries of 1 denote “hills”. Given also some number k, find the minimum number of 0's you need to “flip” to 1 in order to form a hill of size at least k.

Edit: For clarification, adjacent means left-right-up-down neighborhoods. Diagonals do not count as adjacent. For example,

[0 1]
[0 1]

is one hill of size 2,

[0 1]
[1 0]

defines 2 hills of size 1,

[0 1]
[1 1]

defines 1 hill of size 3, and

[1 1]
[1 1]

defines 1 hill of size 4.

Also for clarification, size is defined by the area formed by the adjacent blob of 1's.
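As a minimal sketch of this definition (not part of the original question), the sizes of all hills can be found by flood-filling each 4-connected group of 1's. The function name `hill_sizes` and the list-of-lists grid representation are assumptions made for illustration.

```python
from collections import deque

def hill_sizes(grid):
    """Return the sizes of all hills (4-connected groups of 1's) in grid.

    grid is assumed to be a non-empty list of equal-length lists of 0/1.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill one hill with BFS, counting its cells.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                # Only up/down/left/right neighbours count; diagonals do not.
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes
```

For a 2×2 grid with rows `[0, 1]` and `[1, 0]`, this returns two hills of size 1, because the two 1's touch only diagonally.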

My initial solution transforms each existing hill into a node of a graph, with the cost of each edge being the minimal path between the two hills (the number of 0's to flip to join them). A DFS (or similar algorithm) is then performed to find the minimum cost.

This fails in cases where choosing some path reduces the cost for another edge, and solutions to combat this (that I can think of) are too close to a brute force solution.
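To make the failure mode concrete, here is a hedged sketch of how the pairwise costs might be computed, using a 0-1 BFS in which stepping onto a 0 costs one flip and stepping onto a 1 is free. The helper name `flips_to_reach` and the example grid are assumptions for illustration, not the code the approach above actually used.

```python
from collections import deque

def flips_to_reach(grid, sources):
    """0-1 BFS from the given source cells (assumed to be cells of one hill).

    Returns dist, where dist[r][c] is the fewest 0's that must be flipped
    on a path from any source cell up to and including cell (r, c).
    """
    rows, cols = len(grid), len(grid[0])
    inf = float("inf")
    dist = [[inf] * cols for _ in range(rows)]
    dq = deque()
    for r, c in sources:
        dist[r][c] = 0
        dq.append((r, c))
    while dq:
        y, x = dq.popleft()
        for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if 0 <= ny < rows and 0 <= nx < cols:
                step = 1 - grid[ny][nx]  # entering a 0 costs one flip
                if dist[y][x] + step < dist[ny][nx]:
                    dist[ny][nx] = dist[y][x] + step
                    # Free steps go to the front, costly steps to the back.
                    if step == 0:
                        dq.appendleft((ny, nx))
                    else:
                        dq.append((ny, nx))
    return dist

# Three single-cell hills: one on top, two on the bottom row.
grid = [[0, 1, 0],
        [0, 0, 0],
        [1, 0, 1]]
before = flips_to_reach(grid, [(0, 1)])

# Join the two bottom hills by flipping the cell between them.
joined = [row[:] for row in grid]
joined[2][1] = 1
after = flips_to_reach(joined, [(0, 1)])
```

In this grid the top hill is 2 flips away from each bottom hill (`before[2][0]` and `before[2][2]`), but once the bottom hills are joined it is only 1 flip away (`after[2][0]`). The edge costs therefore depend on which edges were already chosen, which is why a fixed-cost graph can overestimate the answer.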

The Monkey
  • How is adjacency defined? Left-right-above-below, or corners as well? – Helium_1s2 Jun 06 '18 at 14:54
  • So how do you define a hill? Does a hill of size 2 have two 1's that are vertically adjacent (like a spike)? You need to show us an example matrix that has some of those "hills". – Jim Mischel Jun 06 '18 at 14:58
  • Updated. Sorry for the ambiguity. – The Monkey Jun 06 '18 at 15:05
  • How many hills and what size/s is a `1` with one `1` above it and one `1` to its left (three `1`s altogether)? – גלעד ברקן Jun 06 '18 at 15:12
  • 1 hill of size 3 – The Monkey Jun 06 '18 at 15:14
  • Looks like https://en.wikipedia.org/wiki/Steiner_tree_problem modification. – DAle Jun 06 '18 at 16:01
  • Seems like a harder version of the traveling salesman problem (TSP). You have an initial distance between cities, and a value for each city. But in addition, you have the ability to build roads (affecting the distances), and the ability to build suburbs (increasing the value). Given that TSP is already hard, I don't see much hope for an efficient solution. But you *can* use the [branch and bound](https://en.wikipedia.org/wiki/Branch_and_bound) technique. To find an initial upper bound, find the largest hill, and subtract its size from `k`. That puts an upper limit on the number of flips. – user3386109 Jun 06 '18 at 16:16
  • Is there a maximum grid size, number of hills and size k, or are you looking for a general solution? – m69 ''snarky and unwelcoming'' Jun 16 '18 at 22:38
  • @m69 I am looking for a general solution. Suppose the grid size, the number of hills and `k` are arbitrarily large. – The Monkey Jun 21 '18 at 18:25
  • shouldn't array be replaced by matrix in the title? – Walter Tross Jan 27 '20 at 15:16
  • The rectilinear Steiner tree problem seeks to find the minimal length lines to connect `N` points in the plane, but this problem also requires that the total length of the lines and points is at least `k`. Another issue with this problem vs a rectilinear Steiner tree is that in this problem, we are not required to connect all points. (@DAle) – גלעד ברקן Jan 29 '20 at 19:55
  • Compared to the rectilinear Steiner tree problem, another complication is that the same hills can be connected in different ways for different values of k. Consider an L-shape and two points - coordinates (0,0), (0,1), (1,0), (2,0), (1,4), (3,2). As far as I see 4->0, 7->2, 10->4, and the latter two start from different ends of the L – Hans Olsson Jan 31 '20 at 15:06

2 Answers

3

Your problem is closely related to the rectilinear Steiner tree problem.

A Steiner tree connects a set of points together using line segments, minimising the total length of the line segments. The line segments can meet in arbitrary places, not necessarily at points in the set (so it is not the same thing as a minimum spanning tree). For example, given three points at the corners of an equilateral triangle, the Euclidean Steiner tree connects them by meeting in the middle:

[Figure: Euclidean Steiner tree]

A rectilinear Steiner tree is the same, except you minimise the total Manhattan distance instead of the total Euclidean distance.

In your problem, instead of joining your hills with line segments whose length is measured by Euclidean distance, you are joining your hills by adding pixels. The total number of 0s you need to flip to join two cells in your array is equal to the Manhattan distance between those two cells, minus 1.
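A minimal sketch of that count, assuming the two cells are given as `(row, column)` tuples and that every cell on the chosen shortest path between them is currently 0. The name `flips_to_join` is hypothetical.

```python
def flips_to_join(a, b):
    """Number of 0's to flip to join cells a and b along a shortest
    rectilinear path, assuming all cells strictly between them are 0."""
    (r1, c1), (r2, c2) = a, b
    manhattan = abs(r1 - r2) + abs(c1 - c2)
    # The two endpoints are already 1, so only the cells strictly between
    # them need flipping; max() guards the degenerate case a == b.
    return max(manhattan - 1, 0)
```

For example, cells `(0, 0)` and `(2, 3)` are at Manhattan distance 5, so joining them takes 4 flips, while two adjacent cells need none.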

The rectilinear Steiner tree problem is known to be NP-complete, even when restricted to points with integer coordinates. Your problem is a generalisation, except for two differences:

  • The "minus 1" part when measuring the Manhattan distance. I doubt that this subtle difference is enough to bring the problem into a lower complexity class, though I don't have a proof for you.
  • The coordinates of your integer points are bounded by the size of the matrix (as pointed out by Albert Hendriks in the comments). This does matter — it means that pseudo-polynomial time for the rectilinear Steiner tree problem would be polynomial time for your problem.

This means that your problem may or may not be NP-hard, depending on whether the rectilinear Steiner tree problem is weakly NP-complete or strongly NP-complete. I wasn't able to find a definitive answer to this in the literature, and there isn't much information about the problem other than in academic literature. It does at least appear that there isn't a known pseudo-polynomial time algorithm, as far as I can tell.

Given that, your most likely options are some kind of backtracking search for an exact solution, or applying a heuristic to get a "good enough" solution. One possible heuristic as described by Wikipedia is to compute a rectilinear minimum spanning tree and then try to improve on the RMST using an iterative improvement method. The RMST itself gives a solution within a constant factor of 1.5 of the true optimum.
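For very small matrices, an exact answer can be obtained by plain exhaustive search, which is useful mainly as a reference to check a backtracking search or heuristic against. The sketch below is an illustration under that assumption, not an efficient algorithm: it tries every set of 0's in increasing size and stops at the first set that produces a hill of size at least `k`. The names `largest_hill` and `min_flips_brute_force` are hypothetical.

```python
from itertools import combinations

def largest_hill(grid):
    """Size of the largest 4-connected group of 1's in grid (0 if none)."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    best = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1 and (r, c) not in seen:
                # Depth-first flood fill of one hill.
                seen.add((r, c))
                stack = [(r, c)]
                size = 0
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                best = max(best, size)
    return best

def min_flips_brute_force(grid, k):
    """Fewest 0 -> 1 flips that yield a hill of size >= k, or None if the
    grid is too small. Exponential time: only for tiny grids.

    grid is assumed to be a non-empty list of equal-length lists of 0/1.
    """
    zeros = [(r, c) for r, row in enumerate(grid)
             for c, v in enumerate(row) if v == 0]
    for n in range(len(zeros) + 1):
        for chosen in combinations(zeros, n):
            trial = [row[:] for row in grid]
            for r, c in chosen:
                trial[r][c] = 1
            if largest_hill(trial) >= k:
                return n  # sets are tried in increasing size, so n is minimal
    return None
```

On the 3×3 grid with rows `[0, 1, 0]`, `[0, 0, 0]`, `[1, 0, 1]` and `k = 5`, this returns 2: flipping the centre cell and the bottom-middle cell joins all three hills through a shared junction, which is cheaper than joining them pairwise.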

kaya3
  • There's a more important difference between rectilinear Steiner trees and OPs problem: OP's problem uses dense matrix representation and Steiner trees use sparse matrix representation. In other words, OP's problem is similar to 2d rectilinear Steiner trees but with integer coordinates bounded by O(log n). – Albert Hendriks Jan 27 '20 at 13:28
  • @AlbertHendriks That's a good point. Can you explain the O(log n) bound? – kaya3 Jan 27 '20 at 17:56
  • e.g. if the matrix has width 1024, the rightmost index (1023 or 1111111111 in binary) could be represented using 10 = log_2(1024) bits. So technically I should say that the *size* of the coordinates is bounded by the logarithm of the input size, whereas the size of Steiner tree coordinates is only bounded by (linearly) the input size. – Albert Hendriks Jan 28 '20 at 09:55
  • @AlbertHendriks Yes, that seems correct, thank you! I've edited. I wasn't able to find a definitive answer to whether it's weakly or strongly NP-complete, but in practice if there is no definitive answer then there's no known pseudopolynomial time algorithm to adapt into a polynomial time algorithm to the OP's problem. So I think my conclusion still holds, but I'm a bit less sure about it since I might have missed something in the literature. – kaya3 Jan 28 '20 at 17:53
  • @AlbertHendriks and kaya3, could you please speak to the added requirement here: whereby the rectilinear Steiner tree problem seeks to find the minimal length lines to connect `N` points in the plane, this problem also requires that the total length of the lines and points is at least `k`. – גלעד ברקן Jan 29 '20 at 19:49
  • Another issue is that in this problem, vs a rectilinear Steiner tree, we are not required to connect all points. (@AlbertHendriks and kaya3) – גלעד ברקן Jan 29 '20 at 19:52
  • @גלעד ברקן It's a generalisation (except for the other mentioned differences). Starting from P isolated pixels, there is a rectilinear Steiner tree using S additional pixels if and only if the minimum way to make a hill of size P + S uses S additional pixels. If you only add S pixels then you must connect all P of the original pixels, otherwise the hill won't have size P + S. So a hill-making algorithm can be used to test the existence of a Steiner tree of size S, except for the -1 issue. That's the decision version of the Steiner tree problem. – kaya3 Jan 30 '20 at 02:03
  • Not sure I follow. In the case of a solution that uses a subset of the original pixels, doesn't there have to be a decision as to which ones feature in that subset? Is there literature on that vis-a-vis the rectilinear Steiner tree problem? – גלעד ברקן Jan 30 '20 at 02:20
  • @גלעד ברקן Suppose you are searching for a hill of size P + S and it doesn't use all P of the original pixels, then it must add more than S pixels. That implies there is no Steiner tree of size S, otherwise the hill-making algorithm added more pixels than necessary (except for the -1 issue). – kaya3 Jan 30 '20 at 02:24
  • Please forgive my obtuseness and thank you for explaining but I still don't see why we need the Steiner tree idea. For example, if `k = 4` and we have one hill of size 2 a pixel away from another pixel and 1000 constellations of hills of size 2, each more than one pixel away from any other hill, we can solve the problem by connecting the one hill that's a pixel away from the single pixel. How is the Steiner tree used here? – גלעד ברקן Jan 30 '20 at 02:43
  • @גלעדברקן The point is not that you can use a rectilinear Steiner tree algorithm to solve all cases of the hill-making problem; it's the other way round. You can use a hill-making algorithm to (sort of) solve the rectilinear Steiner tree problem, because it's (roughly) a special case of the hill-making problem. That means hill-making in the general case is at least as hard as finding a Steiner tree, which gives some information about what kind of algorithm might or might not work (e.g. any greedy algorithm is definitely not exact). – kaya3 Jan 30 '20 at 05:11
  • Oh, I think I see what you mean. Cool, tx. – גלעד ברקן Jan 30 '20 at 05:31
-1

A hill is composed of four sequences of 1's:

[Figure: four sequences of 1's around a central cell]

The right sequence is composed of r 'bits', the up sequence has u bits, and so on.

A hill of size k satisfies k = 1 + r + l + u + d (one central cell plus the four sequences), where each value v satisfies 0 <= v < k.

The problem is combinatorial. For each cell, all combinations of {r, l, u, d} that satisfy this relation should be tested.

When testing a combination at a cell, count the existing 1's in each sequence of the combination, since they do not need to be flipped. This also allows some other combinations to be skipped early.

Ripi2
  • 1. You're restricting the problem: `adjacent entries of 1 denote “hills”` is different from your statement `A hill is composed by four sequences of 1's`. 2. The fact that the problem **has** a combinatorial solution doesn't mean you **have to** enumerate all combinations, **unless you prove otherwise** – fjardon Jun 08 '18 at 07:35
  • @fjardon In the samples the OP posted you see that `1`'s at diagonals are not part of the hill. So it implies that only `1`'s in the sequences count. Notice that a sequence can be void, not `1` at all. – Ripi2 Jun 08 '18 at 12:21
  • Think about a square of `1`. According to OP this is a hill of size 4. Not according to your definition. – fjardon Jun 08 '18 at 12:37
  • @fjardon Maybe you're right, maybe not. The OP's samples are not clear about diagonals. And "adjacent" is not clear enough. – Ripi2 Jun 08 '18 at 12:41
  • Quoting the OP: `adjacent means left-right-up-down neighborhoods`. So it is absolutely clear that a square of `1` *is* a Hill of size 4. – fjardon Jun 08 '18 at 12:44
  • @fjardon Sorry, I can understand it like "left" or "right" or "up" or "down", Not combined like "left+up". But the sentence does not exclude your POV. – Ripi2 Jun 08 '18 at 12:46
  • I think the three examples well-define a hill. I've provided further clarification. I also don't believe that enumerating is the most efficient without sufficient proof. – The Monkey Jun 21 '18 at 18:30
  • To set it clear, is a 2x2 block with four `1` a hill of size 4 or 2 hills of size n? Edit your question with this case. – Ripi2 Jun 21 '18 at 18:35