7

I'm getting quite interested in search algorithms and backtrack programming. For now, I have implemented Algorithm X (see my other post here: Determine conflict-free sets? ) to solve an exact cover problem. This works very well but I'm now interested in solving this with a more basic variant of backtracking. I just can't figure out how this can be done. The problem description is the same as before:

Suppose you have a bunch of sets, whereas each set has a couple of subsets.

Set1 = { (banana, pineapple, orange), (apple, kale, cucumber), (onion, garlic) }

Set2 = { (banana, cucumber, garlic), (avocado, tomato) }

...

SetN = { ... }

The goal now is to select one subset from each set, whereas each subset must be conflict free with any other selected subset (one element is not contained in any of the other chosen subsets).

As an example, I wrote two Java classes that. The sets are identified by a single character and the elements are plain numbers.

I specifically have two problems:

  • How to iterate over all sets/subsets in such a way that the use of a heuristic is possible (choose subset with minimum elements, minimum cost, ...)
  • How to maintain the selected sets+subsets and their contained elements.

All other examples I can find are either Sudoku or n-Queens and are all using fixed for-loops. In addition to that, I was thinking about a rather general approach where a function "isPossiblePartialSolution" may be used to check, if a chosen path/set may be in conflict with a previously chosen subset/element.

Here are the two Java classes:

import java.util.ArrayList;

public class Main {

public static void main(String[] args) {

    ArrayList<Set> allSets = buildRandomTest();

    for(Set r : allSets) {
        System.out.print("Set with id: " + r.id + " is subset in collection: " + r.name + " and contains: ");
        for(Integer n : r.listOfElements) {
            System.out.print(" " + n + " ");
        }
        System.out.println();
    }

}

public static int myRandomRange(int low, int high) {
    return (int) (Math.random() * (++high - low) + low);
}

public static ArrayList<Set> buildRandomTest() {

    ArrayList<Set> mySet = new ArrayList<Set>();

    int numberOfSets = myRandomRange(10, 12);

    for(int i=0; i<numberOfSets; i++) {
        String nameOfSet = Character.toString((char) myRandomRange(65, 67));
        Set tmp = new Set(nameOfSet, i);

        ArrayList<Integer> listOfElements = new ArrayList<Integer>();
        int elementsInList = myRandomRange(2, 4);

        for(int j=0; j<elementsInList; j++) {
            listOfElements.add(myRandomRange(1,30));
        }

        tmp.listOfElements = listOfElements;
        mySet.add(tmp);
    }

    return mySet;
}

}

and

import java.util.ArrayList;

public class Set {

public String name;
public int id;
public ArrayList<Integer> listOfElements;

public Set(String name, int id) {
    this.name = name;
    this.id = id;
    listOfElements = new ArrayList<Integer>();
}

}
Community
  • 1
  • 1
user26372
  • 279
  • 1
  • 2
  • 14

1 Answers1

2

EDIT: Actually it sounds like you've already implemented Dancing Links (using the name "Algorithm X"), so I'm not sure what you're asking for. By "a more basic variant of backtracking", do you mean "a slower variant"? Dancing Links is about as basic as you can get....

ORIGINAL ANSWER: If I were doing this, I'd try to reduce it to an exact-cover problem, which could be solved with Dancing Links. I.e., construct a matrix of 0s and 1s, find a subset of its rows such that there is exactly one 1 in each column, and then convert that row-set back into an answer to your original problem.

The following answer is written in C++(11), but hopefully you can see how to translate it to Java. Implementing Dancing Links in Java is left as an exercise for the reader and/or your search engine of choice.

enum Element {
    apple, avocado, banana, cucumber, garlic,
    kale, onion, orange, pineapple, NUM_ELEMENTS
};

std::vector<std::vector<std::set<Element>>> sets = {
    { {banana, pineapple, orange}, {apple, kale, cucumber}, {onion, garlic} },
    { {banana, cucumber, garlic}, {avocado, tomato} },
    ...
};

int rows, columns;

// One row per subset, obviously...
rows = 0;
for (auto &vs : sets) {
    rows += vs.size();
}
// ...plus N more rows for "vegetable X is not in any selected subset".
rows += NUM_ELEMENTS;

// One column per vegetable, obviously...
columns = NUM_ELEMENTS;
// ...plus N more columns for "we've chosen a subset from set X".
columns += sets.size();

Matrix M(rows, columns);

M.initialize_to_all_zeros();

int r = 0;
for (int i=0; i < sets.size(); ++i) {
    for (int j=0; j < sets[i].size(); ++j) {
        auto &subset = sets[i][j];
        M[r][NUM_ELEMENTS + i] = 1;  // the subset is a member of set i
        for (Element veg : subset) {
            M[r][veg] = 1;  // the subset contains this element
        }
        ++r;
    }
}
for (Element veg = apple; veg < NUM_ELEMENTS; ++veg) {
    M[r][veg] = 1;
    ++r;
}

// Now perform Dancing Links on the matrix to compute an exact cover.
std::set<int> row_indices = dancing_links(M);

// Now print out the subsets.
r = 0;
for (int i=0; i < sets.size(); ++i) {
    for (int j=0; j < sets[i].size(); ++j) {
        if (row_indices.find(r) != row_indices.end()) {
            print_set(sets[i][j]);
        }
        ++r;
    }
}
// And print the unused vegetables, just for kicks.
for (Element veg = apple; veg < NUM_ELEMENTS; ++veg) {
    if (row_indices.find(r) != row_indices.end()) {
        std::cout << "Vegetable " << veg << " was not mentioned above.\n";
    }
    ++r;
}
Quuxplusone
  • 19,419
  • 5
  • 72
  • 137
  • The idea of backtracking is quite general and can be applied to many problems. Dancing Links can only be applied to exact cover problems. It should be possible to implement this using a 'normal' backtracking approach (I know, this will be slower compared to DLX!). From my understanding, we need a function then which tells us if there is a conflict or not with any of the previous decisions. In addition to that, this allows to use a different heuristic! – user26372 May 13 '13 at 09:04
  • Using a different heuristic is what I'm trying to achieve. Just image, 'buying' one set or another is cheaper (so unlike Dancing Links, we would pick the set with minimum cost and not the one, with minimum elements). This can't be done using Dancing Links. – user26372 May 13 '13 at 09:08
  • @user26372 Dancing Links *typically* uses the heuristic "check out the column with the fewest 1s first" (i.e., try the rows that will satisfy the most difficult constraint first), but you can certainly use a different heuristic if you don't like that one. See [my own Dancing Links implementation in C, here](http://www.club.cc.cmu.edu/~ajo/free-software/snippets/dance.c), and imagine how you might change the code under `#if USE_HEURISTIC` to use a different heuristic. – Quuxplusone May 13 '13 at 17:40
  • It won't work out because a column does not say anything about the affected rows. For using some customized metric, you would have to choose rows based on some property instead of columns. I'm indeed looking for a variant that will be slower than Dancing Links because I would like to use some customized heuristics. It should be possible to iterate over all sets like it is done for Sudoku, n-Queens, etc without actually using dancing links. – user26372 May 13 '13 at 19:04
  • Check out the algorithm description (in the box) [here](http://en.wikipedia.org/wiki/Knuth%27s_Algorithm_X). Notice that steps 2 and 3 can be replaced with the single step "Choose a row *r* (nondeterministically)." Implement this in your Dancing Links solver (i.e., always delete rows in the order *row 1, row 2, row 3...*, ordered by whatever "cost" function you like). – Quuxplusone May 13 '13 at 19:23