I have a specific problem in mind that I want to solve using Knuth's Algorithm X. However, I struggle to translate my problem into suitable constraints that make up the incidence matrix for Algorithm X to operate on.
For my sports club's summer tournament, I want to come up with a schedule that puts players into groups of four, without any pair of players ending up in the same group again in a later round.
I figured that this translates nicely into an exact cover problem like this:
- Each player is represented by one column of the matrix.
- Each group of four players (disregarding their order) is one row of the matrix.
- A matrix cell holds a 1 when that column's player is a member of that row's group.
After setting this up for 20 players, I've got an incidence matrix with 20 columns and 4845 rows (969 rows per player/column).
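The row counts above can be reproduced with a few lines. This is only a minimal sketch of the matrix setup, not the DLX structure itself; the helper name `build_rows` is made up for illustration.

```python
from itertools import combinations


def build_rows(players, group_size=4):
    """Return every unordered group of `group_size` players as a frozenset.

    Each frozenset stands for one row of the incidence matrix: the row
    has a 1 in exactly the columns of the players it contains.
    """
    return [frozenset(group) for group in combinations(players, group_size)]


players = range(1, 21)
rows = build_rows(players)

# Number of rows in the whole matrix.
total_rows = len(rows)

# Number of rows that have a 1 in the column of player 1.
rows_for_player_1 = sum(1 for row in rows if 1 in row)

print(total_rows, rows_for_player_1)
```

By symmetry every other column has the same number of rows as the one for player 1.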
Algorithm X finds a solution just fine, but this covers only one (the first) round. Letting the algorithm continue produces more alternative solutions for the same round, which is not of interest to me. So I built an iterator around the algorithm that takes the solution and removes rows from the incidence matrix based on player overlap: whenever a row of the matrix shares at least 2 players with any group from the solution, that row is removed. After the first run of the algorithm, the matrix is culled down to 1280 rows. Running Algorithm X again finds the next solution, and so on, until it no longer finds one.
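The culling step on its own looks roughly like this. It is a sketch that assumes rows are plain tuples of players; `cull` is a hypothetical helper name, and `first_round` is the first lineup my function returns for 20 players.

```python
from itertools import combinations


def cull(rows, lineup):
    """Drop every row that shares two or more players with a played group."""
    played = [set(group) for group in lineup]
    return [
        row for row in rows
        if all(len(group & set(row)) < 2 for group in played)
    ]


rows = list(combinations(range(1, 21), 4))
first_round = [
    (1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12),
    (13, 14, 15, 16), (17, 18, 19, 20),
]

remaining = cull(rows, first_round)
print(len(remaining))
```

Every surviving row takes at most one player from each group of the first round, which is where the 1280 comes from: 5 ways to leave one group out, times 4 choices in each of the other four groups.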
To cut a long story short, this approach isn't a proper application of the exact cover problem, because I have to find partial solutions iteratively. A correct exact cover formulation should include the sequence of playing rounds somehow. Why? Because right now I do not explore the full range of possible solutions! A player count of 20 is the best example of that. My iterated use of Algorithm X finds solutions for just 3 successive rounds. Yet I know that there are at least 5 when different intermediate solutions are chosen. This is precisely the job that I had hoped Algorithm X could do for me. With the above approach, there is no backtracking between playing rounds.
Even though the question is abstract enough that code shouldn't be necessary, here is my implementation of Knuth's DLX (Algorithm X with Dancing Links) in Python:
from functools import reduce  # builtin on Python 2, must be imported on Python 3
from itertools import combinations


def dancing_links(players):
    """
    Implement the dancing links algorithm as described by Donald Knuth, which
    attempts to solve an exact cover problem exhaustively in an efficient way.
    Adapted for my use case, I define the incidence matrix as follows:
    * Columns are players.
    * Rows are groups of players.
    * A 1 at the intersection of a group and a player means that the player
      is a member of that group.
    * One group contains exactly four players, i.e. each row has
      exactly four 1s.
    * Repeatedly solve the exact cover problem for a reduced set of groups,
      where each round the total set of groups is filtered for illegal
      groups. An illegal group features at least two players that
      have already played together in a round.
    """

    class FoundSolution(Exception):
        "Use the exception to abort recursive stacks"
        pass

    # Dancing links is based on "doubly linked lists" intersecting
    # with each other. Python doesn't have this kind of data structure
    # and implementing it is quite expensive. E.g. each field of the incidence
    # matrix could be a Node which has pointers in all four directions.
    # The Node class with 6 attributes (four pointers, a name and arbitrary
    # data) needs to undergo countless name lookups, which is a slow process
    # in Python. So instead, I represent each node without a class definition
    # as a dict.
    #
    # Since we're walking over so many doubly linked lists, starting from
    # any of its nodes, we need to remember where we started and iterate
    # through all of them. Without this iterator function, that would
    # clutter the code below a lot.
    def iter_dll(start, direction='right'):
        node = start[direction]
        # Compare object identity, not equality. Otherwise Python
        # would try to do a deep comparison of two dicts, which is impossible
        # due to the circular referencing.
        while node is not start:
            yield node
            node = node[direction]

    def cover(column):
        """
        Cover a column by removing its head node from the control row and
        removing each of its rows from other columns that intersect.
        """
        column['left']['right'] = column['right']
        column['right']['left'] = column['left']
        for r in iter_dll(column, 'down'):
            for c in iter_dll(r):
                c['up']['down'] = c['down']
                c['down']['up'] = c['up']

    def uncover(column):
        # Undo the changes caused by a call to cover(column) by re-linking
        # each node with its remembered predecessor and successor.
        for r in iter_dll(column, 'up'):
            for c in iter_dll(r, 'left'):
                c['up']['down'] = c['down']['up'] = c
        else:
            column['left']['right'] = column['right']['left'] = column

    def search(i, root):
        if root['right'] is root:
            # The only way to exit the complete recursion stack is an exception.
            raise FoundSolution
        for c in iter_dll(root):
            cover(c)
            for r in iter_dll(c, 'down'):
                lineup.append(r)
                for j in iter_dll(r):
                    cover(j['data'])
                search(i+1, root)
                lineup.pop()
                for j in iter_dll(r, 'left'):
                    uncover(j['data'])
            else:
                uncover(c)

    def generate_incidence_matrix(groups):
        # The gateway to our web of doubly linked lists.
        root = {'name': 'root', 'data': None}
        # Close the circle in left and right directions, so we can keep the
        # circle closed while injecting new nodes.
        root['right'] = root['left'] = root
        # The control row of column headers is created by attaching each new
        # header to the previous one.
        for player in players:
            n = {
                'name': 'Headnode {}'.format(player),
                'data': player,
                'right': root,
                'left': root['left'],
            }
            n['up'] = n['down'] = n
            root['left']['right'] = root['left'] = n
        # Now append nodes to each column header node in our control row -
        # one for each player of a distinct group of four players.
        rnmbr = 0
        # Seed for new row nodes
        seed = {'name': 'seed', 'data': None}
        for g in groups:
            rnmbr += 1
            seed['right'] = seed['left'] = seed
            # Iterate through the control nodes for each of the four players.
            for header in (m for m in iter_dll(root) for p in g if m['data'] == p):
                n = {
                    # Choose a name that identifies the row and column for
                    # this new node properly.
                    'name': 'R-{},C-{}'.format(rnmbr, header['data']),
                    'data': header,
                    'up': header['up'],
                    'down': header,
                    'left': seed,
                    'right': seed['right']
                }
                header['up']['down'] = header['up'] = n
                seed['right']['left'] = seed['right'] = n
            else:
                # Extract the seed from this row
                seed['right']['left'] = seed['left']
                seed['left']['right'] = seed['right']
        return root

    groups = tuple(combinations(players, 4))
    # Floor division keeps this an integer on both Python 2 and 3.
    groups_per_round = len(players) // 4
    lineups = []
    while len(groups) >= groups_per_round:
        root = generate_incidence_matrix(groups)
        lineup = []
        try:
            search(0, root)
        except FoundSolution:
            # Turn the chosen row nodes into a flat list of players, then
            # slice that list back into sorted groups of four.
            lineup = reduce(list.__add__, ([r['data']['data']] + [n['data']['data'] for n in iter_dll(r)] for r in lineup))
            lineup = tuple(tuple(sorted(lineup[i:i + 4])) for i in range(0, len(lineup), 4))
            lineups.append(lineup)
            # Keep only groups that share fewer than two players with every
            # group of the round just found.
            groups = tuple(group for group in groups if all(len(g.intersection(set(group))) < 2 for g in (set(s) for s in lineup)))
        else:
            break
    return lineups
Given a list of players, this function returns the successive rounds it finds, one lineup per round, until the options are exhausted. Sadly, it isn't as fast as I'd hoped for, but it was a nice programming exercise for me. :-)
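The unlink/relink trick that cover() and uncover() rely on is easier to see in isolation. Below is a minimal sketch on a single circular list of dict nodes, using the same 'left'/'right' keys as my code; the helper names `make_ring`, `ring_names`, `unlink` and `relink` are made up for this illustration only.

```python
def make_ring(names):
    """Build a circular doubly linked list of dict nodes behind a root node."""
    root = {'name': 'root'}
    root['left'] = root['right'] = root
    for name in names:
        node = {'name': name, 'left': root['left'], 'right': root}
        root['left']['right'] = node
        root['left'] = node
    return root


def ring_names(root):
    """Walk to the right from the root and collect the node names."""
    names = []
    node = root['right']
    while node is not root:  # identity check, never a deep dict comparison
        names.append(node['name'])
        node = node['right']
    return names


def unlink(node):
    """Remove the node from the ring; the node keeps its own pointers."""
    node['left']['right'] = node['right']
    node['right']['left'] = node['left']


def relink(node):
    """Put the node back, using the pointers it still remembers."""
    node['left']['right'] = node
    node['right']['left'] = node


root = make_ring(['A', 'B', 'C'])
b = root['right']['right']

unlink(b)
after_unlink = ring_names(root)

relink(b)
after_relink = ring_names(root)

print(after_unlink, after_relink)
```

Because an unlinked node still points at its old neighbours, undoing a removal needs no extra bookkeeping, as long as nodes are restored in the reverse order of their removal.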
Calling the dancing_links() function as defined above will yield the following output...
>>> pprint.pprint(dancing_links(range(1,21)))
[((1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16), (17, 18, 19, 20)),
((1, 5, 9, 13), (2, 6, 10, 17), (3, 7, 14, 18), (4, 11, 15, 19), (8, 12, 16, 20)),
((1, 6, 11, 14), (2, 5, 12, 18), (3, 8, 13, 19), (4, 9, 16, 17), (7, 10, 15, 20))]
What I had expected is more like...
[((1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16), (17, 18, 19, 20)),
((1, 5, 9, 13), (2, 6, 10, 17), (3, 7, 14, 18), (4, 11, 15, 19), (8, 12, 16, 20)),
((1, 12, 15, 18), (2, 5, 16, 19), (3, 6, 9, 20), (4, 7, 10, 13), (8, 11, 14, 17)),
((1, 7, 11, 20), (2, 8, 13, 18), (3, 5, 10, 15), (4, 9, 16, 17), (6, 12, 14, 19)),
((1, 8, 10, 19), (2, 7, 9, 15), (3, 12, 13, 17), (4, 5, 14, 20), (6, 11, 16, 18))]
Note that it doesn't have to be this exact solution. It is just an example solution that I've found during my attempts to eventually generate a schedule for an arbitrary number of players.
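To check a candidate schedule against the rules, something like the following sketch could be used. The names `repeated_pairs` and `is_valid_schedule` are hypothetical helpers, and `expected` is the five-round example from above.

```python
from itertools import combinations


def repeated_pairs(schedule):
    """Return the set of player pairs that share a group more than once."""
    seen = set()
    repeats = set()
    for lineup in schedule:
        for group in lineup:
            for pair in combinations(sorted(group), 2):
                if pair in seen:
                    repeats.add(pair)
                seen.add(pair)
    return repeats


def is_valid_schedule(schedule, players, group_size=4):
    """Every round must cover each player exactly once, and no pair may repeat."""
    everyone = sorted(players)
    for lineup in schedule:
        if any(len(group) != group_size for group in lineup):
            return False
        if sorted(p for group in lineup for p in group) != everyone:
            return False
    return not repeated_pairs(schedule)


expected = [
    ((1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16), (17, 18, 19, 20)),
    ((1, 5, 9, 13), (2, 6, 10, 17), (3, 7, 14, 18), (4, 11, 15, 19), (8, 12, 16, 20)),
    ((1, 12, 15, 18), (2, 5, 16, 19), (3, 6, 9, 20), (4, 7, 10, 13), (8, 11, 14, 17)),
    ((1, 7, 11, 20), (2, 8, 13, 18), (3, 5, 10, 15), (4, 9, 16, 17), (6, 12, 14, 19)),
    ((1, 8, 10, 19), (2, 7, 9, 15), (3, 12, 13, 17), (4, 5, 14, 20), (6, 11, 16, 18)),
]

schedule_ok = is_valid_schedule(expected, range(1, 21))
print(schedule_ok)
```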