
Stock algorithms for enumerating all subsets of size k from a set of size N (e.g. as described in "generate all subsets of size k from a set") tend to use a "lexicographic" order, in which the leftmost element varies slowest. I've also found an algorithm that minimizes the difference between successive subsets in the enumeration, kinda like a Gray code.

Instead, I would like each step to generate a subset which is maximally different from all preceding subsets. (This is not the same as "maximize the difference between successive subsets", as in the previous formulation of the question.) For instance, considering subsets of size 4 from a set of size 8, one acceptable order begins

ABCD
    EFGH
AB    GH
  CDEF
AB  EF
  CD  GH
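To make the distinction concrete, here is a minimal sketch that scores an ordering both ways. The metric is an assumption (the question does not fix one): the distance between two equal-sized subsets is taken to be the number of elements in one that are missing from the other. The names `distance`, `min_distance_to_preceding` and `successive_distance` are hypothetical helpers, not part of any library.

```python
def distance(a, b):
    """Assumed metric: how many elements of a are not in b (sets of equal size)."""
    return len(set(a) - set(b))

def min_distance_to_preceding(order):
    """For each subset after the first, its smallest distance to ANY earlier subset."""
    return [min(distance(s, p) for p in order[:i])
            for i, s in enumerate(order) if i > 0]

def successive_distance(order):
    """For each subset after the first, its distance to the one just before it."""
    return [distance(b, a) for a, b in zip(order, order[1:])]

example = ["ABCD", "EFGH", "ABGH", "CDEF", "ABEF", "CDGH"]
scores = min_distance_to_preceding(example)   # [4, 2, 2, 2, 2]
steps = successive_distance(example)          # [4, 2, 4, 2, 4]
```

Under this assumed metric, the goal is to keep the first list as large as possible at every step, not the second.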

Note that the base set is large enough that holding all N-choose-k subsets in memory is impractical.

zwol

1 Answer


In your desired output, the number of elements that differ from one subset to the next gives the sequence 2,1,2,1,2. I get the same sequence by selecting, from the lexicographically ordered list of subsets, the first, then the last, then the second, then the second-to-last, and so on. At each step, choose the subset which is furthest away in the order and which has not already been chosen.

I don't get the same sequence of subsets, just the same sequence of numbers of differences.
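A minimal sketch of that selection, assuming the whole lexicographic list fits in memory and using 2-element subsets of a 4-element set as the small case (the set "ABCD" and the names `alternate_from_list` and `new_elements` are illustrative choices):

```python
from itertools import combinations

def alternate_from_list(items, k):
    """Take subsets alternately from the front and back of the lexicographic list."""
    subsets = list(combinations(items, k))  # materializes every subset
    lo, hi = 0, len(subsets) - 1
    order = []
    while lo <= hi:
        order.append(subsets[lo])
        lo += 1
        if lo <= hi:
            order.append(subsets[hi])
            hi -= 1
    return order

def new_elements(a, b):
    """Number of elements of b that were not in a."""
    return len(set(b) - set(a))

order = alternate_from_list("ABCD", 2)
# order: AB, CD, AC, BD, AD, BC
diffs = [new_elements(a, b) for a, b in zip(order, order[1:])]
# diffs: [2, 1, 2, 1, 2]
```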

I've satisfied myself that this works for a couple of other small cases too, and am now looking forward to the counterexamples and downvotes.

Ahh, so you don't want to rely on building the lexicographically ordered list of subsets first. My initial thought is to have two subset generators running at the same time: one starting at the first subset (e.g. AB) and going forward, the other starting at the last (e.g. CD) and going backward, taking a subset from each in turn.
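Here is a rough sketch of the two-generator idea, under the assumption that "going backward" means reverse lexicographic order. The forward generator is `itertools.combinations`; `reverse_combinations` and `alternate_ends` are hypothetical helpers written for this illustration. Only the current index tuple of each generator is held in memory.

```python
from itertools import combinations
from math import comb

def reverse_combinations(items, k):
    """Lazily yield k-subsets of items in reverse lexicographic order."""
    pool = tuple(items)
    n = len(pool)
    if k > n:
        return
    idx = list(range(n - k, n))  # indices of the lexicographically last subset
    while True:
        yield tuple(pool[i] for i in idx)
        # Find the rightmost index that can move down without hitting its left neighbour.
        for i in reversed(range(k)):
            lower = idx[i - 1] if i > 0 else -1
            if idx[i] - 1 > lower:
                break
        else:
            return  # nothing can be decremented: the first subset was just yielded
        idx[i] -= 1
        # Everything to the right jumps back to its largest possible value.
        for j in range(i + 1, k):
            idx[j] = n - k + j

def alternate_ends(items, k):
    """Alternate between the forward and backward generators until they meet."""
    pool = tuple(items)
    remaining = comb(len(pool), k)
    front = combinations(pool, k)
    back = reverse_combinations(pool, k)
    while remaining > 0:
        yield next(front)
        remaining -= 1
        if remaining > 0:
            yield next(back)
            remaining -= 1

for subset in alternate_ends("ABCD", 2):
    print("".join(subset))  # AB, CD, AC, BD, AD, BC (one per line)
```

The count from `math.comb` is what stops the two generators before they cross, so each subset appears exactly once.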

High Performance Mark
  • You have made me realize that I formulated the problem incorrectly. Please see edit. (The difference only becomes apparent with larger sets than I had in the example. Your solution doesn't provide sufficiently large gaps between set *n* and set *n* + 2.) – zwol Jul 16 '13 at 18:34