Welcome to Software Development on Codidact!
Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.
Post History
Using a recursive generator There isn't anything like a built-in solution for this algorithm, but it's easy to express the desired output recursively. A valid output combination of k-many inputs l...
Answer
#2: Post edited
- ## Using a recursive generator
- There isn't anything like a built-in solution for this algorithm, but it's easy to express the desired output recursively. A valid output combination of *k*-many inputs looks like an arbitrarily selected first element, followed by a combination of *k-1*-many elements that:
- * appear *after* the first element within the original input sequence (to distinguish combinations from permutations)
- * don't share a common *i*th value, for `0 <= i < n`
- Thus, we can filter the input sequence according to those criteria, then use recursion to get the possibilities for the remaining elements, and prepend our chosen first element. By iterating over possible first elements within the recursion, we get every desired result.
- Accumulating a list of the results would be annoying with recursion, but in Python we can write a recursive *generator* to avoid the problem. We iterate over possibilities `yield`ed from a recursive call to prepend a first element, and then iterate over possible first elements.
- To clarify the logic, we can use a helper function to do the filtering.
- The resulting implementation:
- ```
- def no_common_elements(candidates, word):
- return [
- c for c in candidates
- if all(cl != wl for cl, wl in zip(c, word))
- ]
- def distinct_word_combinations(candidates, k):
- if len(candidates) < k:
- return # avoid doing tons of recursion through hopeless cases
- if k == 0: # base case
- yield ()
- return
- for i, first in enumerate(candidates):
- new_candidates = no_common_elements(candidates[i+1:], first)
- for result in distinct_word_combinations(new_candidates, k-1):
- yield (first,) + result
- ```
This gives up the performance benefits that `itertools.combinations` can offer by iterating and/or recursing at the C level, but it prunes branches ahead of time so that everything is generated directly. Although this is still much slower in my tests than the naive approach for 4 words, it's only about 15 times as slow as for 3 words, rather than 46 times as slow. For really big problem sizes it will win out.Unfortunately, this can't recognize the case where *k* exceeds the number of possibilities for one of the elements. That needs to be done separately and explicitly, which involves first finding (or knowing) the set of possibilities in each position.
- ## Using a recursive generator
- There isn't anything like a built-in solution for this algorithm, but it's easy to express the desired output recursively. A valid output combination of *k*-many inputs looks like an arbitrarily selected first element, followed by a combination of *k-1*-many elements that:
- * appear *after* the first element within the original input sequence (to distinguish combinations from permutations)
- * don't share a common *i*th value, for `0 <= i < n`
- Thus, we can filter the input sequence according to those criteria, then use recursion to get the possibilities for the remaining elements, and prepend our chosen first element. By iterating over possible first elements within the recursion, we get every desired result.
- Accumulating a list of the results would be annoying with recursion, but in Python we can write a recursive *generator* to avoid the problem. We iterate over possibilities `yield`ed from a recursive call to prepend a first element, and then iterate over possible first elements.
- To clarify the logic, we can use a helper function to do the filtering.
- The resulting implementation:
- ```
- def no_common_elements(candidates, word):
- return [
- c for c in candidates
- if all(cl != wl for cl, wl in zip(c, word))
- ]
- def distinct_word_combinations(candidates, k):
- if len(candidates) < k:
- return # avoid doing tons of recursion through hopeless cases
- if k == 0: # base case
- yield ()
- return
- for i, first in enumerate(candidates):
- new_candidates = no_common_elements(candidates[i+1:], first)
- for result in distinct_word_combinations(new_candidates, k-1):
- yield (first,) + result
- ```
- This gives up the performance benefits that `itertools.combinations` can offer by iterating and/or recursing at the C level, but it prunes branches ahead of time so that everything is generated directly.
- In my testing, this is slower for short combinations, but (as expected) much faster for the hard cases:
- ```
- >>> timeit.timeit("brute_force(two, 3)", globals=globals(), number=1)
- 0.011705408978741616
- >>> timeit.timeit("list(distinct_word_combinations(two, 3))", globals=globals(), number=1)
- 0.015882118023000658
- >>> timeit.timeit("brute_force(two, 8)", globals=globals(), number=1)
- 51.998020388011355
- >>> timeit.timeit("list(distinct_word_combinations(two, 8))", globals=globals(), number=1)
- 0.048423305968753994
- ```
- When given a larger input set, however, this approach gets bogged down very easily. With a full set of two-letter words, the brute force approach is about 6 times as fast, and the original version that's specific to *n* and *k* values is much faster still. Even with n=4, the recursive generator still loses handily.
- This approach also, unfortunately, can't recognize the case where *k* exceeds the number of possibilities for one of the elements. With the restricted alphabet, this isn't a problem; but while a human can instantly tell that there are no 27-word combinations meeting the condition among the full set of 127 candidate words, this algorithm will still take impossibly long to reach the same conclusion.
#1: Initial revision
## Using a recursive generator There isn't anything like a built-in solution for this algorithm, but it's easy to express the desired output recursively. A valid output combination of *k*-many inputs looks like an arbitrarily selected first element, followed by a combination of *k-1*-many elements that: * appear *after* the first element within the original input sequence (to distinguish combinations from permutations) * don't share a common *i*th value, for `0 <= i < n` Thus, we can filter the input sequence according to those criteria, then use recursion to get the possibilities for the remaining elements, and prepend our chosen first element. By iterating over possible first elements within the recursion, we get every desired result. Accumulating a list of the results would be annoying with recursion, but in Python we can write a recursive *generator* to avoid the problem. We iterate over possibilities `yield`ed from a recursive call to prepend a first element, and then iterate over possible first elements. To clarify the logic, we can use a helper function to do the filtering. The resulting implementation: ``` def no_common_elements(candidates, word): return [ c for c in candidates if all(cl != wl for cl, wl in zip(c, word)) ] def distinct_word_combinations(candidates, k): if len(candidates) < k: return # avoid doing tons of recursion through hopeless cases if k == 0: # base case yield () return for i, first in enumerate(candidates): new_candidates = no_common_elements(candidates[i+1:], first) for result in distinct_word_combinations(new_candidates, k-1): yield (first,) + result ``` This gives up the performance benefits that `itertools.combinations` can offer by iterating and/or recursing at the C level, but it prunes branches ahead of time so that everything is generated directly. Although this is still much slower in my tests than the naive approach for 4 words, it's only about 15 times as slow as for 3 words, rather than 46 times as slow. For really big problem sizes it will win out. Unfortunately, this can't recognize the case where *k* exceeds the number of possibilities for one of the elements. That needs to be done separately and explicitly, which involves first finding (or knowing) the set of possibilities in each position.