Python:从具有字符限制的列表中生成所有可能的序列组合

我的问题与this question完全相同.我有字符的数组(列表).我想从该列表中获取所有可能的序列组合,但具有字符限制(例如:最多2个字符).此外,在排列行中不能重复单个字符:

chars = ['a', 'b', 'c', 'd']

# output
output = [['a', 'b', 'c', 'd'],
          ['ab', 'c', 'd'],
          ['a', 'bc', 'd'],
          ['a', 'b', 'cd'],
          ['ab', 'cd'],
          ['abc', 'd'], # this one will be exempted
          ['a', 'bcd'],  # this one will be exempted
          ['abcd']]  # this one will be exempted

我知道我可以在生成和构建序列时检查条件以省略超限字符组合.但它会增加运行时间.我的目的是减少现有的执行时间.

没有字符数限制,组合将生成为2 ^(N-1).如果列表超过15个字符,则执行程序将花费很长时间.因此,我想减少字符数限制的组合数.

优先级是性能.我已经研究并试了两天没有任何成功.

最佳答案
一种方法是迭代输入列表并逐步构建组合.在每个步骤中,从输入列表中取出下一个字符并将其添加到先前生成的组合中.

from collections import defaultdict

def make_combinations(seq, maxlen):
    # memo is a dict of {length_of_last_word: list_of_combinations}
    memo = defaultdict(list)
    memo[1] = [[seq[0]]]  # put the first character into the memo

    seq_iter = iter(seq)
    next(seq_iter)  # skip the first character
    for char in seq_iter:
        new_memo = defaultdict(list)

        # iterate over the memo and expand it
        for wordlen, combos in memo.items():
            # add the current character as a separate word
            new_memo[1].extend(combo + [char] for combo in combos)

            # if the maximum word length isn't reached yet, add a character to the last word
            if wordlen < maxlen:
                word = combos[0][-1] + char

                new_memo[wordlen+1] = newcombos = []
                for combo in combos:
                    combo[-1] = word  # overwrite the last word with a longer one
                    newcombos.append(combo)

        memo = new_memo

    # flatten the memo into a list and return it
    return [combo for combos in memo.values() for combo in combos]

输出:

[['a', 'b', 'c', 'd'], ['ab', 'c', 'd'], ['a', 'bc', 'd'],
 ['a', 'b', 'cd'], ['ab', 'cd']]

对于短输入,此实现比天真的itertools.product方法慢:

input: a b c d
maxlen: 2
iterations: 10000

itertools.product: 0.11653625800136069 seconds
make_combinations: 0.1657387
from collections import defaultdict

def make_combinations(seq, maxlen):
    # memo is a dict of {length_of_last_word: list_of_combinations}
    memo = defaultdict(list)
    memo[1] = [[seq[0]]]  # put the first character into the memo

    seq_iter = iter(seq)
    next(seq_iter)  # skip the first character
    for char in seq_iter:
        new_memo = defaultdict(list)

        # iterate over the memo and expand it
        for wordlen, combos in memo.items():
            # add the current character as a separate word
            new_memo[1].extend(combo + [char] for combo in combos)

            # if the maximum word length isn't reached yet, add a character to the last word
            if wordlen < maxlen:
                word = combos[0][-1] + char

                new_memo[wordlen+1] = newcombos = []
                for combo in combos:
                    combo[-1] = word  # overwrite the last word with a longer one
                    newcombos.append(combo)

        memo = new_memo

    # flatten the memo into a list and return it
    return [combo for combos in memo.values() for combo in combos]

41118 seconds

但是当输入列表更长时,它会快速恢复:

input: a b c d e f g h i j k
maxlen: 2
iterations: 10000

itertools.product: 6.9087735799985240 seconds
make_combinations: 1.2037671390007745 seconds

转载注明原文:Python:从具有字符限制的列表中生成所有可能的序列组合 - 代码日志