Python: splitting a string into all possible ordered phrases


python string list


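I'm trying to explore what Python's built-in functions can do. I'm currently trying to take a string such as:

'the fast dog'

and break it down into all possible ordered phrases, as lists. The output for the example above would look like this:

[['the', 'fast dog'], ['the fast', 'dog'], ['the', 'fast', 'dog']]

The key point is that the original order of the words in the string must be preserved when generating the possible phrases.

I've managed to get a function working that does this, but it's rather cumbersome and ugly. I was wondering whether some of Python's built-in functionality might help. I was thinking I could split the string at the different whitespace positions and then apply that recursively to each split. Does anyone have any suggestions?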

Use:
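The body of `break_down` is not shown here. A minimal sketch that reproduces the output order in the examples below, assuming the split positions are chosen with `itertools.combinations` (an assumption, not necessarily the answerer's code), might look like this:

```python
from itertools import combinations

def break_down(text):
    """Yield every way to cut text into two or more ordered phrases."""
    words = text.split()
    n = len(words)
    # k is the number of cut points; cuts are positions between words.
    for k in range(1, n):
        for cuts in combinations(range(1, n), k):
            bounds = (0,) + cuts + (n,)
            # Join the words between each pair of neighbouring boundaries.
            yield [' '.join(words[a:b]) for a, b in zip(bounds, bounds[1:])]
```

Because the outer loop runs over the number of cuts, all two-phrase splits come first, then the three-phrase splits, and so on.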

For example:

>>> for x in break_down('the fast dog'):
...     print(x)
...
['the', 'fast dog']
['the fast', 'dog']
['the', 'fast', 'dog']

>>> for x in break_down('the really fast dog'):
...     print(x)
...
['the', 'really fast dog']
['the really', 'fast dog']
['the really fast', 'dog']
['the', 'really', 'fast dog']
['the', 'really fast', 'dog']
['the really', 'fast', 'dog']
['the', 'really', 'fast', 'dog']

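Think about the gaps between the words. Every subset of that set of gaps corresponds to a set of split points and, in turn, to a splitting of the phrase: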

the fast dog jumps
   ^1   ^2  ^3     - these are split points

For example, the subset

{1,3}

corresponds to the split

{"the", "fast dog", "jumps"}
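The mapping from a subset of gaps to a split can be sketched as follows. The helper name `split_at_gaps` is hypothetical, and the sketch assumes a non-empty string with gaps numbered from 1 as in the diagram above:

```python
def split_at_gaps(text, gaps):
    """Split text at the given 1-based gap numbers (gap k follows word k)."""
    words = text.split()
    phrases = [words[0]]
    for k, word in enumerate(words[1:], start=1):
        if k in gaps:
            phrases.append(word)         # cut here: start a new phrase
        else:
            phrases[-1] += ' ' + word    # no cut: extend the current phrase
    return phrases

print(split_at_gaps('the fast dog jumps', {1, 3}))
# ['the', 'fast dog', 'jumps']
```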

The subsets can be enumerated as the binary numbers from 1 to 2^(L-1)-1, where L is the number of words:

001 -> the fast dog, jumps
010 -> the fast, dog jumps
011 -> the fast, dog, jumps
etc.
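As a rough sketch of that enumeration, assuming the leftmost binary digit stands for the first gap (which is how the three lines above read), the loop could be written like this. The name `enumerate_splits` is made up for illustration:

```python
def enumerate_splits(text):
    """Yield (bit string, phrases) for every non-empty subset of gaps."""
    words = text.split()
    gaps = len(words) - 1
    if gaps < 1:
        return  # fewer than two words: nothing to split
    for mask in range(1, 2 ** gaps):
        bits = format(mask, '0{}b'.format(gaps))  # e.g. 1 -> '001'
        phrases = [words[0]]
        for bit, word in zip(bits, words[1:]):
            if bit == '1':
                phrases.append(word)         # split at this gap
            else:
                phrases[-1] += ' ' + word    # keep the words together
        yield bits, phrases

for bits, phrases in enumerate_splits('the fast dog jumps'):
    print(bits, '->', ', '.join(phrases))
```

Starting the range at 1 skips the empty subset, so the unsplit phrase is never produced.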

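The operation you are asking for is usually called "partitioning", and it can be done on any kind of list. So let's implement partitioning for an arbitrary list: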

def partition(lst):
    for i in range(1, len(lst)):
        for r in partition(lst[i:]):
            yield [lst[:i]] + r
    yield [lst]

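Note that longer lists have a great many partitions (2^(n-1) for n elements), so it is best to implement this as a generator. To check that it works, try: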

print(list(partition([1, 2, 3])))

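Now you need to partition a string, using its words as the elements. The simplest way to do that is to split the text into words, run the original partitioning algorithm, and then join each group of words back into a string: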

def word_partition(text):
    for p in partition(text.split()):
        yield [' '.join(group) for group in p]

Again, to test it, use:

print(list(word_partition('the fast dog')))
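One detail worth noting: `word_partition` also yields the unsplit phrase as a single group, which the expected output in the question leaves out. A small sketch of filtering it away, written for Python 3 and using a hypothetical wrapper name `proper_word_partitions`, could be:

```python
def partition(lst):
    """Yield every way to cut lst into contiguous, ordered groups."""
    for i in range(1, len(lst)):
        for rest in partition(lst[i:]):
            yield [lst[:i]] + rest
    yield [lst]

def word_partition(text):
    for p in partition(text.split()):
        yield [' '.join(group) for group in p]

def proper_word_partitions(text):
    # Keep only results with at least one actual split.
    return [p for p in word_partition(text) if len(p) > 1]

print(proper_word_partitions('the fast dog'))
# [['the', 'fast', 'dog'], ['the', 'fast dog'], ['the fast', 'dog']]
```

The results come out in a different order from the one listed in the question, but the set of splits is the same.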

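I'll elaborate a bit on @grep's solution, using only the built-ins you mentioned in your question and avoiding recursion. You could implement his answer along these lines: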

#! /usr/bin/python3

def partition(phrase):
    words = phrase.split()  # split the phrase into words
    gaps = len(words) - 1  # one gap fewer than words (fencepost problem)
    for i in range(1 << gaps):  # the 2^gaps possible partitions
        r = words[:1]  # the result starts with the first word
        for word in words[1:]:
            if i & 1:
                r.append(word)  # bit is 1: split at this gap
            else:
                r[-1] += ' ' + word  # bit is 0: don't split at this gap
            i >>= 1  # move on to the bit for the next gap
        yield r

for part in partition('The really fast dog.'):
    print(part)

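Your best bet is to split the string into a list, then find a function that takes that list and generates the list of lists you need. This is a list problem, not a string or splitting problem. You may also want to clarify what a "phrase" is; from your example, a phrase seems to be any two words.

I think what he is actually trying to achieve is all possible single and multiple splits (preserving order).

What is an ordered phrase? Are you really asking to "create all possible combinations of the words in a phrase"?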