Splitting a string with regular expressions in Python

python regex

What is the best way to split a string like this

text = "hello there how are you"

in Python, so that I end up with a list like this:

['hello there', 'there how', 'how are', 'are you']

I tried this:

liste = re.findall('((\S+\W*){'+str(2)+'})', text)
for a in liste:
    print(a[0])

But I got:

hello there 
how are 
you

How can I make the findall function advance by only one token while searching?

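If you don't need a regex, you can do something like this:

l = text.split(" ")
out = []
for i in range(len(l)):
    try:
        out.append(l[i] + " " + l[i + 1])
    except IndexError:
        # l[i + 1] does not exist for the last word; skip it
        continue

Explanation:

First, split the string on the space character. The result is a list in which each element is one word of the sentence. Instantiate an empty list to hold the result. Loop over the list of words, appending each pair of adjacent words, separated by a space, to the output list. Accessing the element after the last word raises an IndexError; just catch it and continue, since you don't seem to want a lone word in the result.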
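
The same pairing can also be written without relying on the exception, by stopping the loop one index early. This is only a minimal sketch, assuming the words are separated by whitespace; the function name `adjacent_pairs` is made up here for illustration and does not come from the answer above.

```python
def adjacent_pairs(text):
    """Return every pair of neighbouring words in text, joined by a space."""
    words = text.split()  # split on any run of whitespace
    pairs = []
    # Stop at len(words) - 1 so that words[i + 1] always exists.
    for i in range(len(words) - 1):
        pairs.append(words[i] + " " + words[i + 1])
    return pairs


print(adjacent_pairs("hello there how are you"))
# ['hello there', 'there how', 'how are', 'are you']
```

With fewer than two words the loop body never runs, so the result is simply an empty list.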

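I don't think you actually need a regex for this.
As I understand it, you want a list in which each element contains two words, the second of which is also the first word of the next element. That is easy to do:

string = "hello there how are you"
liste = string.split(" ")
# Combine each word with its successor; the loop stops one short of the end
for i in range(len(liste) - 1):
    liste[i] = liste[i] + " " + liste[i + 1]
# Remove the last element, otherwise we are left with an element holding only one word
liste.pop(-1)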

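I don't know whether you have to use a regex, but this is how I would do it.

First, you can use the str.split() method to get the list of words:

>>> sentence = "hello there how are you"
>>> split_sentence = sentence.split(" ")
>>> split_sentence
['hello', 'there', 'how', 'are', 'you']

Then you can build the pairs:

>>> output = []
>>> for i in range(1, len(split_sentence)):
...     output += [split_sentence[i-1] + ' ' + split_sentence[i]]
...
>>> output
['hello there', 'there how', 'how are', 'are you']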

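Here is a solution using re.findall:

>>> import re
>>> text = "hello there how are you"
>>> re.findall(r"(?=(?:(?:^|\W)(\S+\W\S+)(?:$|\W)))", text)
['hello there', 'there how', 'how are', 'are you']

See the Python documentation for re:

(?=...)
Lookahead assertion
(?:...)
Non-capturing group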
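
To see why the lookahead matters: re.findall returns non-overlapping matches, so a pattern that consumes two words resumes scanning after the second word. Placing the pair inside a lookahead makes each match zero-width, so the scan can start again at the very next word. The sketch below assumes words are runs of non-whitespace separated by whitespace; its pattern is a simplified variant written for illustration, not the one from the answer.

```python
import re

text = "hello there how are you"

# Consuming pattern: each match swallows two words, so pairs cannot overlap.
consuming = re.findall(r"\S+\s+\S+", text)
print(consuming)  # ['hello there', 'how are']

# Zero-width pattern: the pair sits inside a lookahead and is captured by the group.
# (?<!\S) only lets a match start at the beginning of a word.
overlapping = re.findall(r"(?<!\S)(?=(\S+\s+\S+))", text)
print(overlapping)  # ['hello there', 'there how', 'how are', 'are you']
```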
Another possible solution uses findall:

>>> liste = list(map(''.join, re.findall(r'(\S+(?=(\s+\S+)))', text)))
>>> liste
['hello there', 'there how', 'how are', 'are you']

Another option is to simply split, zip, and then join, like this:

sentence = "Hello there how are you"
words = sentence.split()
[' '.join(i) for i in zip(words, words[1:])]
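
The zip idea extends to windows of any size, which may be what the `str(2)` in the question's pattern was aiming for. This is a sketch only; the helper name `word_windows` and its parameter `n` are assumptions introduced here, not part of the answer.

```python
def word_windows(sentence, n=2):
    """Return all runs of n consecutive words, each joined by a single space."""
    if n < 1:
        raise ValueError("n must be at least 1")
    words = sentence.split()
    # zip over n shifted views of the word list: words, words[1:], ..., words[n-1:]
    shifted = [words[i:] for i in range(n)]
    return [" ".join(group) for group in zip(*shifted)]


print(word_windows("hello there how are you"))
# ['hello there', 'there how', 'how are', 'are you']
print(word_windows("hello there how are you", 3))
# ['hello there how', 'there how are', 'how are you']
```

Because zip stops at the shortest input, a sentence with fewer than n words yields an empty list rather than an error.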