Python: split or cut a string after certain words

python regex string split

Let me start by saying that I searched Google for hours before asking this question; I am posting here because I am fairly desperate.

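I have several strings in approximately this format (the same sample strings appear in the answers below):

firstword text ONE lastword
firstword text TWO lastword

I need to extract text, that is, whatever comes after 'firstword' and before 'ONE' or 'TWO'.
So my output for the strings above must be:
"text"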

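How can I split the string so that I:

- remove the first word (I already know how to do this with str.split(" "))
- keep the text before 'ONE' or 'TWO' (I imagine it would look something like str.split('ONE'|'TWO'), but that obviously doesn't work, and I haven't found a solution so far)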

If possible, I would like to solve this with split() or partition(), but a regex is fine too.
Thanks for any help, and apologies if this is a stupid question.
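You can use this regex, which combines a positive lookbehind with a positive lookahead:
(?<=firstword)\s*(.*?)\s*(?=ONE|TWO)

(?<=firstword) --> positive lookbehind, ensures the matched text is preceded by firstword
\s* --> eats any whitespace
(.*?) --> captures the data you want
\s* --> eats any whitespace
(?=ONE|TWO) --> positive lookahead, ensures the matched text is followed by ONE or TWO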
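A minimal sketch of how that pattern might be applied from Python. The helper name extract_between_markers is made up for illustration and is not part of the answer; returning None when nothing matches is also just one possible choice.

```python
import re

# Same pattern as above: lookbehind for "firstword", lazy capture,
# lookahead for either end marker.
PATTERN = re.compile(r"(?<=firstword)\s*(.*?)\s*(?=ONE|TWO)")

def extract_between_markers(s):
    """Return the text between 'firstword' and the first ONE/TWO, or None."""
    match = PATTERN.search(s)
    return match.group(1) if match else None

print(extract_between_markers("firstword text ONE lastword"))
print(extract_between_markers("firstword hello world TWO more words here"))
```

Note that the pattern treats ONE and TWO as plain substrings, so it would also stop inside a longer word that contains them; wrapping the alternatives in \b word boundaries is one way to tighten that.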
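When you split on spaces you get a list of all the words, and you can then pick the word you want. Note that this only works when the text you want is a single word:
s = "firstword text TWO lastword"
l = s.split(" ")  # l = ["firstword", "text", "TWO", "lastword"]
print(l[1])  # prints: text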

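Or try this:
str_list = ["firstword text ONE lastword", "firstword text TWO lastword", "any text u entered before firstword text ONE", "firstword text TWO any text After"]
end_key_lst = ['ONE', 'TWO']
# For each string, drop everything from the end key onwards
heads = [''.join(val.split(end_key)[:-1]) for val in str_list for end_key in end_key_lst if end_key in val.split()]
# Then keep only what follows 'firstword'
print([head.split('firstword')[-1].strip() for head in heads])

Result: ['text', 'text', 'text', 'text']

How I did it:
You probably have many such strings, so I put them in a list, and I put the end keys, ONE and TWO, in a second list.
A list comprehension cuts each string at its end key, and a second pass strips off everything up to and including 'firstword' to produce the target list.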
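You can use a regex such as:
import re
string = "firstword text TWO lastword"
# (?:ONE|TWO) is an alternation; [ONE|TWO] would be a character class matching one single character
re.search(r'firstword\s+(\w+)\s+(?:ONE|TWO)', string).group(1)
'text'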
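The alternation the question reached for with str.split('ONE'|'TWO') does work once it is expressed as a regex and passed to re.split. The following is only a sketch under that assumption; the names MARKERS and text_before_marker are invented for illustration, and returning an empty string when no marker is present is an arbitrary choice.

```python
import re

# Word boundaries keep the markers from matching inside longer words.
MARKERS = re.compile(r"\b(?:ONE|TWO)\b")

def text_before_marker(s):
    """Drop the first word and everything from the first ONE/TWO onwards."""
    parts = MARKERS.split(s, maxsplit=1)
    if len(parts) < 2:
        # Neither marker was found.
        return ""
    words = parts[0].split()
    # words[0] is the leading word to discard, whatever it is.
    return " ".join(words[1:])

print(text_before_marker("firstword text TWO lastword"))
```

Unlike the lookbehind version, this discards the first word without checking that it is actually firstword, which matches how the question describes the task.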

You don't actually need a regex. You can store the separators you want in a list and then check whether they are present:
orig_text = "firstword text ONE lastword"

first_separator = "firstword"
#Place all "end words" here
last_separators = ["ONE", "TWO"]

output = []

#Splitting the original text into list
orig_text = orig_text.split(" ")

#Checking if there's the "firstword" just in case
if first_separator in orig_text:
    #Here we check if there's "ONE" or "TWO" in the text
    for i in last_separators:
        if i in orig_text:
            #taking everything between "firstword" and "ONE"/"TWO"
            output = orig_text[orig_text.index(first_separator)+1 : orig_text.index(i)]
            break

#Converting to string
output = " ".join(output)

print(output)

Here are some example outputs:
"firstword text TWO lastword" -> "text"
"firstword hello world ONE" -> "hello world"
"first text ONE" -> ""
"firstword text" -> ""

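To make those cases easy to re-check, the same word-list approach could be wrapped in a function. This is a sketch rather than the answer's own code: the function name extract_words_between and its default arguments are assumptions made for illustration.

```python
def extract_words_between(text, first_separator="firstword",
                          last_separators=("ONE", "TWO")):
    """Return the words between first_separator and an end word, or ''."""
    words = text.split()
    if first_separator not in words:
        return ""
    start = words.index(first_separator) + 1
    # End words are tried in the order given, as in the answer's loop.
    for sep in last_separators:
        if sep in words[start:]:
            end = words.index(sep, start)
            return " ".join(words[start:end])
    return ""

for sample in ("firstword text TWO lastword", "firstword hello world ONE",
               "first text ONE", "firstword text"):
    print(repr(sample), "->", repr(extract_words_between(sample)))
```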
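Regarding the possible duplicate: my strings can have text of any length after ONE or TWO. I want to remove everything after ONE or TWO, whether that is 1 word or 10. Sorry for not being more specific. A more realistic example of the strings I am working with is: firstword text ONE extra text that needs to be removed
That is a really good solution. I will accept it as the answer to my specific question. It does make me wonder how to solve this with split() or partition(), though. Is it possible?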
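On the split()/partition() question, one possible approach is to call str.partition once for the start word and once per end word, then keep the shortest prefix. This is a hedged sketch, not something taken from the answers; the name between and its default arguments are invented for illustration.

```python
def between(s, start="firstword", ends=("ONE", "TWO")):
    """Text after `start` and before the earliest of `ends`, or ''."""
    _, found, rest = s.partition(start)
    if not found:
        return ""
    # partition() returns (head, sep, tail); sep is empty when the
    # end word does not occur in rest.
    heads = [head for head, sep, _ in (rest.partition(end) for end in ends) if sep]
    if not heads:
        return ""
    # The shortest head corresponds to the end word that appears first.
    return min(heads, key=len).strip()

print(between("firstword text ONE extra text that needs to be removed"))
```

Because partition() works on substrings rather than whole words, this version would also cut at ONE or TWO inside a longer word; splitting into words first avoids that at the cost of a little more code.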