Python multiline regular expressions


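How can I extract all characters (including newlines) up to the first occurrence of a given word? For example, with the following input:

Input text:

"shantaram is an amazing novel.
It is one of the best novels i have read.
the novel is written by gregory david roberts.
He is an australian"

and the sequence

the

I want to extract the text from "shantaram" up to the first occurrence of "the", which is on the second line.
The output must be: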
shantaram is an amazing novel.
It is one of the

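I have been working on this all morning. I can write an expression that extracts all characters until a particular character is encountered, but if I use the following expression:
re.search(r"shantaram[\s\S]*the", string)

it does not match across the newlines.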
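You want to use the DOTALL option to match across newlines. From the documentation:
re.DOTALL
Make the "." special character match any character at all, including a newline; without this flag, "." will match anything except a newline.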
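As a minimal sketch of the DOTALL approach, the snippet below runs the same pattern with and without the flag on the sample text from the question. The variable names (text, no_flag, with_flag) are illustrative and not part of the original answer, and pairing "." with the lazy "*?" quantifier is an assumption made here so that the match stops at the first "the".

```python
import re

# Sample text from the question, with explicit newlines.
text = (
    "shantaram is an amazing novel.\n"
    "It is one of the best novels i have read.\n"
    "the novel is written by gregory david roberts.\n"
    "He is an australian"
)

# Without re.DOTALL, "." cannot cross a newline, and the first line
# contains no "the", so search() finds nothing.
no_flag = re.search(r"shantaram.*?the", text)

# With re.DOTALL, "." also matches "\n"; the lazy "*?" stops at the first "the".
with_flag = re.search(r"shantaram.*?the", text, re.DOTALL)

print(no_flag)
print(repr(with_flag.group(0)))
```

The inline form "(?s)" at the start of the pattern should behave the same way as passing re.DOTALL.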
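A solution that does not use regular expressions:
from itertools import takewhile
def upto(a_string, stop):
    # Split on spaces only, so newlines stay inside the words and survive the join.
    words = a_string.split(" ")
    # Keep the words before the first `stop`; the result excludes `stop` itself.
    return " ".join(takewhile(lambda word: word != stop, words))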
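If the stop word should be part of the result, as in the expected output of the question, a plain str.find sketch is one option. The helper name extract_through is hypothetical, and it treats start and stop as plain substrings rather than whole words, so "the" inside a longer word such as "there" would also end the extraction.

```python
def extract_through(text, start, stop):
    """Return text from the first `start` through the first later `stop`, or None."""
    begin = text.find(start)
    if begin == -1:
        return None  # `start` does not occur at all
    # Look for `stop` only after the end of `start`.
    end = text.find(stop, begin + len(start))
    if end == -1:
        return None  # no `stop` after `start`
    return text[begin:end + len(stop)]


sample = (
    "shantaram is an amazing novel.\n"
    "It is one of the best novels i have read.\n"
    "the novel is written by gregory david roberts.\n"
    "He is an australian"
)

result = extract_through(sample, "shantaram", "the")
print(repr(result))
```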

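Use this regular expression
re.search(r"shantaram[\s\S]*?the", string)

instead of
re.search(r"shantaram[\s\S]*the", string)

The only difference is the "?". Adding "?" after a quantifier (e.g. *?, +?) makes it non-greedy, which prevents the longest match: the match ends at the first "the" rather than the last.

Have you tried anything? "Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results."

I have been trying since this morning. I can write an expression that extracts all characters until it encounters a particular character. But here, if I use an expression like re.search("shantaram[\s\S]*the", string), it does not work, because "the" is itself matched by [\s\S] and the extraction does not happen.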
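To see the greedy and non-greedy patterns side by side, here is a short sketch that runs both on the sample text from the question. The variable names text, greedy and lazy are illustrative only.

```python
import re

# Sample text from the question, with explicit newlines.
text = (
    "shantaram is an amazing novel.\n"
    "It is one of the best novels i have read.\n"
    "the novel is written by gregory david roberts.\n"
    "He is an australian"
)

# Greedy: [\s\S]* consumes as much as possible, so the match runs to the LAST "the".
greedy = re.search(r"shantaram[\s\S]*the", text)

# Lazy: [\s\S]*? consumes as little as possible, so the match ends at the FIRST "the".
lazy = re.search(r"shantaram[\s\S]*?the", text)

print(repr(greedy.group(0)))
print(repr(lazy.group(0)))
```

Both patterns cross newlines, because [\s\S] matches any character; the two results differ only in where the match ends.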