Get the sentence after a pattern using regex in Python

python regex

In my string (an adapted example), I want to get everything after the generic (year). pattern, up to the first dot that follows it:

str = 'purple alice@google.com, (2002).blah monkey. (1991).@abc.com blah dishwasher'

I think my code is almost there, but not quite:

test = re.findall(r'[\(\d\d\d\d\).-]+([^.]*)', str)

...which returns:

['com, (2002)', 'blah monkey', ' (1991)', '@abc', 'com blah dishwasher']

The desired output is:

['blah monkey', '@abc']

In other words, I want to find everything between the year pattern and the next dot.

If you want to get everything between (year). and the first dot after it, you can use the following:
\(\d{4}\)\.([^.]*)

Here is how the pattern breaks down:
"\(\d{4}\)\.([^.]*)"g

\( matches the character ( literally
\d{4} match a digit [0-9]
    Quantifier: {4} Exactly 4 times
\) matches the character ) literally
\. matches the character . literally
1st Capturing group ([^.]*)
    [^.]* match a single character not present in the list below
        Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
        . the literal character .
g modifier: global. All matches (don't return on first match)
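
To see the breakdown in action, here is a minimal sketch that runs the pattern on the string from the question and prints the full match next to capture group 1. The names s, pattern and matches are picked here for illustration only.

```python
import re

s = 'purple alice@google.com, (2002).blah monkey. (1991).@abc.com blah dishwasher'

# \(\d{4}\)\. anchors on "(year)." and ([^.]*) captures up to the next dot
pattern = re.compile(r'\(\d{4}\)\.([^.]*)')

# group(0) is the whole match, group(1) is only the captured text
matches = [(m.group(0), m.group(1)) for m in pattern.finditer(s)]
for whole, captured in matches:
    print(repr(whole), '->', repr(captured))
# '(2002).blah monkey' -> 'blah monkey'
# '(1991).@abc' -> '@abc'
```

re.findall returns only the group 1 values, which is why it yields the desired list directly.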

You are using [...] the wrong way. Try \(\d{4}\)\.([^.]*)\. instead:
>>> import re
>>> s = 'purple alice@google.com, (2002).blah monkey. (1991).@abc.com blah dishwasher'
>>> re.findall(r'\(\d{4}\)\.([^.]*)\.', s)
['blah monkey', '@abc']
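
One thing worth checking is what the trailing \. changes. The sketch below compares the pattern with and without it on a made-up string (text is a hypothetical input, not from the question) in which the last year is not followed by a closing dot.

```python
import re

# Hypothetical input: the text after (1991). never reaches another dot
text = 'intro (2002).first part. (1991).no closing dot'

# Without the trailing \. the capture may run to the end of the string
without_dot = re.findall(r'\(\d{4}\)\.([^.]*)', text)
# With the trailing \. a match is only reported if a closing dot exists
with_dot = re.findall(r'\(\d{4}\)\.([^.]*)\.', text)

print(without_dot)  # ['first part', 'no closing dot']
print(with_dot)     # ['first part']
```

On the string from the question both variants give the same result, so which one fits depends on whether unterminated text should count.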

For reference, [...] specifies a character class. By writing [\(\d\d\d\d\).-] you are saying: any one of the characters 0123456789().-
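
The point about character classes can be checked directly. The following sketch (variable names are illustrative) shows that the class from the question is just a set of single characters, and reproduces what the original attempt returns.

```python
import re

s = 'purple alice@google.com, (2002).blah monkey. (1991).@abc.com blah dishwasher'

# Inside [...], repetition and order are irrelevant: both describe the same set
original_class = r'[\(\d\d\d\d\).-]'
simplified_class = r'[()\d.-]'
same = re.findall(original_class, s) == re.findall(simplified_class, s)
print(same)  # True

# The class matches any jumble of those characters, not just "(year)."
jumble = re.findall(original_class + '+', ').-(7')
print(jumble)  # [').-(7']

# So in the question's pattern, every lone dot starts a match
attempt = re.findall(r'[\(\d\d\d\d\).-]+([^.]*)', s)
print(attempt)
# ['com, (2002)', 'blah monkey', ' (1991)', '@abc', 'com blah dishwasher']
```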
This should do the trick:
print(re.findall(r'\(\d{4}\)\.([^\.]+)', str))
$ ['blah monkey', '@abc']
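
This answer uses + where the others use *. A small sketch of the difference, on a made-up string (text is hypothetical) where a year is followed by two dots in a row:

```python
import re

# Hypothetical input with two dots directly after the first year
text = 'a (2002)..b (1991).c.'

# * allows an empty capture, + requires at least one non-dot character
star = re.findall(r'\(\d{4}\)\.([^.]*)', text)
plus = re.findall(r'\(\d{4}\)\.([^.]+)', text)

print(star)  # ['', 'c']
print(plus)  # ['c']
```

With + the empty result is skipped, which may or may not be wanted depending on the data.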

Nice solution (+1). Also, I like your Live Demo button (just peeked into your markup code ;-)
@Jan Easy design. I should write that script ;)