Python Regex - Finding a date pattern in a string

python regex date


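I'm struggling with Python regular expressions. My problem feels simple, but I'm stuck. I'm trying to identify date substrings in the strict format YYYY-MM-DD inside a string. Simple enough, but I want to make sure the regex does not produce false positives. Some examples of the source strings I need to handle: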

string1='foo2012-09-2018-09-03foo'
string2='2012-09-2018-09-03'

I want to extract the substring that represents a date, 2018-09-03, and not the substring 2012-09-20. I have tried various patterns. The basic one is:

import re
string1='foo2012-09-2018-09-03foo'
string2='2012-09-2018-09-03'
pattern = r'[\d]{4}[-_.][\d]{2}[-_.][0-3][\d]'  # raw string avoids invalid-escape warnings
for match in re.finditer(pattern, string1):
    print(match)
    # FAIL : <re.Match object; span=(3, 13), match='2012-09-20'>
for match in re.finditer(pattern, string2):
    print(match)
    # FAIL : <re.Match object; span=(0, 10), match='2012-09-20'>

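The second one does not work because, in string2, there is no character after the substring I am looking for. Is there a way to adjust the pattern so that it looks for a date followed by either a non-digit character or the end of the string?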

P.S.: this is my first post.

Thanks to joanis, the answer is a negative lookahead (combined here with a negative lookbehind):

import re
pattern = r'(?<!\d)\d{4}[-_.]\d{2}[-_.][0-3]\d(?!\d)'  # raw string avoids invalid-escape warnings
string1='foo2012-09-2018-09-03foo'
for match in re.finditer(pattern, string1):
    i, j = match.span()
    print(string1[i:j])
    # WORK : 2018-09-03

string2='2012-09-2018-09-03'
for match in re.finditer(pattern, string2):
    i, j = match.span()
    print(string2[i:j])
    # WORK : 2018-09-03
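
The lookarounds only constrain the characters around the match. The pattern itself still accepts digit combinations such as 2018-19-39 that are not calendar dates. If that kind of false positive matters, one option is to pass each candidate through datetime.strptime. This is a minimal sketch, assuming only the hyphen separator is needed; the helper name find_valid_dates and the second test string are made up for this illustration:

```python
import re
from datetime import datetime

# Same idea as the pattern above, restricted to '-' as separator (an assumption).
DATE_RE = re.compile(r'(?<!\d)\d{4}-\d{2}-\d{2}(?!\d)')

def find_valid_dates(text):
    """Return the YYYY-MM-DD substrings of text that are real calendar dates."""
    valid = []
    for match in DATE_RE.finditer(text):
        candidate = match.group(0)
        try:
            # strptime raises ValueError for impossible months or days
            datetime.strptime(candidate, '%Y-%m-%d')
        except ValueError:
            continue
        valid.append(candidate)
    return valid

print(find_valid_dates('foo2012-09-2018-09-03foo'))            # ['2018-09-03']
print(find_valid_dates('x2018-19-39y 2018-02-30 2018-09-03'))  # ['2018-09-03']
```

The regex does the cheap filtering on shape and boundaries, and strptime then rejects shapes that are not real dates, such as month 19 or 30 February.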

An example, in case it helps:
import re

#using a list as output can then be looped for this example
strings = ['foo2012-09-2018-09-03foo', '2012-09-2018-09-03']

#Is there a way to adjust the pattern to say look for date followed by a non decimal digit or end of the string?
#Yes! :o) Use a non-capturing group for 'not a number or the end of the line' which is: (?:\D|$)
pattern = re.compile(r'(\d{4}-\d{2}-\d{2})(?:\D|$)')

for string in strings:
    print(pattern.search(string)[1])

for string in strings:
    print(pattern.findall(string))

Output:
2018-09-03
2018-09-03
['2018-09-03']
['2018-09-03']

What you are looking for is a negative lookahead, (?![0-9]), at the end of your pattern. Also, \d is meant to be used without brackets.

@joanis: Whoa, that was a quick answer. Thanks, I will look into negative lookahead... obviously I'm not familiar with the concept.

@JohnnyWezel Indeed. Thx

@joanis That did it. Thanks!

Thank you very much for the example. Using a negative lookahead, (?!\d), is shorter than the non-capturing group, (?:\D|$), suggested in the example. Both do the job; it is just a matter of preference.
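
The two patterns discussed here are not fully interchangeable, and a short side-by-side run may make that clearer. In this sketch the first two sample strings come from the question, while the third is hypothetical and added only for illustration. The non-capturing group consumes the character after the date, so its full match can be one character longer than the date. Because it has no lookbehind, it also accepts a date that directly follows another digit.

```python
import re

# Lookaround version: asserts on neighbours without consuming them
lookaround = re.compile(r'(?<!\d)\d{4}-\d{2}-\d{2}(?!\d)')
# Group version: captures the date, then consumes one non-digit (or hits the end)
group = re.compile(r'(\d{4}-\d{2}-\d{2})(?:\D|$)')

samples = [
    'foo2012-09-2018-09-03foo',
    '2012-09-2018-09-03',
    '12018-09-03x',  # hypothetical: five digits before the first hyphen
]

for s in samples:
    m = group.search(s)
    print(s)
    print('  lookaround findall:', lookaround.findall(s))
    print('  group findall:     ', group.findall(s))
    print('  group full match:  ', repr(m.group(0)) if m else None)
```

For the strings in the question both patterns return 2018-09-03, which matches the comment above. The difference only shows up when the full match or its span is used, or when a digit sits immediately before the date.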