Python: extracting an unknown number of strings from text with a regular expression
I use regular expressions to extract certain information from text. For example, a name can consist of several given names and a surname (the number is unknown). The following example extracts 2 strings:
Name:\s+([\w-äöü]+\s[\w-äöü]+)
How can I define a regular expression that extracts an unknown (!) number of strings, up to a defined next term (for example "Address:")?
Use:
Name:\s+([\wäöü-]+(?:\s+[\wäöü-]+)*?)(?=\s*Address)
Explanation
--------------------------------------------------------------------------------
  Name:                    'Name:'
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [\wäöü-]+                any character of: word characters (a-z,
                             A-Z, 0-9, _), 'ä', 'ö', 'ü', '-' (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the least amount
                             possible)):
--------------------------------------------------------------------------------
      \s+                      whitespace (\n, \r, \t, \f, and " ")
                               (1 or more times (matching the most
                               amount possible))
--------------------------------------------------------------------------------
      [\wäöü-]+                any character of: word characters (a-z,
                               A-Z, 0-9, _), 'ä', 'ö', 'ü', '-' (1 or
                               more times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
    )*?                      end of grouping
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    Address                  'Address'
--------------------------------------------------------------------------------
  )                        end of look-ahead
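
As a minimal sketch of how the pattern explained above might be applied in Python: the whole name lands in capture group 1, and splitting that group on whitespace yields the individual strings, however many there are. The helper name extract_name_parts and the sample text are made up for illustration. Note that in Python 3 `\w` already matches Unicode letters in str patterns, so listing ä, ö, ü explicitly is redundant there but harmless.

```python
import re

# Pattern from the answer; the hyphen sits last in each class so it is a literal.
NAME_RE = re.compile(r"Name:\s+([\wäöü-]+(?:\s+[\wäöü-]+)*?)(?=\s*Address)")


def extract_name_parts(text):
    """Return the words found between 'Name:' and 'Address', or [] if no match.

    Hypothetical helper, not part of the original answer.
    """
    match = NAME_RE.search(text)
    if match is None:
        return []
    # group(1) holds the whole name; split() breaks it on any whitespace run.
    return match.group(1).split()


# Hypothetical sample input with three name parts.
sample = "Name: Hans-Jürgen Karl Müller Address: Hauptstraße 1"
print(extract_name_parts(sample))  # ['Hans-Jürgen', 'Karl', 'Müller']
```

The lazy quantifier `*?` together with the look-ahead stops the capture at the first point where "Address" follows, so the number of name parts does not have to be known in advance.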
Please answer the question and provide some examples of input and the expected matches.
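
In the spirit of that request, here is a small set of inputs paired with the match one would expect from the pattern in the answer. All sample strings are invented for illustration, and first_name_field is a hypothetical helper; treat this as a sketch for checking the regex, not as data from the original question.

```python
import re

NAME_RE = re.compile(r"Name:\s+([\wäöü-]+(?:\s+[\wäöü-]+)*?)(?=\s*Address)")

# Hypothetical inputs paired with the expected content of capture group 1.
# None means no match is expected.
CASES = [
    ("Name: Anna Address: Ulm", "Anna"),
    ("Name: Karl Otto Bär Address: Köln", "Karl Otto Bär"),
    ("Name:\tEva-Maria Löw\nAddress: Bonn", "Eva-Maria Löw"),
    ("Name: Max Phone: 123", None),  # no 'Address' follows, so the look-ahead fails
]


def first_name_field(text):
    """Return capture group 1 of the first match, or None (hypothetical helper)."""
    match = NAME_RE.search(text)
    return match.group(1) if match else None


for text, expected in CASES:
    assert first_name_field(text) == expected, (text, expected)

# finditer collects every record when one text holds several of them.
records = "Name: Anna Address: Ulm\nName: Karl Otto Bär Address: Köln"
all_names = [m.group(1) for m in NAME_RE.finditer(records)]
print(all_names)  # ['Anna', 'Karl Otto Bär']
```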