Python: Using a regular expression to extract an unknown number of strings from a text


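I am using a regular expression to extract certain information from a text. For example, a name can consist of several first names and a last name (the number is unknown). The following example extracts 2 strings: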

Name:\s+([\w-äöü]+\s[\w-äöü]+)
How can I define the regular expression so that it extracts an unknown (!) number of strings, up to a defined next term (e.g. "Address:")?

Use

Name:\s+([\wäöü-]+(?:\s+[\wäöü-]+)*?)(?=\s*Address)

Explanation

--------------------------------------------------------------------------------
  Name:                    'Name:'
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [\wäöü-]+                any character of: word characters (a-z,
                             A-Z, 0-9, _), 'ä', 'ö', 'ü', '-' (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the least amount
                             possible)):
--------------------------------------------------------------------------------
      \s+                      whitespace (\n, \r, \t, \f, and " ")
                               (1 or more times (matching the most
                               amount possible))
--------------------------------------------------------------------------------
      [\wäöü-]+                any character of: word characters (a-
                               z, A-Z, 0-9, _), 'ä', 'ö', 'ü', '-' (1
                               or more times (matching the most
                               amount possible))
--------------------------------------------------------------------------------
    )*?                      end of grouping
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    Address                  'Address'
--------------------------------------------------------------------------------
  )                        end of look-ahead
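
To make the behaviour concrete, here is a minimal sketch that applies the pattern above with Python's `re` module. The helper name `extract_name_parts` and the sample inputs are made up for illustration; they are not part of the original question.

```python
import re

# Pattern from the answer: one name part, then lazily as many further
# whitespace-separated parts as needed, stopping where optional
# whitespace followed by "Address" begins (checked by the look-ahead).
NAME_RE = re.compile(r"Name:\s+([\wäöü-]+(?:\s+[\wäöü-]+)*?)(?=\s*Address)")


def extract_name_parts(text):
    """Return the name parts found before 'Address', or [] if nothing matches.

    Hypothetical helper, for illustration only.
    """
    match = NAME_RE.search(text)
    if match is None:
        return []
    # Group 1 holds the whole name; split it into its individual parts.
    return match.group(1).split()


# Assumed sample inputs (not from the original question).
samples = [
    "Name: Anna Müller Address: Hauptstr. 1",
    "Name: Hans-Jürgen Karl Schröder\nAddress: Ringweg 5",
    "Name: Cher Address: unknown",
    "Name: Anna Müller",  # no "Address" follows, so no match
]

for sample in samples:
    print(extract_name_parts(sample))
# ['Anna', 'Müller']
# ['Hans-Jürgen', 'Karl', 'Schröder']
# ['Cher']
# []
```

Two points about the character class. The answer places the hyphen at the end, where it is a literal hyphen; in the question's `[\w-äöü]`, Python's `re` module rejects the hyphen directly after `\w` as a bad character range. Also, for `str` patterns in Python 3, `\w` already matches letters such as ä, ö and ü, so listing them explicitly is redundant but harmless.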
Please answer the question and provide some examples of inputs and the expected matches.