Regex: How to make a regular expression extract specific elements non-greedily (regex, python-3.x)


I need to extract certain phrases from the following text:

Restricted Cash 951 37505 Accounts Receivable - Affiliate 31613 27539 Accounts
 Receivable - Third Party 23091 2641 Crude Oil Inventory 2200 0 Other Current
 Assets 2724 389 
Total Current Assets 71319 86100 Property Plant and Equipment Total Property 
Plant and Equipment Gross 1500609 706039 Less Accumulated 
Depreciation and Amortization (79357) (44271) Total Property Plant and Equipment
 Net 1421252 661768 Intangible Assets Net 310202 0 Goodwill 109734 0 Investments
 82317 80461 Other Noncurrent Assets 3093 1429 Total Assets 1997917 829758 
LIABILITIES Current Liabilities Accounts Payable - Affiliate 2778 1616 Accounts
 Payable - Trade 92756 109893 Other Current Liabilities 9217 2876 Total Current
 Liabilities 104751 114385 Long-Term Liabilities Long-Term Debt 559021 85000
 Asset Retirement Obligations 17330 10416 Other Long-Term Liabilities 582 3727 
Total Liabilities 681684 213528 EQUITY Partners' Equity Limited Partner 
Common Units (23759 and 23712 units outstanding respectively) 699866 642616
 Subordinated Units (15903 units outstanding) (130207) (168136) General Partner 2421 520 
Total Partners' Equity 572080 475000 Noncontrolling Interests 744153 141230 Total 
Equity 1316233 616230 Total Liabilities and Equity 1997917 829758
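
I need to remove all phrases in parentheses, i.e. (), that contain a number together with the word "outstanding" or "units".

Based on these conditions, there are two phrases to remove:

  • (23759 and 23712 units outstanding respectively)
  • (15903 units outstanding)

I tried the following regex in Python:

    \(\d+.+?(outstanding)+?\)

The idea was that the +? after \d+ would make the regex non-greedy (lazy). However, the regex selects a huge segment running from (79357) (44271) Total Property Plant and Equipment all the way to outstanding), which looks greedy.

The only marker here is the word outstanding. Is there a better way to extract these phrases?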
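To see why the attempt over-matches, here is a minimal sketch on a shortened excerpt of the text above (the excerpt and the variable names are illustrative, not part of the original question). A lazy quantifier only shortens the right end of a match; the engine still starts at the leftmost ( followed by a digit, and .+? may cross any number of parentheses until it finds outstanding immediately followed by ).

```python
import re

# Shortened excerpt of the balance-sheet text (illustrative only).
text = (
    "Less Accumulated Depreciation and Amortization (79357) (44271) "
    "Total Property Plant and Equipment Net 1421252 661768 "
    "Common Units (23759 and 23712 units outstanding respectively) 699866 642616 "
    "Subordinated Units (15903 units outstanding) (130207) (168136)"
)

# The pattern from the question.
lazy_attempt = re.compile(r"\(\d+.+?(outstanding)+?\)")

first = lazy_attempt.search(text)

# The match begins at the first "(digit" in the text, not at the phrase we want,
# because "." is allowed to consume ")" and "(" on its way to "outstanding)".
print(first.group(0))
```

Note that the first target phrase is never matched on its own by this pattern, since "outstanding" is followed by " respectively" there rather than by ).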

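You can use

    \(\d[^()]*outstanding[^()]*\)

Details

  • \( - a ( char
  • \d - a digit
  • [^()]* - 0+ chars other than ( and )
  • outstanding - a literal substring
  • [^()]* - 0+ chars other than ( and )
  • \) - a ) char

In Python:

    import re

    # s holds the text shown in the question
    re.findall(r'\(\d[^()]*outstanding[^()]*\)', s)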
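Since the question asks to remove the phrases rather than just find them, the same pattern can be passed to re.sub. The following is a sketch under that assumption; the shortened string s and the whitespace cleanup step are illustrative additions, not part of the original answer.

```python
import re

# Shortened excerpt of the text from the question (illustrative only).
s = (
    "Partners' Equity Limited Partner Common Units "
    "(23759 and 23712 units outstanding respectively) 699866 642616 "
    "Subordinated Units (15903 units outstanding) (130207) (168136) "
    "General Partner 2421 520"
)

pattern = re.compile(r"\(\d[^()]*outstanding[^()]*\)")

# Find the parenthesized phrases that contain "outstanding".
matches = pattern.findall(s)

# Remove them, then collapse the doubled spaces left behind.
cleaned = re.sub(r"\s{2,}", " ", pattern.sub("", s)).strip()

print(matches)
print(cleaned)
```

Parenthesized amounts such as (130207) are left alone, because they contain no "outstanding" before the closing parenthesis.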
    

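  • It works in PHP, but not in Python. I don't know how to send a link to regex101.com.
  • Thank you very much! It worked. Can you explain it? I'm not 100% following the logic.
  • @user3151858 Which part do you need clarified? I added a verbal explanation, and I just added a visualization. The point is that it matches (, a digit, then any text inside the parentheses, then a specific word, then any text inside the parentheses, and then ).
  • This is the part that confuses me: [^()]. The * is clear.
  • @Wiktor Stribiżew - I really appreciate your help!
  • @user3151858 [^…] is a negated character class: it matches any character other than those specified inside it.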
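The effect of the negated character class can be checked in isolation. This is a small sketch with made-up sample strings: [^()] matches any single character except ( and ), so a run of it can never step over a parenthesis, whereas . can.

```python
import re

# Runs of characters that are neither "(" nor ")".
pieces = re.findall(r"[^()]+", "(15903 units outstanding) (130207)")

# [^()]* cannot cross the inner ")" and "(", so the whole string is not one group...
strict = re.fullmatch(r"\([^()]*\)", "(a) (b)")

# ...while ".*" happily spans both groups.
loose = re.fullmatch(r"\(.*\)", "(a) (b)")

print(pieces)
print(strict is None, loose is not None)
```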