Python regex .search with grouping is not collecting groups

I am trying to search the following list:

/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/2_p/
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/3_p/
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/6_p/
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/7_p/
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/8_p/
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/2_p/

using this code:

next_page = re.compile(r'/(\d+)_p/$')
matches = list(filter(next_page.search, href_search)) #search or .match

for match in matches:
    #refining_nextpage = re.compile()
    print(match.group())

I get the following error:

AttributeError: 'str' object has no attribute 'group'

I thought the parentheses around \d+ would group the one or more digits. My goal is to get the number that precedes "_p/" at the end of each string.

You can try the following:

import re

# add re.M to match the end of each line
next_page = re.compile(r'/(\d+)_p/$',  re.M)
matches = next_page.findall(href_search)
print(matches)

It gives:

['2', '3', '6', '7', '8', '2']
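
Note that findall expects a single string, not a list. If href_search is a list of hrefs, as the question suggests, one option is to join it with newlines first. The following is only a minimal sketch; the shortened sample hrefs are made up for illustration and are not the asker's real data:

```python
import re

# Hypothetical, shortened sample data standing in for the real href list.
href_search = [
    "/for_sale/LOT%7CLAND_type/9_zm/2_p/",
    "/for_sale/LOT%7CLAND_type/9_zm/3_p/",
    "/for_sale/LOT%7CLAND_type/9_zm/",  # no page suffix, so no match
]

# re.M makes $ match at the end of every line, not only at the end of the string.
next_page = re.compile(r'/(\d+)_p/$', re.M)

# Join the list into one multi-line string so findall can scan it in one pass.
pages = next_page.findall("\n".join(href_search))
print(pages)  # ['2', '3']
```

Because the pattern has exactly one capture group, findall returns the captured digits rather than the whole match.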

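You are filtering the original list, so what you get back are the original strings, not match objects. If you want match objects, you need to map the search over the list and then filter the resulting match objects. For example:

next_page = re.compile(r'/(\d+)_p/$')
matches = filter(lambda m: m is not None, map(next_page.search, href_search))
for match in matches:
    # refining_nextpage = re.compile()
    print(match.group())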

Output:

/2_p/
/3_p/
/6_p/
/7_p/
/8_p/
/2_p/

If you only need the numeric part of the match, use match.group(1) instead of match.group().
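
To make the difference concrete, here is a small sketch on a single href. The href is a shortened, hypothetical stand-in for one entry of the list in the question:

```python
import re

next_page = re.compile(r'/(\d+)_p/$')

# Hypothetical, shortened href used only for illustration.
href = "/for_sale/0-150000_price/9_zm/7_p/"

match = next_page.search(href)
if match is not None:
    whole = match.group()    # the full matched text: '/7_p/'
    digits = match.group(1)  # only the first capture group: '7'
    print(whole, digits)
```

group() with no argument is the same as group(0), the entire match; the parentheses in the pattern only affect what group(1) returns.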

I think this should do it:

next_page.findall(href_search)  # ['2', '3', '6', '7', '8', '2']

Alternatively, you can split the lines and search each one separately:

matches = []
for line in href_search.splitlines():
    match = next_page.search(line)
    if match:
        matches.append(match.group(1))

matches  # ['2', '3', '6', '7', '8', '2']

The filter function only removes the lines that do not match the regex, and it returns the strings themselves, for example:

>>> example = ["abc", "def", "ghi", "123"]
>>> my_match = re.compile(r"\d+$")
>>> list(filter(my_match.search, example))
['123']
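
Put another way, filter uses the return value of search only as a keep-or-discard test. A rough sketch of the equivalent comprehension, reusing the example data above, shows where the Match object gets thrown away:

```python
import re

example = ["abc", "def", "ghi", "123"]
my_match = re.compile(r"\d+$")

# filter() calls my_match.search on each item only to decide keep/discard:
# a Match object is truthy, None is falsy, and the item itself is what is kept.
kept_by_filter = list(filter(my_match.search, example))

# Roughly equivalent comprehension: note that it yields `item`, not the match.
kept_by_comprehension = [item for item in example if my_match.search(item)]

print(kept_by_filter)           # ['123']
print(type(kept_by_filter[0]))  # <class 'str'>
```

This is why calling .group() on the filtered items raises AttributeError: they are plain str objects.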

If you want Match objects, a list comprehension can do that:

>>> example = ["abc", "def45", "67ghi", "123"]
>>> my_match = re.compile(r"\d+$")
>>> [my_match.search(line) for line in example]  # Get the matches
[None,
 <re.Match object; span=(3, 5), match='45'>,
 None,
 <re.Match object; span=(0, 3), match='123'>]
>>> [match.group() for match in [my_match.search(line) for line in example] if match is not None]  # Filter None values
['45', '123']
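
On Python 3.8 or newer, the nested comprehension can arguably be flattened with an assignment expression. This is just one possible variant of the snippet above, not part of the original answer:

```python
import re

example = ["abc", "def45", "67ghi", "123"]
my_match = re.compile(r"\d+$")

# The walrus operator (Python 3.8+) binds each search result to `m`,
# so a failed search (None) is skipped without a nested comprehension.
numbers = [m.group() for line in example if (m := my_match.search(line))]
print(numbers)  # ['45', '123']
```

Each line is searched only once, and the None results never reach the .group() call.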

You can do the regex (?
Are you trying to get the value in /2_p/? If so, you need r'\/(\d+)_p\/$'; however, it will extract /2_p/. If you only want the number 2, give (?
['2', '3', '6', '7', '8', '2']
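
The comment above is cut off after "(?", so its exact suggestion is unknown. One guess, and it is only an assumption, is that it was heading toward lookaround assertions, which let the whole match be just the digits without any capture group. A minimal sketch with a hypothetical pattern and made-up sample text:

```python
import re

# Assumed pattern, not taken from the truncated comment:
# (?<=/) requires a preceding slash, (?=_p/$) requires "_p/" at the end of a line.
page_number = re.compile(r'(?<=/)\d+(?=_p/$)', re.M)

# Hypothetical, shortened sample data.
text = "/for_sale/9_zm/2_p/\n/for_sale/9_zm/8_p/"

print(page_number.findall(text))  # ['2', '8']
```

With this form, match.group() and findall both return only the number, at the cost of a pattern that is somewhat harder to read than the capture-group version.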