BeautifulSoup and regexp: AttributeError

regex python-3.x

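I am trying to extract information with beautifulsoup4 by means of regular expressions, but I get the following error:

AttributeError: 'NoneType' object has no attribute 'group'

I do not understand what is wrong. I am trying to:

get the typology name: "herenhuizen"

get the web link

Here is my code: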

import requests
from bs4 import BeautifulSoup
import re

url = 'https://inventaris.onroerenderfgoed.be/erfgoedobjecten/4778'
page = requests.get(url)

soup = BeautifulSoup(page.text, 'html.parser')
text = soup.prettify()

##block
p = re.compile('(?s)(?<=(Typologie))(.*?)(?=(</a>))', re.VERBOSE)
block = p.search(text).group(2)


##typo_url
p = re.compile('(?s)(?<=(href=\"))(.*?)(?=(\">))', re.VERBOSE)
typo_url = p.search(block).group(2)


## typo_name
p = re.compile('\b(\w+)(\W*?)$', re.VERBOSE)
typo_name = p.search(block).group(1)

I would change this:
## typo_name
block_reverse = block[::-1]
p = re.compile('(\w+)', re.VERBOSE)
typo_name_reverse = p.search(block_reverse).group(1)
typo_name = typo_name_reverse[::-1]
print(typo_name)

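Sometimes, if you are looking for something at the end, it is easier to just reverse the string. This simply looks for the name at the end of the block. There are multiple ways to find what you are looking for, and we could come up with all kinds of clever regexes, but if this works it is probably enough :)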
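As one of those other ways, here is a minimal sketch that takes the last run of word characters without reversing anything. The `block` value is a made-up stand-in for what the first regex would capture from the page, not the real page content.

```python
import re

# Hypothetical stand-in for the captured block: the link markup,
# then the name, then trailing whitespace before the closing tag.
block = 'href="/thesaurus/typologie/1">\n      herenhuizen\n     '

# Collect every run of word characters and keep the last one.
words = re.findall(r'\w+', block)
typo_name = words[-1] if words else None

print(typo_name)
```

The `if words else None` part avoids an exception when the block contains no word characters at all.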
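Update

However, I just noticed that the reason the original regex does not work is the use of \b. In a normal Python string literal, \b is the backspace character, so it has to be escaped as \\b, or the pattern has to be a raw string, like this: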
## typo_name
p = re.compile(r'\b(\w+)(\W*?)$', re.VERBOSE)
typo_name = p.search(block).group(1)
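To see the difference for yourself, here is a small sketch that compares the two patterns on a made-up `block` (a hypothetical stand-in, not the real page content). It also shows where the AttributeError comes from: `search` returns None when nothing matches, and None has no `group` method.

```python
import re

# Hypothetical stand-in for the captured block.
block = 'href="/thesaurus/typologie/1">\n      herenhuizen\n     '

# In a normal string literal, \b is a single backspace character.
assert '\b' == '\x08'
# In a raw string, it stays as a backslash followed by "b".
assert len(r'\b') == 2

# Same pattern as in the question: the leading \b is not raw,
# so the regex looks for a literal backspace before the word.
broken = re.compile('\b' + r'(\w+)(\W*?)$', re.VERBOSE)
# Fully raw pattern: \b reaches the regex engine as a word boundary.
fixed = re.compile(r'\b(\w+)(\W*?)$', re.VERBOSE)

broken_match = broken.search(block)   # None, so .group(1) would raise
fixed_match = fixed.search(block)

# Checking for None before calling .group() avoids the AttributeError.
typo_name = fixed_match.group(1) if fixed_match else None
print(broken_match, typo_name)
```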

There are some good Q&As about this here:
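It works great. Thanks! Yes, the important thing is that it works and gets us what we want, but I am still wondering why the regexp in the typo_name block does not work.

@francois Ah, I just realized why. To use \b you need to write p = re.compile(r'\b(\w+)(\W*?)$', re.VERBOSE) to make the pattern raw, or you need to write \\b. There is a similar question with some answers here: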