How to search for specific links with a regular expression in Python


python web-scraping


I am using Python 3.5.1 and BeautifulSoup to scrape a website, and I want to use a regular expression to search for specific links. My code:

I get every link that looks similar:

['/cn/ExpExhibitorList.aspx?categoryno=432', 'ExpExhibitorList.aspx?categoryno=432003']

But I do not want this one:

'/cn/ExpExhibitorList.aspx?categoryno=432'

To restrict the search, just use anchors in the regular expression:

links = soup.find_all("a", href=re.compile(r"^ExpExhibitorList\.aspx\?categoryno=[0-9]+$"))

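This matches every <a> tag whose href value matches the regular expression above exactly. The ^ and $ anchors force the pattern to cover the whole attribute value, so '/cn/ExpExhibitorList.aspx?categoryno=432' is excluded because it starts with '/cn/'.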
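To see why the anchors matter, here is a minimal sketch that uses only the `re` module on the two href values quoted in the question. BeautifulSoup applies a compiled pattern to an attribute with its `search()` method, so the same call is used here. The asker's original pattern is not shown in the thread, so the unanchored pattern below is an assumption made for illustration.

```python
import re

# The two href values quoted in the question.
hrefs = [
    "/cn/ExpExhibitorList.aspx?categoryno=432",
    "ExpExhibitorList.aspx?categoryno=432003",
]

# Hypothetical unanchored pattern (an assumption; the original is not shown).
unanchored = re.compile(r"ExpExhibitorList\.aspx\?categoryno=[0-9]+")

# Anchored pattern taken from the answer.
anchored = re.compile(r"^ExpExhibitorList\.aspx\?categoryno=[0-9]+$")

# search() finds a match anywhere in the string unless anchors forbid it.
unanchored_hits = [h for h in hrefs if unanchored.search(h)]
anchored_hits = [h for h in hrefs if anchored.search(h)]

print(unanchored_hits)  # both hrefs
print(anchored_hits)    # only the href without the /cn/ prefix
```

Without anchors, `search()` is satisfied by a match in the middle of the string, which is why the '/cn/...' link comes back as well.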

Why don't you want that link? It matches your regular expression, so you get it. Please explain in more detail.
