Python: using a regular expression to extract links containing numbers in Selenium


Here is the list of links:

<a class="table-link" href="/tasks/document/new">Should review
</a></td>
<a class="table-link" href="/tasks/document/58324">Should review
</a></td>
<td>
<a class="table-link" href="/tasks/document/58325">AFCO certificate
</a></td>
<td>
<a class="table-link" href="/tasks/document/58325">Document Task
</a></td>
<td>
<a class="table-link" href="/tasks/document/58326">Pending
</a></td>
<td>
<a class="table-link" href="/tasks/document/58327">Cami  ltd
</a></td>
<td>
<a class="table-link" href="/tasks/document/58328">29 Sep 14:57
I am using the code below:
driver.find_elements_by_css_selector("a[href*='/tasks/document/']")


How can I modify it to pick up only the links that contain numbers?

Selenium has no such option: its CSS selectors cannot match against a regular expression.
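One way around this while staying in Python is to collect every href with the broad selector and then filter the strings with `re`. The sketch below is only an illustration: `numeric_links`, `DOC_LINK`, and the `example.com` URLs are made-up names and data, and the list stands in for values that would normally be read from the elements with `get_attribute("href")`.

```python
import re

# Matches hrefs that end in /tasks/document/<digits>, relative or absolute.
DOC_LINK = re.compile(r"/tasks/document/\d+/?$")


def numeric_links(hrefs):
    """Return only the hrefs whose last path segment is numeric."""
    return [h for h in hrefs if h and DOC_LINK.search(h)]


# Stand-in data (assumption). With Selenium the list would come from
# something like:
#   [e.get_attribute("href") for e in
#    driver.find_elements(By.CSS_SELECTOR, "a[href*='/tasks/document/']")]
sample = [
    "http://example.com/tasks/document/new",
    "http://example.com/tasks/document/58324",
    "/tasks/document/58325",
]

print(numeric_links(sample))
```

The pattern is anchored at the end of the string because browsers usually report `href` as an absolute URL, so only the tail of the value is checked.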


If needed, you can use Selenium to get the page source and feed it to a parser. You can then use a regular expression to find the elements you want.

This can be done using BeautifulSoup as follows:

html = """    
<a class="table-link" href="/tasks/document/new">Should review</a></td>
<a class="table-link" href="/tasks/document/58324">Should review</a></td>
<td>
<a class="table-link" href="/tasks/document/58325">AFCO certificate</a></td>
<td>
<a class="table-link" href="/tasks/document/58325">Document Task</a></td>
<td>
<a class="table-link" href="/tasks/document/58326">Pending</a></td>
<td>
<a class="table-link" href="/tasks/document/58327">Cami  ltd</a></td>
<td>
<a class="table-link" href="/tasks/document/58328">29 Sep 14:57"""

from bs4 import BeautifulSoup
import re

soup = BeautifulSoup(html, "html.parser")

# Keep only the links whose href contains /tasks/document/ followed by digits
for a in soup.find_all('a', href=re.compile(r'/tasks/document/\d+')):
    print(a['href'])

This gives you:

/tasks/document/58324
/tasks/document/58325
/tasks/document/58325
/tasks/document/58326
/tasks/document/58327
/tasks/document/58328

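Where is your code attempt? Stack Overflow expects users who ask a question to do their own research first and to share that research, their code attempts, and the results. This shows that you have taken the time to help yourself, saves us from repeating obvious answers, and above all helps you get a more specific and relevant answer.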
Note that print(a['href']) prints only the path, as listed above, while print(a) prints the whole <a> tag.
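If only the numbers themselves are wanted, a capture group can pull them out of the paths. This is a minimal sketch under assumptions: `document_ids` and `DOC_ID` are hypothetical names, and the list is a shortened copy of the paths shown above.

```python
import re

# Capture the run of digits that follows /tasks/document/
DOC_ID = re.compile(r"/tasks/document/(\d+)")


def document_ids(hrefs):
    """Return the numeric IDs found in hrefs, in first-seen order, without repeats."""
    ids = []
    for href in hrefs:
        match = DOC_ID.search(href)
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids


# Sample paths (58325 appears twice, as in the list above)
paths = [
    "/tasks/document/new",
    "/tasks/document/58324",
    "/tasks/document/58325",
    "/tasks/document/58325",
    "/tasks/document/58326",
]

print(document_ids(paths))
```

The IDs are kept as strings. Dropping the repeats is a choice made here for illustration; remove the `not in ids` check if every occurrence should be kept.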