Python: find all div tags with class="post-" followed by some digits?
Tags: python, regex, beautifulsoup
I want to find all div tags with class="post-<some number> some text". There are multiple div tags, for example:
<div class="post-3562 some text">
<div class="post-some text">
<div class="post-some text">
<div class="post-1324 some text">
<div class="post-4540 some text">
<div class="post-some text">
<div class="post-1122 some text">
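Is there a way to do what I want? Would a regular expression help?
Any help would be greatly appreciated.

The following regular expression matches your test cases: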
/<div +class= *"post-\d+.*>/g
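The pattern above is written in JavaScript-style /.../g notation. As a rough sketch of how it could be tried in Python (an assumption, since the answer does not name a language), re.findall plays the role of the g flag. The html string is the sample from the question, and the variable names are illustrative only.

```python
import re

# Sample markup from the question, one tag per line.
html = '''<div class="post-3562 some text">
<div class="post-some text">
<div class="post-some text">
<div class="post-1324 some text">
<div class="post-4540 some text">
<div class="post-some text">
<div class="post-1122 some text">'''

# Python form of /<div +class= *"post-\d+.*>/g. findall returns every match,
# and "." does not cross newlines, so each match stays on its own line.
pattern = re.compile(r'<div +class= *"post-\d+.*>')
matches = pattern.findall(html)

for tag in matches:
    print(tag)
```

This only works while each tag sits on its own line and class is the first attribute. For real pages an HTML parser is usually the more robust choice.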
You can pass a regular expression as the value of the class_ argument, like this:
soup.find_all(name='div', class_=re.compile(r'^post-\d+$'))
Full program:
from bs4 import BeautifulSoup
import re
soup = BeautifulSoup('''
<root>
<div class="post-3562 some text"/>
<xdiv class="post-9999 some text"/>
<div class="post-some text"/>
<div class="post-some text"/>
<div class="post-1324some text"/>
<div class="some post-4540 text"/>
<div class="post-some text"/>
<div class="some text post-1122"/>
</root>''', 'html.parser')
# Print every div that has a class of the form post-<digits>
for div in soup.find_all(name='div', class_=re.compile(r'^post-\d+$')):
    print(div)
Result:
<div class="post-3562 some text"></div>
<div class="some post-4540 text"></div>
<div class="some text post-1122"></div>
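The result follows from how class is handled: Beautiful Soup treats it as a multi-valued attribute and tries the pattern on each space-separated value, so the anchored pattern must match one whole class. Below is a minimal standard-library sketch of that per-token check, applied to the class strings of the div tags in the program. The helper name has_numbered_post_class is hypothetical, and the code only approximates the library's behaviour.

```python
import re

POST_ID = re.compile(r'^post-\d+$')

def has_numbered_post_class(class_attr):
    """Return True if any whitespace-separated class is 'post-' plus digits."""
    return any(POST_ID.search(token) for token in class_attr.split())

# Class strings of the div tags in the full program (the xdiv is excluded,
# because name='div' already filters it out).
div_classes = [
    "post-3562 some text",
    "post-some text",
    "post-1324some text",
    "some post-4540 text",
    "some text post-1122",
]

matching = [c for c in div_classes if has_numbered_post_class(c)]
print(matching)
```

"post-1324some" is rejected because the $ anchor requires the class to end right after the digits, while the position of the matching class within the attribute does not matter.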