Extracting HTML page content by keyword in Python
Using Python, I need to extract the surrounding context by matching a keyword. This is my Python script:
import requests
from bs4 import BeautifulSoup
import re
html = """ <pre>
Companies:
Telstra VI Huawei
Countries:
JPN CHN MLY
</pre>
<pre>
Data center:
US UK
</pre>"""
r = requests.get(html)
soup = BeautifulSoup(r.content, "html.parser")
k = soup.find(text=re.compile("companies:")).parent.text
print (k)
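What is your question? What is your current output?
@ThomasMunk Please look at the Python script I'm using; I want to print the expected output below. The current output is {}.
Expected output:
Companies:
Telstra VI Huawei
Try this: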
from simplified_scrapy import SimplifiedDoc
html = """ <pre>
Companies:
Telstra VI Huawei
Countries:
JPN CHN MLY
</pre>
<pre>
Data center:
US UK
</pre>"""
doc = SimplifiedDoc(html)
pre = doc.getElementByReg('Companies:')
print(pre.text)
print('-' * 50)
# Drop everything from 'Countries:' onward; the raw string keeps \s and \S as regex escapes
print(pre.replaceReg(r'Countries:[\s\S]*', '').strip())
Result:
Companies: Telstra VI Huawei Countries: JPN CHN MLY
--------------------------------------------------
Companies:
Telstra VI Huawei
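Two things in the question's script are worth noting. `requests.get` expects a URL, not a string of HTML, so markup that is already in memory can be handed to the parser directly (with BeautifulSoup, `BeautifulSoup(html, "html.parser")`). Also, the pattern `"companies:"` is lowercase while the page says `Companies:`, and regex matching is case-sensitive by default. As an alternative that needs no third-party package, here is a minimal sketch using only the standard library. The names `PreTextCollector` and `extract_section` are hypothetical, and the rule that a section ends at the next line ending in a colon is an assumption based on the sample data, not something stated in the question.

```python
from html.parser import HTMLParser

HTML = """ <pre>
Companies:
Telstra VI Huawei
Countries:
JPN CHN MLY
</pre>
<pre>
Data center:
US UK
</pre>"""


class PreTextCollector(HTMLParser):
    """Collect the text content of every <pre> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # text of each completed <pre> block
        self._inside = False   # True while between <pre> and </pre>
        self._buffer = []      # text chunks of the block being read

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._inside = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "pre" and self._inside:
            self.blocks.append("".join(self._buffer))
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)


def extract_section(html, keyword):
    """Return the keyword line and the lines that follow it.

    Assumption: a section ends at the next line that ends with ':'
    (the next label) or at the end of the <pre> block.
    The keyword is compared case-insensitively.
    """
    parser = PreTextCollector()
    parser.feed(html)
    parser.close()
    for block in parser.blocks:
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        for i, line in enumerate(lines):
            if line.lower() == keyword.lower():
                section = [line]
                for following in lines[i + 1:]:
                    if following.endswith(":"):
                        break  # reached the next label
                    section.append(following)
                return "\n".join(section)
    return None  # keyword not found in any <pre> block


print(extract_section(HTML, "companies:"))
```

With the sample HTML this prints the `Companies:` line followed by `Telstra VI Huawei`, and the function returns `None` when the keyword does not appear. If the page lives at a real URL, the markup would be fetched first and the response text passed in place of `HTML`.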