Python: using BeautifulSoup to put all dates from HTML tags into a list
This is my HTML file:
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>Today 10:52</span>
</small>]
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>April 11</span>
</small>]
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>April 5</span>
</small>]
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>February 29</span>
</small>]
Example
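from bs4 import BeautifulSoup

# Sample markup mirroring the <span> elements in the question's HTML
html = '<span><i data-icon="clock"></i>Today 10:52</span>' \
       '<span><i data-icon="clock"></i>April 11</span>' \
       '<span><i data-icon="clock"></i>April 5</span>' \
       '<span><i data-icon="clock"></i>February 29</span>'

soup = BeautifulSoup(html, "html.parser")
# Each <i data-icon="clock"> tag is empty, so its next_element is the date text that follows it
data = [item.next_element for item in soup.find_all('i', {'data-icon': 'clock'})]
print(data)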
Output:
['Today 10:52', 'April 11', 'April 5', 'February 29']
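The example above depends on each `<i data-icon="clock">` tag having no children, so that `next_element` is the text node right after it. The following is a minimal sketch of the same idea, assuming markup shaped like the question's; the names `icon`, `node`, and `dates` are illustrative only. It converts each text node to a plain, stripped `str`:

```python
from bs4 import BeautifulSoup

# Two sample entries shaped like the question's markup (assumed input)
html = (
    '<span><i data-icon="clock"></i>Today 10:52</span>'
    '<span><i data-icon="clock"></i>April 11</span>'
)

soup = BeautifulSoup(html, "html.parser")

dates = []
for icon in soup.find_all("i", {"data-icon": "clock"}):
    # <i> has no children, so the next parsed node is the date text
    node = icon.next_element
    dates.append(str(node).strip())

print(dates)
```

If the `<i>` tag ever contained child nodes, `next_element` would return the first child instead, so this approach is only as stable as the markup.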
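What is the problem? It looks like you forgot to ask a question.
Look, it doesn't work that way. This is far from enough.
"This is my HTML file:" It doesn't look like valid HTML?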
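from bs4 import BeautifulSoup

def get_dates(html):
    # The 'lxml' parser requires the lxml package to be installed
    soup = BeautifulSoup(html, 'lxml')
    # Collect the text of every <small> element that carries the "breadcrumb" class
    dates = [tag.get_text(strip=True) for tag in soup.find_all('small', class_='breadcrumb')]
    print(dates)
    return dates

# html is assumed to be a response object fetched earlier (e.g. with requests)
get_dates(html.text)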
from bs4 import BeautifulSoup

html = '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>Today 10:52</span></small>' \
       '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 11</span></small>' \
       '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 5</span></small>' \
       '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>February 29</span></small>'

# The "lxml" parser requires the lxml package to be installed
soup = BeautifulSoup(html, features="lxml")
date_list = []
# Match <small> elements whose class attribute is exactly "breadcrumb x-normal"
dates = soup.find_all('small', {'class': 'breadcrumb x-normal'})
for date in dates:
    print(date.text)
    date_list.append(date.text)
print(date_list)
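The snippet above builds its HTML as one string with no line breaks, whereas the HTML in the question puts each tag on its own line. In that case `.text` would also return the surrounding newlines. The following is a minimal sketch, assuming multi-line markup like the question's, that strips the whitespace with `get_text(strip=True)`. It uses the built-in `html.parser` only so that lxml is not needed:

```python
from bs4 import BeautifulSoup

# Multi-line markup shaped like the HTML in the question (assumed input)
html = """
<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>Today 10:52</span>
</small>
<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>February 29</span>
</small>
"""

soup = BeautifulSoup(html, "html.parser")

# class_="breadcrumb" matches any <small> whose class list includes "breadcrumb";
# get_text(strip=True) drops the newlines around the date text
date_list = [
    small.get_text(strip=True)
    for small in soup.find_all("small", class_="breadcrumb")
]

print(date_list)
```

Matching on a single class with `class_` is a little more forgiving than matching the full string "breadcrumb x-normal", because it still works if the order of the classes changes.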