Python: Putting all dates from HTML tags into a list with BeautifulSoup

This is my HTML file:

[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>Today 10:52</span>
</small>]
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>April 11</span>
</small>]
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>April 5</span>
</small>]
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>February 29</span>
</small>]
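Example:

from bs4 import BeautifulSoup

html = '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>Today 10:52</span></small>' \
       '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 11</span></small>' \
       '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 5</span></small>' \
       '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>February 29</span></small>'
soup = BeautifulSoup(html, "html.parser")
# The date text is the node that immediately follows each <i data-icon="clock"> tag
data = [item.next_element for item in soup.find_all('i', {'data-icon': 'clock'})]
print(data)

Output:

['Today 10:52', 'April 11', 'April 5', 'February 29']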
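What is the problem? It looks like you forgot to ask a question.
It doesn't work like this. This is far from enough information.
"This is my HTML file:" It doesn't look like valid HTML?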
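
from bs4 import BeautifulSoup

def get_dates(html):
    soup = BeautifulSoup(html, 'lxml')
    # The date text sits right after each <i data-icon="clock"> tag
    dates = [i.next_element for i in soup.find_all('i', {'data-icon': 'clock'})]
    print(dates)

# html is assumed to be a response object fetched elsewhere (e.g. with requests)
get_dates(html.text)
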
from bs4 import BeautifulSoup

html = '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>Today 10:52</span></small>' \
       '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 11</span></small>' \
       '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 5</span></small>' \
       '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>February 29</span></small>'

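# The "lxml" parser requires the lxml package to be installed
soup = BeautifulSoup(html, features="lxml")
date_list = []
# Match <small> tags whose class attribute is exactly "breadcrumb x-normal"
dates = soup.find_all('small', {'class': 'breadcrumb x-normal'})

for date in dates:
    print(date.text)
    date_list.append(date.text)

print(date_list)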
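
The HTML in the question has line breaks inside each <small> tag, so .text would return the dates with surrounding newlines. Below is a minimal sketch of one way to handle that, assuming the markup looks like the question's snippet. The helper name extract_dates and the shortened HTML sample are illustrative assumptions, not part of the original answers.

```python
from bs4 import BeautifulSoup

# Sample markup assumed to match the question's snippet, line breaks included.
HTML = """
<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>Today 10:52</span>
</small>
<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>April 11</span>
</small>
"""


def extract_dates(html):
    """Return the whitespace-stripped text of every <small class="breadcrumb ..."> tag.

    extract_dates is a hypothetical helper written for this illustration.
    """
    # "html.parser" is the parser bundled with Python, so lxml is not needed.
    soup = BeautifulSoup(html, "html.parser")
    # class_="breadcrumb" matches tags that have "breadcrumb" among their classes.
    return [tag.get_text(strip=True) for tag in soup.find_all("small", class_="breadcrumb")]


dates = extract_dates(HTML)
print(dates)
```

Passing strip=True drops the newline-only text nodes around the <span>, so each list entry holds only the date string.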