Python BeautifulSoup: get the content of a specific child
I am currently building a crawler with BeautifulSoup. I want to get the data from specific children of an unordered list. The web page basically looks like this:
<div class= product-list-item--usp-list>
<ul class="unordered-list">
<li>a</li>
<li>b</li>
<li>c</li>
</ul>
So I tried:
a = item.find("ul", class_="unordered-list").li[1]
b = item.find("ul", class_="unordered-list").li[2]
This is the error I get:
a = item.find("ul", class_="unordered-list").li[1]
File "/usr/local/lib/python2.7/dist-packages/bs4/element.py", line 905, in __getitem__
return self.attrs[key]
KeyError: 1
[Finished in 2.9s with exit code 1]
My question is: how do I get the content of child[1] and child[2]?
Thanks in advance.
You can do it like this:
>>> from bs4 import BeautifulSoup
>>> s = """<div class= product-list-item--usp-list>
<ul class="unordered-list">
<li>a</li>
<li>b</li>
<li>c</li>
</ul> """
>>> soup = BeautifulSoup(s, "html.parser")
>>> foo = soup.find("ul", class_="unordered-list")
>>> [i.text for i in foo.find_all('li')[1:]]
['b', 'c']
@henk [1:] is a list slice, so the result starts at the second element.
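To spell out why the original attempt raised KeyError: 1, here is a minimal sketch. It assumes the markup from the question (with the div's class quoted and a closing div added so the snippet is well formed) and uses the "html.parser" backend. The variable names are only illustrative. `.li` returns a single Tag, the first li, and square brackets on a Tag look up an HTML attribute, which is why an integer key fails. Indexing the result of find_all works because that is a list-like object.

```python
from bs4 import BeautifulSoup

# Markup from the question; the quoted class and closing </div> are assumptions.
html = """<div class="product-list-item--usp-list">
<ul class="unordered-list">
<li>a</li>
<li>b</li>
<li>c</li>
</ul>
</div>"""

soup = BeautifulSoup(html, "html.parser")
ul = soup.find("ul", class_="unordered-list")

# .li is shorthand for .find("li"): it returns only the first <li> Tag.
first = ul.li

# Square brackets on a Tag read an HTML attribute, so an integer key fails.
try:
    first[1]
    index_error = None
except KeyError as exc:
    index_error = exc

# find_all returns a list-like ResultSet, which supports integer indexing.
items = ul.find_all("li")
second_text = items[1].get_text()
third_text = items[2].get_text()

# Alternative: a CSS selector that matches only the direct <li> children.
texts = [li.get_text() for li in soup.select("ul.unordered-list > li")]
```

Indices are zero-based, so items[1] is the second li and items[2] is the third. If the list might be shorter than expected, checking len(items) first avoids an IndexError.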