Python: parsing <ul> tags with Beautiful Soup

Consider the following code:

divTag = soup.find_all("div", {"class":"classname"})
print(divTag)
for tag in divTag:
    ulTag = soup.find_all("ul", {"class":"classname"})
    print(ulTag)
    for tag in ulTag:
        liTag = soup.find_all("li", {"class":"classname"})
        print(liTag)
        for tag in liTag:
            diTag = soup.find_all("div", {"class":"classname"})
            print(diTag)
            for tag in diTag:
                aTags = tag.find_next("a")
                value = aTags.string
                print(value)
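It only prints "divTag" and "ulTag". I am sure all the class names are correct. There are about 7 "li" tags inside the "ul" tag, but none of the "li" tags is printed. Please help. Thanks in advance.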

Update:

<div class="classname">
<ul auto-load="true" class="classname" data-href="">
<li class="classname">
<div class="classname"><a href="">"value"</a>  string <a href="">string1</a> <a class="muted"><abbr class="timeago" title=" 1 Jun, 2015, 10:23 am">7 hours ago</abbr></a>
</div>
</li>
<li>
</li>
</ul>
</div>

Basically, I want to extract the "string" value in the 'a' tags.

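Every call in your loops searches the whole soup again instead of the tag you are currently iterating over, which is why it fails. You should search for each tag inside its parent tag. Try it like this: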

# Search each level inside the tag found at the previous level, not inside soup.
divTag = soup.find_all("div", {"class": "classname"})
for div in divTag:
    for liTag in div.find_all("li", {"class": "classname"}):
        for tag in liTag.find_all("div", {"class": "classname"}):
            for aTag in tag.find_all("a"):
                print(aTag.string)
For the HTML you provided, the output is:

"value"
string1
7 hours ago
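
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same idea: each find_all is called on the tag returned by the previous level rather than on soup. The markup is copied from the question; the helper name link_texts, the use of the built-in "html.parser", and the switch to get_text(strip=True) are assumptions of this sketch, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (hrefs are empty there as well).
HTML = """
<div class="classname">
<ul auto-load="true" class="classname" data-href="">
<li class="classname">
<div class="classname"><a href="">"value"</a>  string <a href="">string1</a> <a class="muted"><abbr class="timeago" title=" 1 Jun, 2015, 10:23 am">7 hours ago</abbr></a>
</div>
</li>
<li>
</li>
</ul>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")


def link_texts(root):
    """Hypothetical helper: collect the text of every <a>, searching each
    level inside the tag found at the previous level."""
    texts = []
    for ul in root.find_all("ul", class_="classname"):
        for li in ul.find_all("li", class_="classname"):
            for div in li.find_all("div", class_="classname"):
                for a in div.find_all("a"):
                    texts.append(a.get_text(strip=True))
    return texts


texts = link_texts(soup)
print(texts)
```

Starting from the ul instead of the outer div avoids walking the inner div a second time, since find_all("div", ...) on the soup matches both the outer and the inner div.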

Complete solution using next_sibling:

ulTag = soup.find("ul", {"class": "classname"})
aTags = ulTag.find_all("a")
for aTag in aTags:
    # next_sibling is the node right after the <a>, here the text between links
    sibling = aTag.next_sibling
    siblingString = str(sibling).strip()
    if len(siblingString) > 0:
        print(siblingString)
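
A slightly more defensive variant of the same approach, as a sketch: it keeps a sibling only when it is a text node, so an 'a' with no following node (None) or one followed directly by another tag is skipped instead of being printed. The helper name text_after_links is hypothetical, and the markup is the inner div from the question with the abbr attributes trimmed.

```python
from bs4 import BeautifulSoup, NavigableString

# Inner <div> from the question, with the <abbr> attributes trimmed for brevity.
HTML = (
    '<div class="classname"><a href="">"value"</a>  string '
    '<a href="">string1</a> <a class="muted"><abbr>7 hours ago</abbr></a>\n'
    "</div>"
)

soup = BeautifulSoup(HTML, "html.parser")


def text_after_links(root):
    """Hypothetical helper: return the non-empty text that directly
    follows each <a> inside root."""
    found = []
    for a in root.find_all("a"):
        sibling = a.next_sibling
        # Keep text nodes only; next_sibling may also be None or another Tag.
        if isinstance(sibling, NavigableString):
            text = sibling.strip()
            if text:
                found.append(text)
    return found


result = text_after_links(soup)
print(result)
```

On this markup only the first 'a' is followed by non-whitespace text, so the list holds the single value the question asks for.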

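Please edit your question and include a sample of the HTML you want to parse. Also, please follow the usual Python convention of four spaces per indentation level.

I have added the HTML code, please take a look.

Do you want the value of each a tag, or the string between the first two a tags?

The string value between the first two 'a' tags @Michael

I get the following error: AttributeError: 'ResultSet' object has no attribute 'findall'

The error is gone, but nothing is printed.. :(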
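
Regarding the AttributeError in the comments, a short sketch of what is probably going on: find_all returns a ResultSet, which behaves like a list of tags, so a further search has to be called on each element rather than on the result itself. In bs4 the method is spelled find_all. The markup below is made up for illustration and is not from the question.

```python
from bs4 import BeautifulSoup

# Made-up markup, only to show the difference between a ResultSet and a Tag.
soup = BeautifulSoup(
    "<ul><li>one</li><li>two</li></ul><ul><li>three</li></ul>", "html.parser"
)

uls = soup.find_all("ul")  # list-like ResultSet, not a single Tag

try:
    uls.find_all("li")  # searching on the ResultSet itself fails
    raised = False
except AttributeError:
    raised = True

# Search inside each element of the ResultSet instead.
items = [li.get_text() for ul in uls for li in ul.find_all("li")]
print(raised, items)
```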