Python: How to define findAll for nested HTML tags using BeautifulSoup


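Given some HTML, I tried using findAll, but I don't know how to refine the find statement any further. I also tried another approach: converting the result of findAll() into an array and then looking for a pattern in where the needles occur, but I couldn't find a consistent pattern.

Thanks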

infoText is a list. You should iterate over it.

soup = util.mysoupopen(theexample) 
infoText = soup.findAll("table", {"class": "the class"})

>>> for info in infoText:
...     print info.tr.td.a

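You can then access the <a> element. If you expect only one table element with class "theclass" in the document, then

soup.find("table", {"class": "theclass"})

will give you that table directly.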
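As a minimal sketch of the two calls discussed above, assuming the modern `bs4` package on Python 3 (where `find_all` is the current spelling of `findAll`) rather than the BeautifulSoup 3 module used in this thread. The sample markup and the variable names are illustrative and not taken from the question:

```python
from bs4 import BeautifulSoup

# Hypothetical sample document: one link outside the table, two inside it.
html = """<a href="www.example.com/"></a>
<table class="theclass">
<tr><td><a href="www.example.com/two">two</a></td></tr>
<tr><td><a href="www.example.com/three">three</a></td></tr>
</table>"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet, so iterate over it.
tables = soup.find_all("table", {"class": "theclass"})
first_links = [table.tr.td.a for table in tables]

# find returns the first matching Tag directly, or None if nothing matches.
table = soup.find("table", {"class": "theclass"})
first_link = table.tr.td.a if table is not None else None

print(first_links)
print(first_link["href"])
```

Attribute-style access such as `table.tr.td.a` only follows the first matching child at each level, so it reaches the first link in the first row and nothing else.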

If I understand your question correctly, this Python code should work: iterate over all tables with class="theclass", then find the links inside each one.

>>> for info in infoText:
...     print info.tr.td.a
...
<a href="www.example.com/two">two</a>


I got this error and I don't know why:

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    print info.tr.td.a
  File "/nfs/home/j/d/jdiaz/cs171/BeautifulSoup.py", line 402, in __getattr__
AttributeError: 'NavigableString' object has no attribute 'tr'

What do you want to scrape? You said "how can I scrape only the table class = "the class"?" Do you mean the links?
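One plausible cause of an error like this, offered as a guess since the failing script is not shown: iterating over a single Tag (for example the result of `find`) walks its children, and the whitespace between tags shows up as NavigableString objects, which have no `.tr`. A minimal sketch assuming `bs4` on Python 3, with made-up markup:

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical markup; the newlines between tags become text nodes.
html = """<table class="theclass">
<tr><td><a href="www.example.com/two">two</a></td></tr>
</table>"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", {"class": "theclass"})  # a single Tag, not a list

# Iterating over a Tag yields its direct children, text nodes included.
children = list(table)
kinds = [type(child).__name__ for child in children]
print(kinds)

# Asking a text node for a nested tag raises AttributeError.
try:
    children[0].tr
    raised = False
except AttributeError:
    raised = True
print(raised)

# Guard against text nodes before drilling into nested tags.
links = [child.td.a for child in children if isinstance(child, Tag)]
print(links)
```

If this is what happened, looping over the result of `findAll` instead of over a single table avoids the text nodes entirely.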


>>> foo = """<a href="www.example.com/"></a>
... <table class="theclass">
... <tr><td>
... <a href="www.example.com/two">two</a>
... </td></tr>
... <tr><td>
... <a href ="www.example.com/three">three</a>
... <span>blabla<span>
... </td></td>
... </table>
... """
>>> import BeautifulSoup as bs
>>> soup = bs.BeautifulSoup(foo)
>>> for table in soup.findAll('table', {'class':'theclass'} ):
...     links=table.findAll('a')
... 
>>> print links
[<a href="www.example.com/two">two</a>, <a href="www.example.com/three">three</a>]
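Note that in the session above `links` is reassigned on every pass through the loop, so with more than one matching table only the last table's links would be printed. A sketch of collecting the links from every matching table, assuming `bs4` on Python 3; the second table and the `select` call are additions for illustration only:

```python
from bs4 import BeautifulSoup

# Hypothetical document with two tables that share the same class.
html = """<a href="www.example.com/"></a>
<table class="theclass">
<tr><td><a href="www.example.com/two">two</a></td></tr>
</table>
<table class="theclass">
<tr><td><a href="www.example.com/three">three</a></td></tr>
</table>"""

soup = BeautifulSoup(html, "html.parser")

# Accumulate links across all matching tables instead of overwriting.
links = []
for table in soup.find_all("table", class_="theclass"):
    links.extend(table.find_all("a"))

hrefs = [a["href"] for a in links]
print(hrefs)

# The same nested search written as a CSS selector.
css_hrefs = [a["href"] for a in soup.select("table.theclass a")]
print(css_hrefs)
```

Both forms skip the link that sits outside the tables, because the search for `a` starts from each matching table rather than from the document root.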