Is there any way to count the number of table tags using BeautifulSoup?


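I am trying to display tables from a Wikipedia page, but which table to show will be specified by the user.

So my idea is to get the number of table tags on the page, loop over them up to the number the user specified, and then display that table.

So far, I can only display the contents of the tables.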

from bs4 import BeautifulSoup
from urllib.request import Request, urlopen  # Python 3 replacement for urllib2

wiki = "http://en.wikipedia.org/wiki/List_of_Test_cricket_records"
header = {'User-Agent': 'Mozilla/5.0'}  # Needed to prevent 403 error on Wikipedia

req = Request(wiki, headers=header)
page = urlopen(req)
soup = BeautifulSoup(page, "html.parser")  # name the parser explicitly

# find_all returns a list-like ResultSet with every <table> on the page
table = soup.find_all("table")

for row in table:
    td = row.find_all("tr")
    for data in td:
        cells = data.find_all("td")
        # print the text of each cell in this row
        print([cell.get_text(strip=True) for cell in cells])

Is there a better way? Please guide me.
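One way to approach the question is sketched below: the result of find_all("table") behaves like a list, so len() gives the number of table tags and indexing picks out the one the user asked for. This is only a minimal sketch. SAMPLE_HTML, count_tables and get_table_rows are hypothetical names introduced for illustration, the inline HTML stands in for the fetched Wikipedia page, and treating the user's number as 1-based is an assumption.

```python
from bs4 import BeautifulSoup

# Stand-in HTML (an assumption for illustration); in practice this would be
# the page fetched from Wikipedia.
SAMPLE_HTML = """
<html><body>
<table><tr><td>a1</td><td>a2</td></tr></table>
<table><tr><td>b1</td><td>b2</td></tr><tr><td>b3</td><td>b4</td></tr></table>
<table><tr><th>head</th></tr><tr><td>c1</td></tr></table>
</body></html>
"""


def count_tables(soup):
    """Return how many <table> tags the parsed document contains."""
    return len(soup.find_all("table"))


def get_table_rows(soup, table_number):
    """Return the rows of the table at 1-based position table_number.

    Each row is a list of cell texts. Raises IndexError when out of range.
    """
    tables = soup.find_all("table")
    if not 1 <= table_number <= len(tables):
        raise IndexError(f"table_number must be between 1 and {len(tables)}")
    table_tag = tables[table_number - 1]  # exactly one Tag, no outer loop needed
    rows = []
    for tr in table_tag.find_all("tr"):
        cells = tr.find_all(["th", "td"])  # header and data cells alike
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print("Number of tables:", count_tables(soup))
for row in get_table_rows(soup, 2):
    print(row)
```

Note that find_all searches recursively, so a table nested inside another table is counted as well. If only top-level tables matter, the count would need extra filtering.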

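In the list you named table, you already have all 50 tables. In the first for loop you iterate over those tables, so it would be better to write "for table_tag in table:" rather than "for row in table:". So what is the question?

Yes, it runs, but I am looking for a way to get the table number specified by the user. Naming it table_tag is a good idea. Thanks.

Maria, with your own solution (since deleted) you got the error AttributeError: 'NavigableString' object has no attribute 'find_all' because you were trying to iterate over a bs4.element.Tag. The "for row in table" loop is no longer needed, because after writing table = soup.findAll("table")[2] you have exactly one table. Just remove the outer for loop; with that, I got a function that does what is needed.

Thank you, Oliver. I realized my mistake and deleted the solution and the traceback. Thanks for the prompt reply.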
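The AttributeError mentioned in the comments can be illustrated with a small sketch. Indexing the result of find_all yields a single Tag, and iterating over a Tag walks its direct children, which include the whitespace between tags as NavigableString objects; those have no find_all method. The inline HTML and the variable names below are assumptions made for illustration, not the asker's deleted code.

```python
from bs4 import BeautifulSoup

# Small stand-in document (assumption); the newlines matter, because they
# become NavigableString children of the <table> tag.
html = "<table>\n<tr><td>x</td></tr>\n<tr><td>y</td></tr>\n</table>"
soup = BeautifulSoup(html, "html.parser")

# Indexing the result gives one bs4.element.Tag, not a list of tables.
table = soup.find_all("table")[0]

# Iterating over the Tag yields its direct children: text nodes and <tr> tags.
child_types = [type(child).__name__ for child in table]
print(child_types)

# The fix suggested in the comments: drop the outer loop and call find_all
# directly on the single table.
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all("td")]
    for tr in table.find_all("tr")
]
print(rows)
```

Calling find_all on the table itself searches only Tag descendants, so the stray text nodes never come into play.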