Python 3.x Beautiful Soup: need help scraping tables in different sections of the same page

python-3.x web-scraping

I am trying to scrape a website that has multiple tables in different sections of the same page.

import requests
from bs4 import BeautifulSoup

url = "https://www.predictit.org/Contract/5367/Will-Donald-Trump-be-president-at-year-end-2018#prices"

r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html.parser")

# find() returns only the first <tbody> in the downloaded HTML
table_body = soup.find('tbody')
rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [x.text.strip() for x in cols]
    print(cols)

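The page has multiple sections, and each section has a different table. I am trying to scrape the price data in the "#prices" section and I specified that in the URL, but BeautifulSoup defaults to the table in the first section, "#data". Is there a way to navigate to the section I want?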
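
For context on why adding "#prices" changes nothing: the part of a URL after "#" is a fragment that the browser handles locally, and it is not included in the HTTP request. A minimal sketch using only the standard library shows what remains once the fragment is dropped:

```python
from urllib.parse import urldefrag

url = "https://www.predictit.org/Contract/5367/Will-Donald-Trump-be-president-at-year-end-2018#prices"

# urldefrag splits a URL into the part a server can see and the fragment.
base_url, fragment = urldefrag(url)

print(base_url)   # the same address without "#prices"
print(fragment)   # prices
```

So the server returns the same document with or without "#prices", and the parser sees the same HTML either way.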

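In this case, you need to send the request to the URL below to get the prices you want to parse. The "#prices" fragment is never sent to the server, and the price table appears to be loaded from a separate endpoint. You can find that URL with your browser's dev tools.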

import requests
from bs4 import BeautifulSoup

url = "https://www.predictit.org/PrivateData/GetPriceListAjax?contractId=5367"

res = requests.get(url)
soup = BeautifulSoup(res.text, "html.parser")

# Skip the first <tr>, then collect the stripped text of each cell
for row in soup.select('table tr')[1:]:
    cols = [x.text.strip() for x in row.select('td')]
    print(cols)

Output:

['Price', 'Shares', '', 'Price', 'Shares']
['81¢', '289', '', '80¢', '2192']
['82¢', '7936', '', '79¢', '5478']
['83¢', '12800', '', '78¢', '6189']
['84¢', '8846', '', '77¢', '6167']
['85¢', '7726', '', '76¢', '2334']
['86¢', '7247', '', '75¢', '3268']
['87¢', '5562', '', '74¢', '2425']
['88¢', '4988', '', '73¢', '1390']
['89¢', '2889', '', '72¢', '3836']
['90¢', '4143', '', '71¢', '944']
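
If you need numbers rather than strings, rows like the ones printed above can be converted in a small post-processing step. This is only a sketch: the helper names `parse_cents` and `split_row` are made up for illustration, and it assumes every data row has the five-column shape shown in the output, with an empty spacer cell in the middle. The two sides are called "left" and "right" here because the output alone does not say what each side means.

```python
def parse_cents(text):
    """Convert a price string such as '81¢' to an integer number of cents."""
    return int(text.rstrip("¢"))


def split_row(cols):
    """Split one five-column row into (price, shares) pairs for each side."""
    left = (parse_cents(cols[0]), int(cols[1]))
    right = (parse_cents(cols[3]), int(cols[4]))
    return left, right


# A few rows copied from the output above; the first one is the header.
rows = [
    ['Price', 'Shares', '', 'Price', 'Shares'],
    ['81¢', '289', '', '80¢', '2192'],
    ['82¢', '7936', '', '79¢', '5478'],
]

left_side, right_side = [], []
for cols in rows[1:]:  # skip the header row
    left, right = split_row(cols)
    left_side.append(left)
    right_side.append(right)

print(left_side)
print(right_side)
```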

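Try it and let me know if you run into any problems. Thanks. You may also want to look at Selenium, which helps when the data is not available at a public URL (in this case, it is).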
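
For pages where every section's table is already present in the downloaded HTML (which is not the case for this site), the lookup can be scoped to one section instead of taking the first `tbody`. A minimal sketch on made-up markup follows; the `id` values `data` and `prices` and the cell contents are assumptions for illustration, not the real page structure.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML with one table per section.
html = """
<div id="data"><table><tbody><tr><td>first</td></tr></tbody></table></div>
<div id="prices"><table><tbody><tr><td>81¢</td><td>289</td></tr></tbody></table></div>
"""
soup = BeautifulSoup(html, "html.parser")

# find('tbody') on the whole document returns the first table body.
first_cells = [td.get_text(strip=True) for td in soup.find("tbody").find_all("td")]

# Narrow the search to the element whose id is "prices" first.
section = soup.find(id="prices")
price_rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in section.find_all("tr")
]

print(first_cells)
print(price_rows)
```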