Need help parsing a PHP/HTML file with Python

python html parsing

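I want to scrape a URL and dump the data into a pandas DataFrame.

The columns would be, for example, Horse / Date / Course / Cause of death. I tried pointing pandas read_html directly at this URL, but it did not find a table, even though the page has a table tag.

I tried using: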

  import requests
  from bs4 import BeautifulSoup

  url = 'https://www.horsedeathwatch.com/index.php'
  # Fetch the page; the response object holds the page contents
  page = requests.get(url)
  # print(page.text)
  soup = BeautifulSoup(page.content, 'lxml')

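and then the find_all('tr') method, but for some reason I could not get it to work.

The second thing I want to do: each horse (the first column of the table on the web page) has a hyperlink that leads to additional attributes.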

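Any suggestions on how to retrieve those additional attributes into a pandas DataFrame?

Looking at the site, I can see that the data is loaded by a POST request to /loaddata.php that passes the page number. Combined with pandas.read_html: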

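import io
import requests
import pandas

# The table rows come from a POST to loaddata.php with the page number
res = requests.post('https://www.horsedeathwatch.com/loaddata.php', data={'page': '3'})
# read_html returns a list of DataFrames, one per <table> it finds;
# recent pandas versions expect a file-like object rather than a raw string
tables = pandas.read_html(io.StringIO(res.text))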
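
If more than one page is needed, the same POST request can be repeated per page number and the results stacked. The following is only a minimal sketch: the helper names table_from_html and fetch_pages are made up for illustration, and it assumes that every response from loaddata.php contains a table element that read_html can find, which has not been checked for every page.

```python
import io

import pandas
import requests

LOAD_URL = 'https://www.horsedeathwatch.com/loaddata.php'


def table_from_html(html):
    """Parse the first <table> in an HTML string into a DataFrame."""
    return pandas.read_html(io.StringIO(html))[0]


def fetch_pages(first, last):
    """POST once per page number and stack the resulting tables.

    Hypothetical helper: assumes each response holds one table.
    """
    frames = []
    for page in range(first, last + 1):
        res = requests.post(LOAD_URL, data={'page': str(page)}, timeout=30)
        res.raise_for_status()
        frames.append(table_from_html(res.text))
    return pandas.concat(frames, ignore_index=True)
```

Calling fetch_pages(1, 3) would then request pages 1 to 3 and return them as one DataFrame with a fresh index.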

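BeautifulSoup will probably give you a richer data structure, though, because if you want to pull more attributes for each horse you need to take the 'href' of the anchor element and make another request. That one is a GET request, and you then need to parse the relevant content out of its response.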
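
The link-following step could look roughly like the sketch below. The function names horse_links and fetch_detail are hypothetical, and the code assumes that each data row keeps the horse's link in its first td cell. It uses Python's built-in 'html.parser' because that parser keeps bare tr rows even when the response is only a fragment; what to extract from each detail page depends on that page's markup, so that part is left open.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

BASE_URL = 'https://www.horsedeathwatch.com/'


def horse_links(html, base_url=BASE_URL):
    """Return (name, absolute URL) for the anchor in each row's first cell."""
    soup = BeautifulSoup(html, 'html.parser')
    links = []
    for row in soup.find_all('tr'):
        cell = row.find('td')  # header rows made of <th> cells are skipped
        anchor = cell.find('a', href=True) if cell else None
        if anchor:
            name = anchor.get_text(strip=True)
            links.append((name, urljoin(base_url, anchor['href'])))
    return links


def fetch_detail(url):
    """GET one horse's detail page and return it as parsed soup."""
    res = requests.get(url, timeout=30)
    res.raise_for_status()
    return BeautifulSoup(res.content, 'html.parser')
```

Passing the text of a loaddata.php response to horse_links gives the list of detail URLs; each one can then be handed to fetch_detail and the resulting fields merged into the DataFrame by horse name.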