Python: pandas DataFrame for scraped football scores

python pandas selenium

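I'm new to Python and web scraping. I'm trying to import live college football scores (from ESPN) into a pandas DataFrame using BeautifulSoup. I've searched high and low and can't seem to get the scores imported in the right format.

Once I have the data in a DataFrame, I'll export the results to Excel.

This is what I have so far. The results come out as a single column: all the teams first, followed by all the scores.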


from bs4 import BeautifulSoup
from selenium import webdriver
import pandas as pd


# Selenium 3 style; newer Selenium releases take a Service object instead of executable_path.
driver = webdriver.Chrome(executable_path=r'C:\Users\Jims Maximus Hero\Desktop\chromedriver.exe')
driver.get("https://www.espn.com/college-football/scoreboard/_/group/80/year/2019/seasontype/2/week/11")

# Hand the rendered page to BeautifulSoup.
html = driver.page_source
soup = BeautifulSoup(html, "lxml")

# Prints every team name first ...
for tag in soup.find_all("span", {"class":"sb-team-short"}):
    print(tag.text)

# ... and then every total, which is why the output is one long column.
for tag in soup.find_all("td", {"class":"total"}):
    print(tag.text)

Thanks for any help.

Try this:

driver.get('https://www.espn.com/college-football/scoreboard/_/group/80/year/2019/seasontype/2/week/11')

# Each game on the scoreboard is an <article>; its id is reused below to find that game's table.
# (find_element(s)_by_xpath is the Selenium 3 API; Selenium 4 uses find_element(By.XPATH, ...).)
results = driver.find_elements_by_xpath("//article[contains(@class, 'scoreboard football')]")
df = pd.DataFrame()
for result in results:
    # Let pandas parse the HTML of this game's score table.
    score  = pd.read_html(driver.find_element_by_xpath('//*[@id="'+str(result.get_attribute('id'))+'"]/div/div/section/div/table').get_attribute('outerHTML'))
    # Keep only rows with at least 4 non-NaN values.
    score = score[0].dropna(axis=0, thresh=4)
    df = pd.concat([df,score])
print(df)

This produces a single DataFrame containing the score-table rows of every game on the page.
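If you would rather stay with the BeautifulSoup loops from the question, the two flat lists can also be paired up by position. The following is only a minimal sketch: the sample lists, the function name `pair_scores`, and the column names are made up for illustration, and it assumes (not verified against ESPN's markup) that teams and totals appear in away/home pairs in the same order.

```python
import pandas as pd

# Hypothetical sample of what the two loops in the question collect:
# team names and totals as text, in page order, two entries per game.
teams = ["Team A", "Team B", "Team C", "Team D"]
totals = ["26", "31", "46", "41"]


def pair_scores(teams, totals):
    """Turn two flat, equally long lists into one row per game.

    Assumes entries come in away/home pairs, in the same order in both lists.
    """
    if len(teams) != len(totals) or len(teams) % 2:
        raise ValueError("expected an even number of teams and one total per team")

    rows = []
    for i in range(0, len(teams), 2):
        rows.append({
            "away_team": teams[i],
            "away_score": totals[i],
            "home_team": teams[i + 1],
            "home_score": totals[i + 1],
        })

    games = pd.DataFrame(rows)
    # tag.text is a string; non-numeric text (e.g. an empty cell) becomes NaN.
    for col in ("away_score", "home_score"):
        games[col] = pd.to_numeric(games[col], errors="coerce")
    return games


games = pair_scores(teams, totals)
print(games)
```

From there, something like `games.to_excel("scores.xlsx", index=False)` should cover the Excel step, provided an Excel writer such as openpyxl is installed; the file name is just a placeholder.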

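Once you have the page_source, have you tried using pd.read_html() and extracting your tables that way? It also looks like you could use {'class': 'scoreboard wrapper'} to target specific divs.

Thanks for the help. I'll make the changes tomorrow and get back to you. I'm still learning.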
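As a rough illustration of the pd.read_html() suggestion, the sketch below parses an HTML string directly, with no per-element XPath lookups. The inline `html` string is a made-up stand-in for `driver.page_source`; the real ESPN page will have different columns and possibly unrelated tables, so treat the names here as assumptions. It also assumes an HTML parser that pandas can use (such as lxml) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for driver.page_source: two small score tables.
html = """
<table>
  <tr><th>Team</th><th>Q1</th><th>Q2</th><th>Q3</th><th>Q4</th><th>Total</th></tr>
  <tr><td>Team A</td><td>7</td><td>3</td><td>0</td><td>10</td><td>20</td></tr>
  <tr><td>Team B</td><td>0</td><td>14</td><td>7</td><td>3</td><td>24</td></tr>
</table>
<table>
  <tr><th>Team</th><th>Q1</th><th>Q2</th><th>Q3</th><th>Q4</th><th>Total</th></tr>
  <tr><td>Team C</td><td>3</td><td>7</td><td>7</td><td>0</td><td>17</td></tr>
  <tr><td>Team D</td><td>10</td><td>0</td><td>3</td><td>7</td><td>20</td></tr>
</table>
"""

# read_html returns one DataFrame per <table> found in the document.
# Wrapping the string in StringIO avoids the literal-HTML deprecation
# warning in recent pandas versions.
tables = pd.read_html(StringIO(html))

# Stack all game tables into a single frame with a fresh 0..n-1 index.
scores = pd.concat(tables, ignore_index=True)
print(scores)
```

With Selenium, the same call would take `StringIO(driver.page_source)`; filtering the resulting list down to the score tables is left out here because it depends on the page's actual markup.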