Web scraping flashscore.com in Python
I am trying to scrape results from Flashscore. When I run the code below, I get "IndexError: list index out of range".
Is it possible to add a zero when set1 does not exist?
Can anyone help me?
from requests_html import AsyncHTMLSession
from collections import defaultdict
import pandas as pd

url = 'https://www.flashscore.com/table-tennis/others-men/tt-star-series/results/'

asession = AsyncHTMLSession()

async def get_scores():
    # Fetch the page and let the JavaScript render before reading the HTML
    r = await asession.get(url)
    await r.html.arender()
    return r

results = asession.run(get_scores)
results = results[0]

# Each find() returns an independent, page-wide list of elements
times = results.html.find("div.event__time")
home_players = results.html.find("div.event__participant.event__participant--home")
scores_home = results.html.find('div > div.event__score.event__score--home')
scores_home_set1 = results.html.find('div > div.event__part.event__part--home.event__part--1')

dict_res = defaultdict(list)

# The lists are indexed in parallel, so any list shorter than times raises the IndexError
for i in range(len(times)):
    dict_res['times'].append(times[i].text)
    dict_res['home_players'].append(home_players[i].text)
    dict_res['scores_home'].append(scores_home[i].text)
    dict_res['set1'].append(scores_home_set1[i].text)

df_res = pd.DataFrame(dict_res)
df_res.to_csv('ttstar.csv', index=False)
I got an answer in this post. It takes a different approach to reach the same result. Thank you very much, @AlbertoCastillo.
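from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.service import Service

# Selenium 4 syntax; older releases used executable_path=... and find_elements_by_class_name()
driver = webdriver.Firefox(service=Service('/usr/local/bin/geckodriver'))

url = 'https://www.flashscore.com/table-tennis/others-men/tt-star-series/results/'
driver.get(url)

# Every match row on the results page carries the class "event__match"
data = driver.find_elements(By.CLASS_NAME, 'event__match')

# One string per match, with the line breaks inside a row replaced by "|"
data_clean = [row.text.replace("\n", "|") for row in data]

driver.quit()
print(data_clean)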
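The row-by-row idea in this answer may also address the original question about padding a missing set1 with zero. Below is a minimal sketch, not a tested scraper. It assumes that each match sits in its own element (for example div.event__match, the class used above) and that the row object offers find(selector, first=True) returning None when nothing matches, as requests_html elements do. The helper names text_or_default and rows_to_records are hypothetical, and the shortened selectors are assumptions based on the class names in the question.

```python
def text_or_default(row, selector, default="0"):
    """Return the text of the first match for selector inside row, or default if none exists."""
    # With first=True, requests_html returns a single element or None instead of a list
    element = row.find(selector, first=True)
    return element.text if element is not None else default


def rows_to_records(rows):
    """Build one dict per match row so a missing set never shifts the other columns."""
    records = []
    for row in rows:
        records.append({
            "times": text_or_default(row, "div.event__time", default=""),
            "home_players": text_or_default(row, "div.event__participant--home", default=""),
            "scores_home": text_or_default(row, "div.event__score--home"),
            # Falls back to "0" when the match has no first-set element
            "set1": text_or_default(row, "div.event__part--home.event__part--1"),
        })
    return records
```

With the rendered response from the question, the rows could be collected with results.html.find("div.event__match") and the result passed to pd.DataFrame(rows_to_records(rows)) before calling to_csv. Searching inside each row, rather than indexing four page-wide lists in parallel, keeps every value attached to its own match.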