Python Splinter and Beautiful Soup - extracting <section>


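I am trying to use splinter together with Beautiful Soup to scrape all the data from a football match.

This is the match URL:

https://www.premierleague.com/match/46862

I then click the "Stats" tab and inspect it. This is the code I have so far for scraping the first section, "Head-to-Head", which I locate by its tag and class: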


from splinter import Browser
from bs4 import BeautifulSoup

# executable_path is assumed to be a dict defined elsewhere,
# of the form {'executable_path': <path to the chromedriver binary>}
browser = Browser("chrome", **executable_path, headless=False)
# load the page for a given match
match_url = 'https://www.premierleague.com/match/46862'
browser.visit(match_url)

match_tabs = 'li[class="matchCentreSquadLabelContainer"]' # <---- after inspection
# match_tabs is a CSS selector, so it needs find_by_css rather than find_by_tag
browser.find_by_css(match_tabs).click()
# parse the rendered HTML
match = BeautifulSoup(browser.html, 'html.parser')
# get all stats from the game
stats_tab = match.find_all('div', class_='mcTabs')

# HEAD_TO_HEAD
head_to_head = stats_tab[0].find_all('div', class_='statDetails')
# team on the left side
team_left = stats_tab[0].find('div', class_='headToHeadTable headToHeadTableLeft')
team_left_name = team_left.find('a', class_='team').text
team_left_stat_names = team_left.find_all('div', class_='stat')
team_left_stat_counts = team_left.find_all('div', class_='count')
# team on the right side
team_right = stats_tab[0].find('div', class_='headToHeadTable headToHeadTableRight')
team_right_name = team_right.find('a', class_='team').text
team_right_stat_names = team_right.find_all('div', class_='stat')
team_right_stat_counts = team_right.find_all('div', class_='count')
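
The left and right blocks above differ only in the class of the table they start from, so they could plausibly be folded into one helper. The following is a minimal sketch under stated assumptions: `SAMPLE_HTML` is hypothetical markup that only borrows the class names used in the question (the real page may nest `stat` and `count` differently), and `parse_team` is an illustrative name, not part of splinter or Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the class names come from the question,
# the structure and the values are made up for illustration.
SAMPLE_HTML = """
<div class="mcTabs">
  <div class="headToHeadTable headToHeadTableLeft">
    <a class="team">Home FC</a>
    <div class="stat">Total Wins</div><div class="count">3</div>
    <div class="stat">Home Wins</div><div class="count">2</div>
  </div>
  <div class="headToHeadTable headToHeadTableRight">
    <a class="team">Away FC</a>
    <div class="stat">Total Wins</div><div class="count">1</div>
    <div class="stat">Home Wins</div><div class="count">0</div>
  </div>
</div>
"""


def parse_team(table):
    """Return (team name, {stat name: count text}) for one head-to-head table."""
    name = table.find('a', class_='team').get_text(strip=True)
    stat_names = [d.get_text(strip=True) for d in table.find_all('div', class_='stat')]
    counts = [d.get_text(strip=True) for d in table.find_all('div', class_='count')]
    # Assumes the page lists stat names and counts in the same order.
    return name, dict(zip(stat_names, counts))


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
head_to_head = {}
for side in ('headToHeadTableLeft', 'headToHeadTableRight'):
    # Matching on a single class is enough to tell the two tables apart.
    table = soup.find('div', class_=side)
    team, stats = parse_team(table)
    head_to_head[team] = stats
```

With the live page, `soup` would instead be built from `browser.html` after the tab has been clicked.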

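But there is more to scrape, as the HTML screenshot below shows.

I also want to scrape the "Recent Meetings", "Form Guide", "Season So Far" and "Top Player Stats" sections shown above, each of which has its own "wrapper col-12" class. However, I have not managed to scrape inside each of them, because each one is set apart by something other than a class I could search for.

Is there an efficient way to scrape the text of all the elements in these sections?
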
print('HEAD TO HEAD')
print('------------')
print(team_left_name)
for stat in team_left_stat_names:
    print(stat.text)
for stat in team_left_stat_counts:
    print(stat.text)

print(team_right_name)
for stat in team_right_stat_names:
    print(stat.text)
for stat in team_right_stat_counts:
    print(stat.text)
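
One possible way to collect the text of every element in each section, without knowing the inner class names, is Beautiful Soup's `stripped_strings` generator. The sketch below is hedged accordingly: `SAMPLE_HTML` is hypothetical markup that reuses only the "wrapper col-12" class from the question, the `section` tag and the assumption that the first string in each wrapper is its heading are guesses, and `section_texts` is an illustrative name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the "wrapper col-12" class comes from the question.
SAMPLE_HTML = """
<div class="mcTabs">
  <section class="wrapper col-12">
    <h3>Recent Meetings</h3>
    <div><span>Home FC</span> <span>2-1</span> <span>Away FC</span></div>
  </section>
  <section class="wrapper col-12">
    <h3>Form Guide</h3>
    <ul><li>W</li><li>D</li><li>L</li></ul>
  </section>
</div>
"""


def section_texts(soup):
    """Map each wrapper's first text string to the list of its remaining strings."""
    result = {}
    # The CSS selector matches any element carrying both classes, whatever its tag.
    for wrapper in soup.select('.wrapper.col-12'):
        strings = list(wrapper.stripped_strings)
        if strings:
            # Assumption: the first string in a wrapper is its heading.
            result[strings[0]] = strings[1:]
    return result


sections = section_texts(BeautifulSoup(SAMPLE_HTML, 'html.parser'))
for title, texts in sections.items():
    print(title)
    print('-' * len(title))
    for text in texts:
        print(text)
```

This trades structure for generality: each section comes back as a flat list of strings, so pairing values with their labels would still need section-specific handling.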