Trying to get a Python program to print selected stats from a web scrape

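I'm new to Beautiful Soup, and I've been looking for a way to have the user enter the team they want and the week they are looking for, and then have the script print certain stats for that week. When I enter the team and the week number, nothing is printed and the program drops straight back to the command line.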

Here is my code:

import requests  
from bs4 import BeautifulSoup  

team = input('''What team are you looking for?
    crd - Arizona Cardinals
    atl - Atlanta Falcons
    rav - Baltimore Ravens
    buf - Buffalo Bills
    car - Carolina Panthers
    chi - Chicago Bears
    cin - Cincinnati Bengals
    cle - Cleveland Browns
    dal - Dallas Cowboys
    den - Denver Broncos
    det - Detroit Lions
    gnb - Green Bay Packers
    htx - Houston Texans
    clt - Indianapolis Colts
    jax - Jacksonville Jaguars
    kan - Kansas City Chiefs
    sdg - Los Angeles Chargers
    ram - Los Angeles Rams
    mia - Miami Dolphins
    min - Minnesota Vikings
    nwe - New England Patriots
    nor - New Orleans Saints
    nyg - New York Giants
    nyj - New York Jets
    rai - Oakland Raiders
    phi - Philadelphia Eagles
    pit - Pittsburgh Steelers
    sfo - San Fransisco 49ers
    sea - Seattle Seahawks
    tam - Tampa Bay Buccaneers
    oti - Tennessee Titans
    was - Washington Football Team

    Enter the 3 letter code for the team: ''')
week = int(input('What week are you looking for? '))
  
url = 'https://www.pro-football-reference.com/teams/' + team.lower() + '/2019.htm'  
page = requests.get(url)  
  
soup = BeautifulSoup(page.content, 'html.parser')     

week_num = soup.find_all('th', attrs={"data-stat": "week_num", "class": "right", "scope": "row"})
total_off = soup.find_all('td', attrs={"data-stat": "yards_off", "class": "right"})
total_def = soup.find_all('td', attrs={"data-stat": "yards_def", "class": "right"})
pass_yards_off = soup.find_all('td', attrs={"data-stat": "pass_yds_off", "class": "right"})
pass_yards_def = soup.find_all('td', attrs={"data-stat": "pass_yds_def", "class": "right"})
rush_yards_off = soup.find_all('td', attrs={"data-stat": "rush_yds_off", "class": "right"})
rush_yards_def = soup.find_all('td', attrs={"data-stat": "rush_yds_def", "class": "right"})
team_score = soup.find_all('td', attrs={"data-stat": "pts_off", "class": "right"})
opp_score = soup.find_all('td', attrs={"data-stat": "pts_def", "class": "right"})




for i in range(len(week_num)):
    if week in week_num:
        print('Week Number: ' + week_num[i].text.strip(),
            'Total Off: ' + total_off[i].text.strip(),
            'Total Def: ' + total_def[i].text.strip(),
            'Passing Yards Off: ' + pass_yards_off[i].text.strip(),
            'Passing Yards Def: ' + pass_yards_def[i].text.strip(),
            'Rushing Yards Off: ' + rush_yards_off[i].text.strip(),
            'Rushing Yards Def: ' + rush_yards_def[i].text.strip(), '\n')
Here is the output when I run it:

What team are you looking for?
    crd - Arizona Cardinals
    atl - Atlanta Falcons
    rav - Baltimore Ravens
    buf - Buffalo Bills
    car - Carolina Panthers
    chi - Chicago Bears
    cin - Cincinnati Bengals
    cle - Cleveland Browns
    dal - Dallas Cowboys
    den - Denver Broncos
    det - Detroit Lions
    gnb - Green Bay Packers
    htx - Houston Texans
    clt - Indianapolis Colts
    jax - Jacksonville Jaguars
    kan - Kansas City Chiefs
    sdg - Los Angeles Chargers
    ram - Los Angeles Rams
    mia - Miami Dolphins
    min - Minnesota Vikings
    nwe - New England Patriots
    nor - New Orleans Saints
    nyg - New York Giants
    nyj - New York Jets
    rai - Oakland Raiders
    phi - Philadelphia Eagles
    pit - Pittsburgh Steelers
    sfo - San Fransisco 49ers
    sea - Seattle Seahawks
    tam - Tampa Bay Buccaneers
    oti - Tennessee Titans
    was - Washington Football Team

    Enter the 3 letter code for the team: nwe
What week are you looking for? 6

You have to change the if condition in the for loop; the code below indexes the lists by week instead.
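To see why the original condition never fires, here is a minimal sketch on a small inline HTML fragment (the fragment and its numbers are made up for illustration, not taken from the real page). `find_all` returns a list of Tag objects, so testing an integer for membership in it is always false; comparing against the cells' text works. Note also that the original condition does not depend on `i`, so even if it were true it would print every row.

```python
from bs4 import BeautifulSoup

# Made-up fragment that mimics the shape of the schedule table rows.
html = """
<table>
  <tr><th data-stat="week_num">1</th><td data-stat="yards_off">300</td></tr>
  <tr><th data-stat="week_num">2</th><td data-stat="yards_off">350</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
week_num = soup.find_all("th", attrs={"data-stat": "week_num"})
week = 2

# An int is never equal to a Tag, so this membership test is False.
in_tags = week in week_num

# Comparing the cell text (a string) against str(week) finds the match.
texts = [th.text.strip() for th in week_num]
in_texts = str(week) in texts

print(in_tags, in_texts)
```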

import requests  
from bs4 import BeautifulSoup  

team = input('''What team are you looking for?
    crd - Arizona Cardinals
    atl - Atlanta Falcons
    rav - Baltimore Ravens
    buf - Buffalo Bills
    car - Carolina Panthers
    chi - Chicago Bears
    cin - Cincinnati Bengals
    cle - Cleveland Browns
    dal - Dallas Cowboys
    den - Denver Broncos
    det - Detroit Lions
    gnb - Green Bay Packers
    htx - Houston Texans
    clt - Indianapolis Colts
    jax - Jacksonville Jaguars
    kan - Kansas City Chiefs
    sdg - Los Angeles Chargers
    ram - Los Angeles Rams
    mia - Miami Dolphins
    min - Minnesota Vikings
    nwe - New England Patriots
    nor - New Orleans Saints
    nyg - New York Giants
    nyj - New York Jets
    rai - Oakland Raiders
    phi - Philadelphia Eagles
    pit - Pittsburgh Steelers
    sfo - San Fransisco 49ers
    sea - Seattle Seahawks
    tam - Tampa Bay Buccaneers
    oti - Tennessee Titans
    was - Washington Football Team

    Enter the 3 letter code for the team: ''')

week = int(input('What week are you looking for? '))
  
url = 'https://www.pro-football-reference.com/teams/' + team.lower() + '/2019.htm'  
page = requests.get(url)  

soup = BeautifulSoup(page.content, 'html.parser')     

week_num = soup.find_all('th', attrs={"data-stat": "week_num", "class": "right", "scope": "row"})
total_off = soup.find_all('td', attrs={"data-stat": "yards_off", "class": "right"})
total_def = soup.find_all('td', attrs={"data-stat": "yards_def", "class": "right"})
pass_yards_off = soup.find_all('td', attrs={"data-stat": "pass_yds_off", "class": "right"})
pass_yards_def = soup.find_all('td', attrs={"data-stat": "pass_yds_def", "class": "right"})
rush_yards_off = soup.find_all('td', attrs={"data-stat": "rush_yds_off", "class": "right"})
rush_yards_def = soup.find_all('td', attrs={"data-stat": "rush_yds_def", "class": "right"})
team_score = soup.find_all('td', attrs={"data-stat": "pts_off", "class": "right"})
opp_score = soup.find_all('td', attrs={"data-stat": "pts_def", "class": "right"})

try:
    print('Week Number: ' + week_num[week].text.strip(),
            'Total Off: ' + total_off[week].text.strip(),
            'Total Def: ' + total_def[week].text.strip(),
            'Passing Yards Off: ' + pass_yards_off[week].text.strip(),
            'Passing Yards Def: ' + pass_yards_def[week].text.strip(),
            'Rushing Yards Off: ' + rush_yards_off[week].text.strip(),
            'Rushing Yards Def: ' + rush_yards_def[week].text.strip(), '\n')
except Exception as e:
    print(e)
Output for the inputs crd and 2:

Week Number: 3 Total Off: 248 Total Def: 413 Passing Yards Off: 127 Passing Yards Def: 240 Rushing Yards Off: 121 Rushing Yards Def: 173
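The output above shows week 3 for an input of 2 because list indexes start at 0, so index 2 is the third row. One way around this, sketched here under assumptions, is to look the row up by the text of its week cell and then read the other cells from that same row, instead of relying on position in parallel lists. `stats_for_week` and `SAMPLE_HTML` are hypothetical names; the week 3 numbers are copied from the output above and the other rows are made up.

```python
from bs4 import BeautifulSoup

# Small stand-in for the real page; only the week 3 values come from the
# output shown above, the rest are invented for the example.
SAMPLE_HTML = """
<table id="games">
  <tr><th data-stat="week_num">1</th>
      <td data-stat="yards_off">387</td><td data-stat="yards_def">477</td></tr>
  <tr><th data-stat="week_num">2</th>
      <td data-stat="yards_off">349</td><td data-stat="yards_def">440</td></tr>
  <tr><th data-stat="week_num">3</th>
      <td data-stat="yards_off">248</td><td data-stat="yards_def">413</td></tr>
</table>
"""


def stats_for_week(soup, week):
    """Return {data-stat: text} for the row whose week cell equals `week`."""
    for th in soup.find_all("th", attrs={"data-stat": "week_num"}):
        if th.text.strip() != str(week):
            continue
        row = th.find_parent("tr")
        # Read every stat from the same row, so the values cannot drift
        # out of alignment with the week number.
        return {td.get("data-stat"): td.text.strip() for td in row.find_all("td")}
    return None  # no row for that week


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print(stats_for_week(soup, 3))
```

With the real page, the same function could be called on the `soup` built from `page.content`; a week with no matching row returns `None` rather than raising.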

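We can actually build the team selection dynamically from the table. You can also use pandas to fetch the table and then filter by week number, instead of iterating.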

*Note: you will need to pip install choice
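If adding that dependency is not an option, a numbered menu can be sketched with the standard library alone. This is a swapped-in alternative, not what the code below uses; `format_menu` and `resolve_choice` are hypothetical helper names, and the team list is a short made-up sample.

```python
def format_menu(options):
    """Number the options from 1 so they can be shown in a prompt."""
    return "\n".join(f"{i}: {name}" for i, name in enumerate(options, start=1))


def resolve_choice(options, raw):
    """Map the user's 1-based number to an option, or None if invalid."""
    try:
        index = int(raw.strip()) - 1
    except ValueError:
        return None  # not a number
    if 0 <= index < len(options):
        return options[index]
    return None  # number out of range


teams = ["Arizona Cardinals", "Atlanta Falcons", "Baltimore Ravens"]
print(format_menu(teams))
print(resolve_choice(teams, "2"))
```

In a script, the menu text would be printed and the raw answer read with `input()`; with the dictionary built below, `list(teams_dict.keys())` would play the role of `teams`.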

import pandas as pd
import requests
from bs4 import BeautifulSoup
import choice

url= 'https://www.pro-football-reference.com/teams/'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
teams = soup.find_all('th')

# Get the links to the teams in the table
teams_dict = {}
for each in teams:
    if each.find('a'):
        teams_dict[each.text] = each.find('a')['href']

   
team_choice = choice.Menu(teams_dict.keys()).ask()
week = input('What week are you looking for? ')

url = 'https://www.pro-football-reference.com{team_url}2019.htm'.format(team_url=teams_dict[team_choice])
df = pd.read_html(url,attrs={'id':'games'})[0]

new_col_names = [col[-1] if 'Unnamed' in col[0] else '_'.join(col) for col in df.columns]

# for loop equivalent to the list comprehension above
#new_col_names = []
#for col in df.columns:
#    if 'Unnamed' in col[0]:
#        new_col_names.append(col[-1])
#    else:
#        new_col_names.append('_'.join(col))


df.columns = new_col_names
df['Week'] = df['Week'].astype(str)
week_stats = df[df['Week']==week]

cols = ['Week','Offense_TotYd','Defense_TotYd','Offense_PassY','Defense_PassY','Offense_RushY','Defense_RushY']
print (week_stats[cols].to_string())
Output for NE, week 6:

  Week  Offense_TotYd  Defense_TotYd  Offense_PassY  Defense_PassY  Offense_RushY  Defense_RushY
5    6          427.0          213.0          313.0          161.0          114.0           52.0
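The column flattening and the week filter can be tried without touching the network. The sketch below builds a tiny DataFrame with two-level column headers by hand, assuming the real table comes back from `read_html` in a similar shape; the week 6 numbers are copied from the output above and the week 5 row is made up.

```python
import pandas as pd

# Two-level headers: ungrouped columns carry an "Unnamed: ..." top level,
# which is the case the 'Unnamed' check in the comprehension handles.
columns = pd.MultiIndex.from_tuples([
    ("Unnamed: 0_level_0", "Week"),
    ("Offense", "TotYd"),
    ("Defense", "TotYd"),
])
df = pd.DataFrame(
    [[5, 400.0, 300.0],    # made-up row
     [6, 427.0, 213.0]],   # values from the week 6 output above
    columns=columns,
)

# Keep only the lower level for ungrouped columns, otherwise join the
# two levels with an underscore.
df.columns = [col[-1] if "Unnamed" in col[0] else "_".join(col) for col in df.columns]

# input() returns a string, so compare against the string form of Week.
week = "6"
week_stats = df[df["Week"].astype(str) == week]
print(week_stats.to_string())
```

Because the filter matches on the week value rather than on row position, the index shown on the left (5 in the output above) does not have to line up with the week number.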

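Awesome! This seems to work. I noticed it printed the wrong week: if you enter week 2 it gives week 3, so I had to use [week-1] to get the correct week. Thanks for the quick answer. I need to read up more on pandas. This is for a fantasy football league that a friend of mine updates manually, and I wanted to make it easier.

Shoot me an email: jason.schvach@gmail.com. I wrote a data science paper on fantasy football/FanDuel (still working on it), and I've also automated all of this data collection, so maybe I can help you out.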