Python web scraper not getting certain values


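My web scraper fails to get the "odds" values, and I can't tell what is going wrong. For each piece of information I use try/except to check whether the element is available, but I still can't see why the odds values are not being picked up. Thanks for your help.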

import pandas as pd
import requests
from bs4 import BeautifulSoup
import re

url = 'https://www.ncaagamesim.com/college-basketball-predictions.asp'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

table = soup.find('table')

# Get column names
headers = table.find_all('th')
cols = [x.text for x in headers]

# Get all rows in table body
table_rows = table.find_all('tr')

rows = []
# Grab the text of each td, and put into a rows list
for each in table_rows[1:]:
    odd_avail = True
    data = each.find_all('td')
    time = data[0].text.strip()

    # Get matchup and odds
    try:
        matchup, odds = data[1].text.strip().split('\xa0')
        odd_margin = float(odds.split('by')[-1].strip())
    except:
        matchup = data[1].text.strip()
        odd_margin = '-'
        odd_avail = False

    # Get favored team
    try:
        odd_team_win = data[1].find_all('img')[-1]['title']
    except:
        odd_team_win = '-'
        odd_avail = False

    # Get simulation winner
    try:
        sim_team_win = data[2].find('img')['title']
    except:
        sim_team_win = '-'
        odd_avail = False

    awayTeam = matchup.split('@')[0].strip()
    homeTeam = matchup.split('@')[1].strip()

    # Get simulation margin
    try:
        sim_margin = float(re.findall(r"\d+\.\d+", data[2].text)[-1])
    except:
        sim_margin = '-'
        odd_avail = False

    # If all variables available, determine odds, simulation margin points, and optimal bet
    if odd_avail:
        if odd_team_win == sim_team_win:
            diff = abs(sim_margin - odd_margin)
            if sim_margin > odd_margin:
                bet = odd_team_win
            else:
                if odd_team_win == homeTeam:
                    bet = awayTeam
                else:
                    bet = homeTeam
        else:
            diff = odd_margin + sim_margin
            bet = sim_team_win
    else:
        diff = -1
        bet = '-'

    # Create table
    row = {cols[0]: time, 'Matchup': matchup, 'Odds Winner': odd_team_win, 'Odds': odd_margin,
           'Simulation Winner': sim_team_win, 'Simulation Margin': sim_margin, 'Diff': diff, 'Bet' : bet}
    rows.append(row)

df = pd.DataFrame(rows)
df = df.sort_values(by = ['Diff'], ascending = False)
print(df.to_string())
# df.to_csv('odds.csv', index=False)

When I run this code, everything runs fine and I get all the other values, but every odds value in the table is "-".

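I added a few things to the code to account for the following:

  • Odds that are even (as opposed to no odds being listed at all).
  • A team that has no logo, in which case the team name still needs to be filled in.
  • As for the odds not showing up: check whether the csv file exists. If it does, it may just be a display preference you need to change in PyCharm (it may simply be cutting off some of the strings).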
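
The first point can be isolated into a small helper, which makes the "Even" case easy to check without hitting the site. This is only a sketch: the function name `parse_odds_margin` and the sample strings are assumptions about what the odds text looks like after splitting on `\xa0`, not something taken from the page itself.

```python
def parse_odds_margin(odds_text):
    """Return the point spread from text such as 'UNC by 3.5'.

    Hypothetical helper: returns 0.0 for even odds and None when
    no spread can be read from the text.
    """
    text = odds_text.strip()
    try:
        # Normal case: the number after the last 'by' is the spread.
        return float(text.split('by')[-1].strip())
    except ValueError:
        # No number found: treat a trailing 'Even' as a spread of zero.
        return 0.0 if text.endswith('Even') else None


# Sample strings are made up for illustration.
for sample in ('UNC by 3.5', 'Even', ''):
    print(repr(sample), '->', parse_odds_margin(sample))
```

Returning `None` instead of `'-'` keeps the column numeric until display time, which may make later comparisons simpler.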


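I ran your code on Colab and everything seems to work, except that row 14 is "-".

For some reason it doesn't show up in PyCharm or in my terminal when I run the program. Do you know why that might be? Do I need to install something?

I'm not sure. You can check the exception message for more information.

@pumpkinchugger Have you looked at the csv file?

So what about when the odds winner matches the simulation winner? Or do you mean you want the two teams separated? I made a small change to see if it's what you meant.

Ah, yes. Just split the matchup column. That's actually what I did: I split the matchup into an away-team column and a home-team column.

No worries. I enjoy working with sports data.
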
    import pandas as pd
    import requests
    from bs4 import BeautifulSoup
    import re
    
    url = 'https://www.ncaagamesim.com/college-basketball-predictions.asp'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    
    table = soup.find('table')
    
    # Get column names
    headers = table.find_all('th')
    cols = [x.text for x in headers]
    
    # Get all rows in table body
    table_rows = table.find_all('tr')
    
    rows = []
    # Grab the text of each td, and put into a rows list
    for each in table_rows[1:]:
        odd_avail = True
        data = each.find_all('td')
        time = data[0].text.strip()
    
        # Get matchup and odds
        try:
            matchup, odds = data[1].text.strip().split('\xa0')
            odd_margin = float(odds.split('by')[-1].strip())
        except:
            matchup = data[1].text.strip()
            if 'Even' in matchup:
                matchup, odds = data[1].text.strip().split('\xa0')
                odd_margin = 0
            else:
                odd_margin = '-'
                odd_avail = False
                
        awayTeam = matchup.split('@')[0].strip()
        homeTeam = matchup.split('@')[1].strip()
    
        # Get favored team
        try:
            odd_team_win = data[1].find_all('img')[-1]['title']
        except:
            odd_team_win = '-'
            odd_avail = False
    
        # Get simulation winner
        try:
            sim_team_win = data[2].find('img')['title']
        except:
            if 'wins' in data[2].text:
                sim_team_win = data[2].text.split('wins')[0].strip()
            else:
                sim_team_win = '-'
                odd_avail = False
    
        # Get simulation margin
        try:
            sim_margin = float(re.findall(r"\d+\.\d+", data[2].text)[-1])
        except:
            sim_margin = '-'
            odd_avail = False
    
        # If all variables available, determine odds and simulation margin points
        if odd_avail:
            if odd_team_win == sim_team_win:
                diff = abs(sim_margin - odd_margin)
            else:
                diff = odd_margin + sim_margin
        else:
            diff = '-'
    
        # Create table
        row = {cols[0]: time, 'Away Team': awayTeam, 'Home Team':homeTeam, 'Odds Winner': odd_team_win, 'Odds': odd_margin,
               'Simulation Winner': sim_team_win, 'Simulation Margin': sim_margin, 'Diff': diff}
        rows.append(row)
    
    df = pd.DataFrame(rows)
    print(df.to_string())
    # df.to_csv('odds.csv', index=False)
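
As suggested in the comments, the exception message is the quickest way to see why a row falls back to "-". A bare `except:` discards that message, so one option is to catch the exception by name and report it together with the raw cell text. The sketch below uses a hypothetical helper, `parse_matchup_cell`, and made-up sample strings in place of `data[1].text`; the real cell contents may differ.

```python
def parse_matchup_cell(cell_text):
    """Split a cell into (matchup, odds_margin, error).

    error is None on success, otherwise a message describing what failed.
    """
    text = cell_text.strip()
    try:
        matchup, odds = text.split('\xa0')
        return matchup, float(odds.split('by')[-1].strip()), None
    except ValueError as exc:
        # Both a failed unpack and a failed float() raise ValueError.
        return text, '-', f'{exc} (cell text: {text!r})'


# First sample uses a non-breaking space, second an ordinary space.
for sample in ('Duke @ UNC\xa0UNC by 3.5', 'Duke @ UNC UNC by 3.5'):
    matchup, margin, error = parse_matchup_cell(sample)
    print(matchup, margin, error)
```

If the message points to an unpacking error, printing `repr(data[1].text)` for a failing row should show which separator the page actually uses between the matchup and the odds.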