Python: Scraping basketball data with Beautiful Soup


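I only want to return the names of these 3 players (in the URL). The current code returns their names, teams, and basketball association. Is there something I can specify in the code so that only the names are returned?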

I'm scraping the data from the following page:

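You're almost there, but first let me mention this, since you seem to be new to Python: you shouldn't name a variable str, because it shadows the built-in str class, so I changed that in the code below. The important change is that I replaced your .findAll('a') with .findAll('td', {'class': 'left active'}): inspecting the element shows that all the player names sit in a <td> tag with the class left active. I also renamed the iteration variable to element, singular rather than plural, so it makes more sense semantically. Also note that the code you posted was not indented correctly, but I assume that was just a formatting problem from pasting it here.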

import requests
from bs4 import BeautifulSoup

def bball_spider(url):
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, "html.parser")

    # Player names: each one sits in a <td class="left active"> cell of the stats table
    for element in soup.find('table',{'id' : 'stats'}).findAll('td',{'class':'left active'}):
        name = element.string
        print(name)

url = '''https://www.basketball-reference.com/play-index/psl_finder.cgi?request=1&match=single&type=totals&per_minute_base=36&per_poss_base=100&season_start=1&season_end=-1&lg_id=NBA&age_min=0&age_max=99&is_playoffs=N&height_min=0&height_max=99&year_min=2017&year_max=2017&birth_country_is=Y&as_comp=gt&as_val=0&pos_is_g=Y&pos_is_gf=Y&pos_is_f=Y&pos_is_fg=Y&pos_is_fc=Y&pos_is_c=Y&pos_is_cf=Y&c1stat=fg3_pct&c1comp=gt&c1val=40&c2stat=fg3a&c2comp=gt&c2val=164&c3stat=dbpm&c3comp=gt&c3val=0&order_by=ws'''
bball_spider(url)

This prints:

Chris Paul
Otto Porter
Joe Ingles
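
To see why the narrower selector matters without hitting the live site, here is a minimal offline sketch. The HTML is a made-up, heavily simplified stand-in for the real stats table (the placeholder links and the three-column layout are assumptions), but it shows the same effect: searching for every a tag picks up names, teams, and leagues, while searching for td cells with the class left active picks up only the names.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; the real page has many more columns.
SAMPLE_HTML = """
<table id="stats">
  <tr>
    <td class="left active"><a href="#">Chris Paul</a></td>
    <td class="left"><a href="#">LAC</a></td>
    <td class="left"><a href="#">NBA</a></td>
  </tr>
  <tr>
    <td class="left active"><a href="#">Otto Porter</a></td>
    <td class="left"><a href="#">WAS</a></td>
    <td class="left"><a href="#">NBA</a></td>
  </tr>
  <tr>
    <td class="left active"><a href="#">Joe Ingles</a></td>
    <td class="left"><a href="#">UTA</a></td>
    <td class="left"><a href="#">NBA</a></td>
  </tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", {"id": "stats"})

# Every link in the table: names, teams and leagues all come back.
all_links = [a.string for a in table.findAll("a")]

# Only the cells whose class attribute is exactly "left active": names only.
name_cells = [td.string for td in table.findAll("td", {"class": "left active"})]

print(all_links)
print(name_cells)
```

Because each name cell contains a single link with a single text node, .string on the td returns the player name directly.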

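
The advice about not naming a variable str can also be shown with a small standalone sketch. The function name and the example value are made up for illustration; the point is only that once the name str is rebound to a string, the built-in type can no longer be called through that name in the same scope.

```python
def shadowed_str_demo():
    # Hypothetical example: this local variable hides the built-in str type.
    str = "https://example.com"
    try:
        # This now tries to call a string object, not the str type.
        str(42)
    except TypeError as exc:
        return type(exc).__name__
    return "no error"

result = shadowed_str_demo()
print(result)
```

Picking a more descriptive name, such as url, avoids the problem entirely.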