Python: using a for loop with an index, parsing into separate lists [python, beautifulsoup]


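I have a problem that I have already found an answer to, but the way it is coded seems a bit wasteful and resource-heavy. I would like to see whether there is an approach that I think should work conceptually but that I cannot code correctly.

The issue is with the following code: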

from bs4 import BeautifulSoup as bsoup
import requests as reqs

pagetoparse = 'https://fbref.com/en/squads/986a26c1/Northampton-Town'
page = reqs.get(pagetoparse)
status = page.status_code
parsepage = bsoup(page.content, 'html.parser')

playerlist = []
positionlist = []
agelist = []

# Create playerlist - unique instances
findplayers = parsepage.find_all('th',attrs={"data-stat":"player"})
for player in findplayers:
    addplayer = player.find_next('a').get_text()
    if addplayer not in playerlist and addplayer != 'coverage note':
        playerlist.append(addplayer)

# Create positionlist - non-unique
findinfo = parsepage.find_all('td',attrs={"data-stat":'position'})
for position in findinfo:
    addposition = position.get_text()
    if addposition != 'coverage note':
        positionlist.append(addposition)

# Create agelist - non-unique
findinfo = parsepage.find_all('td',attrs={"data-stat":'age'})
for age in findinfo:
    addage = age.get_text()
    if addage != 'coverage note':
        agelist.append(addage)

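This is what I currently do, and it works, but I would prefer to run all the data-stat options from an index:

toparse = ['player', 'position', 'age'] etc.

Where I get stuck is appending the results for each index member to its own list. I can construct a for loop over the options, but the results all end up in the same list. As the loop moves on to the next data-stat value, can you help make the target list change to the next one as well? That is, the code would switch from playerlist to positionlist and so on.

I have managed to do this by running the code separately for each option, but that lacks flexibility, and I would say it gets a bit too long to manage.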

Use the find_next function to get the next element. The code and its output (O/p) are given below.




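You could make a dictionary of {option: corresponding_list}. The problem is that if the lists are independent variables, there is no point in putting only the "options" in a list. Keeping the options in a list makes it easy to change the set of options, but since you would still have to maintain the separate lists for them, that advantage is lost. In my opinion, either do both or leave it as is.
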
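The dictionary idea from the comment above can be sketched as follows. This is only a minimal illustration, not the answerer's code: `SAMPLE_HTML` is a hypothetical stand-in for the fetched page (the real fbref markup may differ), and `collect_stats` is a helper name introduced here. Each option in `toparse` gets its own list inside one dictionary, so the loop switches target lists automatically.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<table>
  <tr><th data-stat="player"><a href="#">David Cornell</a></th>
      <td data-stat="position">GK</td><td data-stat="age">27</td></tr>
  <tr><th data-stat="player"><a href="#">Aaron Pierre</a></th>
      <td data-stat="position">DF</td><td data-stat="age">25</td></tr>
  <tr><th data-stat="player"><a href="#">David Cornell</a></th>
      <td data-stat="position">GK</td><td data-stat="age">27</td></tr>
</table>
"""


def collect_stats(soup, options):
    """Return {option: [cell texts]} for every cell whose data-stat is in options."""
    results = {option: [] for option in options}
    for option in options:
        # Match on the data-stat attribute only, so both <th> and <td> cells are found.
        for cell in soup.find_all(attrs={"data-stat": option}):
            text = cell.get_text(strip=True)
            if text != 'coverage note':
                results[option].append(text)
    return results


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
toparse = ['player', 'position', 'age']
stats = collect_stats(soup, toparse)

print(stats['player'])
print(stats['position'])
print(stats['age'])
```

Unlike the question's code, this sketch does not drop repeated player names, so the three lists stay aligned by index. Adding or removing an option then only means editing `toparse`.
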
from bs4 import BeautifulSoup as bsoup
import requests as reqs

pagetoparse = 'https://fbref.com/en/squads/986a26c1/Northampton-Town'
page = reqs.get(pagetoparse)
parsepage = bsoup(page.content, 'html.parser')
playerlist = []
findplayers = parsepage.find_all('th',attrs={"data-stat":"player"})

for player in findplayers:
    playerdict = {}
    addplayer = player.find_next('a').get_text()
    # playerlist holds dicts, so "addplayer not in playerlist" is always true
    # and repeated names are not filtered out here
    if addplayer not in playerlist and addplayer != 'coverage note':
        playerdict['player'] = addplayer
        position = player.find_next('td')

        # Walk forward through the following <td> cells to the position cell
        while True:
            position = position.find_next('td')
            if position.has_attr("data-stat") and position['data-stat'] == 'position':
                playerdict['position'] = position.get_text()
                break

        # Keep walking from the position cell until the age cell is reached
        while True:
            position = position.find_next('td')
            if position.has_attr("data-stat") and position['data-stat'] == 'age':
                playerdict['age'] = position.get_text()
                break

        playerlist.append(playerdict)

print(playerlist)
[{'player':'David Cornell','position':'GK','age':'27'},{'player':'David Cornell','position':'GK','age':'27'},
{'player':'Aaron Pierre','position':'DF','age':'25'},{'player':'Sam Hoskins','position':'FW','age':'25'},
{'player':'David Buchanan','position':'DF','age':'32'},{'player':'Sam Foley','position':'MF','age':'31'},
{'player':'Ash Taylor','position':'MF,DF','age':'27'},{'player':'Jordan Turnbull','position':'DF','age':'23'},
{'player':'Andy Williams','position':'MF,FW','age':'31'},{'player':"John-Joe O'Toole",'position':'MF','age':'29'},
{'player':'Shay Facey','position':'DF','age':'23'},{'player':'Shaun McWilliams','position':'MF','age':'19'},
{'player':'Kevin van Veen','position':'FW','age':'27'},{'player':'Matt Crooks','position':'MF,DF','age':'24'},
{'player':'Daniel Powell','position':'MF,FW','age':'27'},{'player':'Jack Bridge','position':'FW','age':'22'},
{'player':'Charlie Goode','position':'DF','age':'22'},{'player':'Hakeem Odoffin','position':'DF','age':'20'},
{'player':'Dean Bowditch','position':'FW','age':'32'},{'player':'Junior Morias','position':'FW','age':'23'},
{'player':'Jay Williams','position':'DF','age':''},{'player':'Joe Powell','position':'MF','age':'19'},
{'player':'Billy Waters','position':'MF,FW','age':'23'},{'player':'Marvin Sordell','position':'FW','age':'27'},
{'player':'Timi Elšnik','position':'MF','age':'20'},{'player':'Leon Barnett','position':'DF','age':'32'},
{'player':'Scott Pollock','position':'MF','age':''},{'player':'George Cox','position':'DF','age':''},
{'player':'Ryan Hughes','position':'MF','age':''},{'player':'Morgan Roberts','position':'','age':''},
{'player':'David Cornell','position':'GK','age':'27'},{'player':'David Cornell','position':'GK','age':'27'}]
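
The output above contains repeated entries (David Cornell appears several times), and the question asked for separate lists. One possible way to post-process the list of dicts is sketched below. It is an illustration only: `rows` is a short excerpt of the printed output, `unique_rows` and `to_columns` are helper names introduced here, and deduplicating by name assumes that no two players share a name.

```python
# Short excerpt of the printed output, used as sample input.
rows = [
    {'player': 'David Cornell', 'position': 'GK', 'age': '27'},
    {'player': 'David Cornell', 'position': 'GK', 'age': '27'},
    {'player': 'Aaron Pierre', 'position': 'DF', 'age': '25'},
    {'player': 'Jay Williams', 'position': 'DF', 'age': ''},
]


def unique_rows(rows, key='player'):
    """Keep the first row for each value of key, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def to_columns(rows, options):
    """Turn a list of row dicts into {option: [values]} with aligned indexes."""
    return {option: [row[option] for row in rows] for option in options}


deduped = unique_rows(rows)
columns = to_columns(deduped, ['player', 'position', 'age'])
playerlist = columns['player']
positionlist = columns['position']
agelist = columns['age']

print(playerlist)
print(positionlist)
print(agelist)
```

Because the rows are deduplicated before being split, index i in each list refers to the same player.
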