Python dictionary is not created the way it should be; it gets overwritten

I have the code below. My goal is to scrape a website and then build a dictionary from the results.

from bs4 import BeautifulSoup as bs
from urllib.request import urlopen, urlparse, Request

attr = {}
def get_data():
    global attr
    page = 0

    attr_numbers_parent = []
    attr_dates_parent = []
    while page < 2:

        LOTTO_TYPES = ['mega-millions','powerball']
        BASE_URL = ('https://nylottery.ny.gov')
        BASE_URL_END = ('past-winning-numbers')
        PAGE = ('?page={}'.format(page))
        for lotto in LOTTO_TYPES:
            attr_numbers = []
            attr_dates = []
            fullpath = '{}/{}/{}{}'.format(BASE_URL,lotto,BASE_URL_END,PAGE)
            r = Request(fullpath, headers={'User-agent':'Mozilla/5.0'})
            html = urlopen(r)
            soup = bs(html, 'html.parser')

            # get date
            attr_date = soup.find('div', {'view-content'}).find_all('p',{'class':'result-date'})
            for item in attr_date:
                attr_dates.append(item.get_text().strip())
            attr_dates_parent.append(attr_dates)

            # get numbers
            attr_number = soup.find('div',{'class':'view-content'}).find_all('span', {'class':'numbers'})
            for item in attr_number:
                attr_numbers.append(item.get_text().strip())
            attr_numbers_parent.append(attr_numbers)
        page += 1
    attr[lotto] = [{'dates':attr_dates_parent, 'numbers': attr_numbers_parent}]
    return attr

x = get_data()
print(x)

The code returns the dictionary attr with only the 'powerball' key; the 'mega-millions' key is missing. Shouldn't the loop create both keys?

Am I missing something?

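The only place you ever put anything into the dictionary is at the very end: attr[lotto] = ... That line is not inside a loop, so this code cannot put more than one item into the dictionary.

@kaya3 When that line is inside the loop, the data gets duplicated.

Then fixing the code is not as simple as moving that line from one place to another. But it is clear that this code cannot return a dictionary with two items (unless the dictionary already held an item under a different key, since it is global, which it really should not be).

@gmu The reason the "data gets duplicated" when you move the attr[lotto] = XXX line into the loop is that you keep appending to the same attr_numbers_parent and attr_dates_parent lists. When it comes to debugging, there is a reason we ask for an MCVE (a minimal, complete, verifiable example): reducing the problem to the simplest code that reproduces it helps you understand what the problem is. Here is your mcve: - now try to fully understand the execution flow (a few prints or a step debugger to trace the execution may help) and fix it to get the expected result. As a side note: "parallel lists" are an antipattern. Instead of one list of dates and one list of numbers, you should try to build a list of (date, numbers) tuples or of {date: xxx, number: yyy} dicts.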
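The points made in the comments can be sketched without any network access. This is only an illustration under stated assumptions, not the asker's final code: fetch_results is a hypothetical stand-in for the Request/urlopen/BeautifulSoup step and returns placeholder values, and get_data_buggy and get_data are names chosen here. The first function reproduces the reported behaviour (one assignment after the loops, shared parent list); the second assigns once per lottery, starts a fresh list for each key, returns a local dict instead of mutating a global, and stores each draw as a single dict rather than in parallel lists.

```python
LOTTO_TYPES = ['mega-millions', 'powerball']


def fetch_results(lotto, page):
    """Hypothetical stand-in for the scraping step.

    Returns (date, numbers) pairs; the values are placeholders, not real draws.
    """
    return [('{}-p{}-d{}'.format(lotto, page, i), 'numbers-{}'.format(i))
            for i in range(2)]


def get_data_buggy(pages=2):
    """Reduced version of the original structure: shows why one key is lost."""
    attr = {}
    dates_parent = []  # shared by every lottery and every page
    for page in range(pages):
        for lotto in LOTTO_TYPES:
            dates_parent.append([d for d, _ in fetch_results(lotto, page)])
    # Runs exactly once, after both loops; `lotto` still holds the last value.
    attr[lotto] = [{'dates': dates_parent}]
    return attr


def get_data(pages=2):
    """One possible restructuring: one key per lottery, one dict per draw."""
    attr = {}
    for lotto in LOTTO_TYPES:
        draws = []  # fresh list for each lottery, so nothing is shared
        for page in range(pages):
            for date, numbers in fetch_results(lotto, page):
                draws.append({'date': date, 'numbers': numbers})
        attr[lotto] = draws  # assigned inside the loop over lotteries
    return attr


print(sorted(get_data_buggy()))  # ['powerball']
print(sorted(get_data()))        # ['mega-millions', 'powerball']
```

Putting the lottery loop on the outside is a design choice made for this sketch: it keeps all pages of one lottery together, so the per-key list never needs to be reset or merged.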