
Python web scraping: unable to retrieve values from a dictionary after scraping

python dictionary web-scraping


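I'm hoping someone here can answer what I believe is a simple question. I'm a complete newbie and have been trying to build an image web scraper for the Archdaily website. Here is my code so far, after a lot of debugging: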

#### - Webscraping 0.1 alpha -
#### - Archdaily - 

import requests
from bs4 import BeautifulSoup

# Enter the URL of the webpage you want to download the images from
page = 'https://www.archdaily.com/63267/ad-classics-house-vi-peter-eisenman/5037e0ec28ba0d599b000190-ad-classics-house-vi-peter-eisenman-image'

# Returns the webpage source code under page_doc
result = requests.get(page)
page_doc = result.content

# Returns the source code as BeautifulSoup object, as nested data structure
soup = BeautifulSoup(page_doc, 'html.parser')
img = soup.find('div', class_='afd-gal-items')
img_list = img.attrs['data-images']
for k, v in img_list():
    if k == 'url_large':
        print(v)

The lines in question are:

img = soup.find('div', class_='afd-gal-items')
img_list = img.attrs['data-images']

These lines try to isolate the data-images attribute.

As you can see, or maybe I'm completely wrong here, my attempts to read the url_large values from this final list of dictionaries end in a TypeError, shown here:

Traceback (most recent call last):
  File "D:/Python/Programs/Webscraper/Webscraping v0.2alpha.py", line 23, in <module>
    for k, v in img_list():
TypeError: 'str' object is not callable

I also tried converting the value to JSON, but that was a bust, as shown here:

Traceback (most recent call last):
  File "D:/Python/Programs/Webscraper/Webscraping v0.2alpha.py", line 29, in <module>
    print(jsonData['url_large'])
TypeError: list indices must be integers or slices, not str


I'm missing a step in converting these string values, but I'm not sure where to change them. I'm hoping someone can help me resolve this. Thanks!

It's all about types. img_list is not actually a list but a string. You try to call it with img_list(), which causes the error.
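To make the first error concrete, here is a minimal sketch that reproduces it without touching the network. The string assigned to img_list is a hypothetical stand-in for the real attribute value, which is much longer, and example.com is a placeholder URL.

```python
# Hypothetical stand-in for img.attrs['data-images']. An HTML attribute
# value is always text, so img_list is a str here as well.
img_list = '[{"url_large": "https://example.com/a.jpg"}]'

print(type(img_list).__name__)  # the type name of the attribute value
print(callable(img_list))       # strings cannot be called like functions

try:
    img_list()  # same mistake as "for k, v in img_list():" in the question
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Checking type(...) on an intermediate value like this is usually the quickest way to see why a loop or lookup fails.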

Your idea of using json.loads to convert it is right. The mistake here is quite simple: jsonData is a list, not a dictionary. You have multiple images.
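The second error can be reproduced the same way. This is a sketch under the assumption that the attribute holds a JSON array of objects, as the traceback suggests; the sample string and its URLs are made up for illustration.

```python
import json

# Hypothetical sample shaped like the data-images value: a JSON array
# containing one object per image.
raw = ('[{"url_large": "https://example.com/a.jpg"},'
       ' {"url_large": "https://example.com/b.jpg"}]')

jsonData = json.loads(raw)
print(type(jsonData).__name__)     # outer container type
print(type(jsonData[0]).__name__)  # type of each element

# Indexing by position first, then by key, works.
first_url = jsonData[0]['url_large']
print(first_url)

try:
    jsonData['url_large']  # a list cannot be indexed with a string key
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The outer value is a list, so a key lookup only makes sense on one of its elements.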

You can loop through the list. Each item in the list is a dictionary, and each of those dictionaries has a url_large key:

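import json
# The attribute value is a JSON string; json.loads turns it into a list of dicts.
images_json = img.attrs['data-images']
for image_properties in json.loads(images_json):
    print(image_properties['url_large'])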
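If you want to collect the URLs rather than print them, the loop can be wrapped in a small helper. This is only a sketch: extract_large_urls is a hypothetical name, the sample string is invented, and skipping entries without a url_large key is a design choice of this example rather than something the page is known to require.

```python
import json


def extract_large_urls(data_images):
    """Return every 'url_large' value found in a data-images JSON string."""
    images = json.loads(data_images)  # list of dicts, one per image
    # Skip entries that lack the key instead of raising KeyError.
    return [image['url_large'] for image in images if 'url_large' in image]


# Hypothetical sample; "caption" is an invented key used only to show
# that entries without 'url_large' are skipped.
sample = ('[{"url_large": "https://example.com/a.jpg"},'
          ' {"caption": "no large image here"}]')

urls = extract_large_urls(sample)
print(urls)
```

In the scraper itself, the call would be extract_large_urls(img.attrs['data-images']), after which each URL can be passed to requests.get to download the image.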