
Web scraping - printing values together - Python

python web-scraping printing


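I'm trying to scrape CS:GO skins, and I want to return the skin name, price, and collection, in that order.

This is one of the many approaches I've tried: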

from bs4 import BeautifulSoup
import requests
import urllib3
urllib3.disable_warnings()

def webscrape():

    url = "https://csgostash.com/weapon/AWP"
    res = requests.get(url = url)
    soup = BeautifulSoup(res.text, "html.parser")

    titles = soup.find_all('div', class_="well result-box nomargin")
    prices = soup.find_all('div', class_="price")
    collection = soup.find_all('div', class_="collection")

    for title in titles:
        title = title.find('a')
        if title:
            title = title.text

    for price in prices:
        price = price.find('p')
        if price:
            price = price.text

    for cases in collection:
        cases = cases.find('p')
        if price:
            cases = cases.text
    print(title.text, price.text, collection.text)

webscrape()

This returns:

    print(title.text, price.text, collection.text)
AttributeError: 'NoneType' object has no attribute 'text'

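I want it to return these three values in order, for example: Containment Breach '\n' A$ 40.57 - A$ 271.90 '\n' Shattered Web Case

And so on. Some skins have two sets of prices, and I want both sets printed.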

I've reworked the code so it shows more clearly the problem I'm struggling with:

from bs4 import BeautifulSoup
import requests
import urllib3
urllib3.disable_warnings()

def webscrape():

    url = "https://csgostash.com/weapon/AWP"
    res = requests.get(url = url)
    soup = BeautifulSoup(res.text, "html.parser")
    names = " "
    price = " "
    cases = " "
    titles = soup.find_all('div', class_="well result-box nomargin")
    prices = soup.find_all('div', class_="price")
    collection = soup.find_all('div', class_="collection")

    for name in titles:
        a_field = name.find('a')
        if a_field:
            names = a_field.text + '\n' + names

    for money in prices:
        p_field = money.find('p')
        if p_field:
            price = p_field.text + '\n' + price

    for case in collection:
        case_field = case.find('p')
        if case_field:
            cases = case_field.text + '\n' + cases
    print(names, price, cases)

webscrape()

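This prints all the information I'm looking for on the page, but I want it grouped: each skin's price and collection should be printed together with that skin's name. Right now it prints all the names, then all the prices, then all the collections.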

titles = soup.find_all('div', class_="well result-box nomargin")

for title in titles:
    title = title.find('a')
    if title:
        title = title.text

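You are overwriting your data on every iteration of the loop; it is not at all clear to me what you expect this to do. I think the only way it works is if the final iteration finds what you want, in which case title leaves the loop holding the last text value you found.

Finally, you try to read the .text attribute of that value. This will almost certainly fail in some undesired way.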

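To get the error you see, title must be None when the loop ends: for the last item in titles, find('a') returned None, the if block was skipped, and title stayed None. Later, when you try to read an attribute of None, you get the indicated error.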
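To make the failure concrete, here is a minimal sketch that reproduces the same pattern without touching the network. FakeBox and last_title are hypothetical stand-ins invented for illustration; they only mimic the find() behaviour that the loop relies on and are not part of BeautifulSoup.

```python
from types import SimpleNamespace


class FakeBox:
    """Hypothetical stand-in for a result-box tag (not a BeautifulSoup class)."""

    def __init__(self, link_text=None):
        self.link_text = link_text

    def find(self, name):
        # Like BeautifulSoup's find(): return None when there is no match.
        if self.link_text is None:
            return None
        return SimpleNamespace(text=self.link_text)


def last_title(boxes):
    title = None
    for title in boxes:
        title = title.find("a")  # rebinds the loop variable on every pass
        if title:
            title = title.text
    return title  # only the result of the final iteration survives


# The last box has no link, so the loop leaves None behind.
result = last_title([FakeBox("Skin A"), FakeBox(None)])
print(result)

try:
    result.text
except AttributeError as exc:
    print(exc)
```

If the last box did contain a link, the loop would leave a plain string behind instead, and reading .text from it would fail with a different AttributeError.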
Instead, try:
titles = soup.find_all('div', class_="well result-box nomargin")

for title in titles:
    a_field = title.find('a')
    if a_field:
        break

This exits the search loop as soon as you find the element you want.
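For the grouping part of the question, one possible approach is to loop over each result box once and look up the price and collection inside that same box, rather than building three separate page-wide lists. The sketch below runs against a small inline HTML sample. The markup in SAMPLE_HTML is an assumption about how the page nests its price and collection divs inside each result box, the second skin and its values are made up, and grouped_skins is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Assumed markup for illustration only; the real page may be laid out differently.
SAMPLE_HTML = """
<div class="well result-box nomargin">
  <h3><a href="#">Containment Breach</a></h3>
  <div class="price"><p>A$ 40.57 - A$ 271.90</p></div>
  <div class="price"><p>A$ 1.00 - A$ 2.00</p></div>
  <div class="collection"><p>Shattered Web Case</p></div>
</div>
<div class="well result-box nomargin">
  <p>A box without a link, which should be skipped</p>
</div>
<div class="well result-box nomargin">
  <h3><a href="#">Example Skin</a></h3>
  <div class="price"><p>A$ 3.00 - A$ 4.00</p></div>
  <div class="collection"><p>Example Case</p></div>
</div>
"""


def grouped_skins(html):
    """Return a list of (name, [prices], collection) tuples, one per skin."""
    soup = BeautifulSoup(html, "html.parser")
    skins = []
    for box in soup.find_all("div", class_="well result-box nomargin"):
        link = box.find("a")
        if link is None:
            continue  # not a skin box, so there is nothing to group

        # Search inside this box only, so values stay tied to this skin.
        prices = []
        for price_div in box.find_all("div", class_="price"):
            p_field = price_div.find("p")
            if p_field:
                prices.append(p_field.get_text(strip=True))

        collection_div = box.find("div", class_="collection")
        case_field = collection_div.find("p") if collection_div else None
        collection = case_field.get_text(strip=True) if case_field else ""

        skins.append((link.get_text(strip=True), prices, collection))
    return skins


for name, prices, collection in grouped_skins(SAMPLE_HTML):
    print(name)
    for price in prices:  # a skin may have one or two price sets
        print(price)
    print(collection)
    print()
```

To try this on the live page, pass res.text from the question's requests.get call to grouped_skins instead of SAMPLE_HTML; whether the class names and nesting match the real markup would need to be checked in the browser first.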
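The scope of title is only within the loop. You have to declare the variable outside the loop. When you loop over titles, prices, and cases, you check whether an element was found. When printing, you don't check whether an element was found. So if no element was found, you still try to access the text attribute of a None object.

@narendra-choudhary: That is incorrect; the given runtime error would read differently. Please revisit Python's scoping rules.

@Prune You are correct. title will be available outside the loop. My mind slipped a bit there. A bit of a Python/Java mashup!

Do you have a specific question? What did you learn from the error message?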
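The scoping point from the comments can be checked with a few lines of plain Python. This is only an illustrative sketch; last_item is a hypothetical helper, not code from the question.

```python
def last_item(items):
    for item in items:
        pass
    # A for loop does not create a new scope, so item is still bound here.
    return item


print(last_item(["a", "b", "c"]))

# If the loop body never runs, the name was never assigned at all.
try:
    last_item([])
except UnboundLocalError as exc:
    print("never assigned:", exc)
```

So the variable does survive the loop, but only with whatever the last iteration assigned to it, and not at all if the iterable was empty.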