How to skip a tag and move on to the next one when web scraping with Python

python web-scraping

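I am trying to scrape data from the Tesco website to get product names and prices. My code is below. Some products have no price because they are sold out, and Python raises an error because there is nothing to scrape for them. If the price is not available, I would like the script to skip that tile and move on to the next one.

Does anyone know how I can do this?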

from bs4 import BeautifulSoup
import requests

#URL to be scraped
url_to_scrape = 'https://www.tesco.com/groceries/en-GB/shop/fresh-food/all?page=1&count=48'
#Load html's plain data into a variable
plain_html_text = requests.get(url_to_scrape)
#parse the data
soup = BeautifulSoup(plain_html_text.text, "lxml")

#Get the name of the class
for name_of in soup.find_all('div',class_='product-tile-wrapper'):
    name =name_of.h3.a.text
    print(name)
    price = name_of.find('div', class_='price-details--wrapper')
    pricen =price.find('span', class_='value').text
    print(pricen)

Use a try/except block:

from bs4 import BeautifulSoup
import requests

#URL to be scraped
url_to_scrape = 'https://www.tesco.com/groceries/en-GB/shop/fresh-food/all?page=1&count=48'
#Load html's plain data into a variable
plain_html_text = requests.get(url_to_scrape)
#parse the data
soup = BeautifulSoup(plain_html_text.text, "lxml")

#Loop over every product tile on the page
for name_of in soup.find_all('div',class_='product-tile-wrapper'):
    try:
        name = name_of.h3.a.text
        print(name)
        price = name_of.find('div', class_='price-details--wrapper')
        pricen = price.find('span', class_='value').text
        print(pricen)
    except AttributeError:
        #find() returned None because the price is missing (sold out): skip the rest of this tile
        pass

You can also make it more interactive by printing a message when the price is missing:

from bs4 import BeautifulSoup
import requests

#URL to be scraped
url_to_scrape = 'https://www.tesco.com/groceries/en-GB/shop/fresh-food/all?page=1&count=48'
#Load html's plain data into a variable
plain_html_text = requests.get(url_to_scrape)
#parse the data
soup = BeautifulSoup(plain_html_text.text, "lxml")

#Loop over every product tile on the page
for name_of in soup.find_all('div',class_='product-tile-wrapper'):
    name = name_of.h3.a.text
    print(name)
    try:
        price = name_of.find('div', class_='price-details--wrapper')
        pricen = price.find('span', class_='value').text
        print(pricen)
    except AttributeError:
        #find() returned None, so there is no price element for this tile
        print('Sold Out')
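
As an alternative to try/except, the missing price can be checked explicitly, since `find()` returns `None` when nothing matches. The following is only a minimal sketch: the inline `SAMPLE_HTML` and the helper name `extract_products` are made up for illustration, the markup merely mirrors the class names from the question, and `html.parser` is used instead of `lxml` so the sketch has no extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that mirrors the class names used in the question.
# The real Tesco page may be structured differently.
SAMPLE_HTML = """
<div class="product-tile-wrapper">
  <h3><a href="#">Bananas</a></h3>
  <div class="price-details--wrapper"><span class="value">0.90</span></div>
</div>
<div class="product-tile-wrapper">
  <h3><a href="#">Strawberries</a></h3>
</div>
<div class="product-tile-wrapper">
  <h3><a href="#">Apples</a></h3>
  <div class="price-details--wrapper"><span class="value">1.50</span></div>
</div>
"""


def extract_products(html):
    """Return (name, price) pairs, skipping tiles that have no price."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for tile in soup.find_all("div", class_="product-tile-wrapper"):
        price_box = tile.find("div", class_="price-details--wrapper")
        # find() returns None when nothing matches, so check before chaining.
        value = None
        if price_box is not None:
            value = price_box.find("span", class_="value")
        if value is None:
            continue  # no price (sold out): skip this tile, go to the next
        name = tile.h3.a.text.strip()
        products.append((name, value.text.strip()))
    return products


for name, price in extract_products(SAMPLE_HTML):
    print(name, price)
```

Compared with a broad try/except, the explicit check only skips tiles whose price is actually missing, so unrelated errors still surface.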

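Thank you, that really works! Do you know how I could change the code so that it works through a number of links and gets the data? This link is for fresh food, and I would also like to get the data for frozen food, drinks, and so on.

That can be asked as a separate question on SO. Please feel free to do so, since it is unrelated to this question.

Thanks a lot, Joshua! Very helpful.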
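
For the follow-up about several categories, one possible approach is to build one URL per category and run the same scraping loop on each. This is a rough sketch only: the helper `build_category_url` is hypothetical, and the slugs `frozen-food` and `drinks` are guesses, since only `fresh-food` appears in the question's URL.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.tesco.com/groceries/en-GB/shop"

# Assumed category slugs; only "fresh-food" is confirmed by the question.
CATEGORIES = ["fresh-food", "frozen-food", "drinks"]


def build_category_url(category, page=1, count=48):
    """Build a listing URL in the same shape as the one in the question."""
    query = urlencode({"page": page, "count": count})
    return f"{BASE_URL}/{category}/all?{query}"


for category in CATEGORIES:
    url = build_category_url(category)
    # Each URL would then be fetched with requests.get(url) and parsed
    # with the same product-tile loop as in the answer.
    print(url)
```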