Python: scraping product URLs for a specific zip code

Tags: python, web-scraping, beautifulsoup, request

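I am trying to scrape product links for zip code 08041. I have written code that scrapes the products without a zip code, but I don't know how to send a request that returns the products available under 08041.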

Here is my code:

import requests
import random
import time
from bs4 import BeautifulSoup
import wget
import csv
from fp.fp import FreeProxy


def helloworld(url):
    r = requests.get(url)
    print ('Status',r.status_code)
    #time.sleep(8)
    soup = BeautifulSoup(r.content,'html.parser')
    post = soup.find_all('a',"name")
    
    for href in post:
        if ( href.get('href')[1] == 'p'):
            href = href.get('href')
            print (href)

def page_counter():
    url1 = "https://soysuper.com/c/aperitivos#products"
    print (url1,'\n')
    helloworld(url1)
    
page_counter()


You can use the site's backend endpoint to simulate a request made with a given zip code.

Note: the cookie is hardcoded, but it is valid for a year.

Here's how:

import requests

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.105 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest",
    # Hardcoded session cookie: paste the full "soysuper=..." value from your browser.
    "Cookie": "soysuper=<cookie value copied from your browser>",
}

with requests.Session() as connection:
    # The first request selects the supermarket for the given zip code.
    r = connection.get("https://soysuper.com/supermarket?zipcode=08041", headers=headers)
    # Pass the request id returned by the server on to the next call.
    headers["Request-Id"] = r.headers["Next-Request-Id"]
    headers["Referer"] = "https://soysuper.com/c/aperitivos"
    # The second request returns the category's products as JSON.
    products_data = connection.get("https://soysuper.com/c/aperitivos?products=1&page=1", headers=headers).json()
    print(products_data["products"]["total"])
Output: the total number of products for zip code 08041

2923
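What you actually get back is JSON containing all the product data for the given page. You can see the same response in your browser's Network tab.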

Note the pager key. Use it to "paginate" the API and fetch more product info.
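As a rough illustration of the pagination idea, here is a minimal sketch that walks the page query parameter until a page comes back empty. The exact shape of the pager object is not shown in this thread, so the sketch does not rely on it. The helper names (category_page_url, iter_pages), the "items" key, and the max_pages cap are all assumptions made for illustration; check them against the real payload.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode

BASE_URL = "https://soysuper.com"


def category_page_url(category: str, page: int) -> str:
    """Build the JSON endpoint URL for one page of a category."""
    query = urlencode({"products": 1, "page": page})
    return f"{BASE_URL}/c/{category}?{query}"


def iter_pages(
    fetch_json: Callable[[str], dict],
    category: str,
    max_pages: int = 500,  # hypothetical safety cap, not taken from the site
) -> Iterator[dict]:
    """Yield one decoded JSON payload per page until a page has no items."""
    for page in range(1, max_pages + 1):
        payload = fetch_json(category_page_url(category, page))
        # "items" is an assumed key name; inspect the real response to confirm it.
        if not payload.get("products", {}).get("items"):
            break
        yield payload
```

Inside the with block above you could pass something like `lambda url: connection.get(url, headers=headers).json()` as fetch_json. Keeping the HTTP call outside the helper makes it easy to try the loop against canned data first.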