Python: how do I find li elements inside a ul tag with a given class using BeautifulSoup?
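For the literal question asked (li elements inside a ul with a given class), a minimal BeautifulSoup sketch follows; the class name store-list and the HTML snippet are made-up placeholders, not taken from the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a <ul> with a class containing store links
html = """
<ul class="store-list">
  <li><a href="/stores/aesop">Aesop</a></li>
  <li><a href="/stores/aldo">Aldo</a></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS selector: every <li> nested inside a <ul> whose class is "store-list"
items = soup.select("ul.store-list li")
names = [li.get_text(strip=True) for li in items]
print(names)  # ['Aesop', 'Aldo']
```

Note that this only works on markup actually present in the HTML response. On the page in question the tenant tiles are loaded dynamically, which is why the answer below goes to the JSON API instead of parsing the page.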
The page number sits inside the url itself. Once you know how many pages you need to go through, just iterate over them:
import requests
import math
page = 1
api_url = 'https://www.capitaland.com/apis/sg/en/properties/rafflescitysingaporeshoppingcentre/cl%3Aentity/tenants/cl%3Arelated-page/%2Fcontent%2Fcapitaland%2Fsg%2Fmalls%2Frafflescity%2Fen%2Fstores/cl%3Asortby/jcr%3Atitle/asc/cl%3Aselectors/_rel_brandtenants_details/_rel_deals/_rel_properties_details/_rel_tenants_details/accepts/acceptsCapita3Eats/acceptsCapitacard/acceptsCapitavoucher/acceptschope/acceptseCapitavoucher/addressroadname/assettype/brand/capita3EatsLink/chopelink/city/country/countryCode/cq%3Atags/currency/dealExisted/enddate/endtime/entityType/entityname/firstPublished/jcr%3Atitle/listingTypePages/logoImgPath/malllocationnote/marketingcategory/nearesttrainstation/oldprice/pagePath/pageTitle/price/promotiontype/ribbon/ribboncolor/shortdescription/startdate/starttime/state/subtitle/thumbnail/tileColorScheme/tilesubtext/cl%3Apgcursor/{page}/16.json'.format(page=page)
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36'}
jsonData = requests.get(api_url, headers=headers, verify = False).json()
total_pages = math.ceil(jsonData['totalcount'] / 16)
links = []
for page in range(1, total_pages + 1):
    print('Page: %s of %s' % (page, total_pages))
    # Page 1 was already fetched above; only re-request for later pages
    if page > 1:
        api_url = 'https://www.capitaland.com/apis/sg/en/properties/rafflescitysingaporeshoppingcentre/cl%3Aentity/tenants/cl%3Arelated-page/%2Fcontent%2Fcapitaland%2Fsg%2Fmalls%2Frafflescity%2Fen%2Fstores/cl%3Asortby/jcr%3Atitle/asc/cl%3Aselectors/_rel_brandtenants_details/_rel_deals/_rel_properties_details/_rel_tenants_details/accepts/acceptsCapita3Eats/acceptsCapitacard/acceptsCapitavoucher/acceptschope/acceptseCapitavoucher/addressroadname/assettype/brand/capita3EatsLink/chopelink/city/country/countryCode/cq%3Atags/currency/dealExisted/enddate/endtime/entityType/entityname/firstPublished/jcr%3Atitle/listingTypePages/logoImgPath/malllocationnote/marketingcategory/nearesttrainstation/oldprice/pagePath/pageTitle/price/promotiontype/ribbon/ribboncolor/shortdescription/startdate/starttime/state/subtitle/thumbnail/tileColorScheme/tilesubtext/cl%3Apgcursor/{page}/16.json'.format(page=page)
        jsonData = requests.get(api_url, headers=headers, verify=False).json()
    # Collect the tenant page path from each property on this page
    properties = jsonData['properties']
    for each in properties:
        pagePath = each['pagePath']
        links.append(pagePath)
print(links)
Output:
240 links
['https://www.capitaland.com/sg/malls/rafflescity/en/stores/chewy-junior', 'https://www.capitaland.com/sg/malls/rafflescity/en/stores/a-one-signature', 'https://www.capitaland.com/sg/malls/rafflescity/en/stores/aesop', 'https://www.capitaland.com/sg/malls/rafflescity/en/stores/aldo',... 'https://www.capitaland.com/sg/malls/rafflescity/en/stores/xing-ji-big-prawn-noodle-opening-soon', 'https://www.capitaland.com/sg/malls/rafflescity/en/stores/xw-western-grill', 'https://www.capitaland.com/sg/malls/rafflescity/en/stores/ya-kun-kaya-toast', 'https://www.capitaland.com/sg/malls/rafflescity/en/stores/ysl-beauty', 'https://www.capitaland.com/sg/malls/rafflescity/en/stores/_house_yamamoto']
It comes from https://www.capitaland.com/apis/........r/1/16.json
You can find the full url of the API call in your browser's Network tab. The original starting link was: https://www.capitaland.com/sg/malls/rafflescity/en/stores.html?category=foodandbeverage
I can't find the URLs of all the merchants on the page. Is there a way to pull every merchant link at once? Since the page only loads 16 merchants at a time, does that mean I have to repeat the request many times to get all the merchant links?

Possibly. Check the limit parameter in the API call. Alternatively, if the page asks you to load more or has pagination, do that and watch the Network tab for another API call.

By "page" do you mean the API call?
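The pagination arithmetic the answer relies on can be checked in isolation: with 16 tenants per API page (the /16.json suffix in the url) and the totalcount value of 240 reported by the first response, the number of pages is the ceiling of the division:

```python
import math

page_size = 16     # items per API page, from the .../16.json suffix
total_count = 240  # 'totalcount' reported by the first API response

total_pages = math.ceil(total_count / page_size)
print(total_pages)  # 15
```

This is why the loop in the answer runs over range(1, total_pages + 1): every page, including a possible short final one, is requested exactly once.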