Python 使用beautifulsoup抓取动态网站

Python 使用beautifulsoup抓取动态网站,python,web-scraping,beautifulsoup,Python,Web Scraping,Beautifulsoup,我正在抓取一个网站nykaa.com,链接是()。共有25页,每页动态加载数据。我找不到数据的来源。此外,当刮的数据,我只能得到20个产品,成为多余的,列表成为420个产品 import requests from bs4 import BeautifulSoup import unicodecsv as csv urls = [] l1 = [] for page in range(1,5): result = requests.get("https://www.nykaa.com/s

我正在抓取一个网站nykaa.com,链接是()。共有25页,每页动态加载数据。我找不到数据的来源。此外,当刮的数据,我只能得到20个产品,成为多余的,列表成为420个产品

import requests
from bs4 import BeautifulSoup
import unicodecsv as csv


urls = []
l1 = []


for page in range(1,5):
result = requests.get("https://www.nykaa.com/skin/moisturizers/serums-essence/c/8397?root=nav_3&page_no=" + str(page))
src = result.content

soup = BeautifulSoup(src,'lxml')

for div_tag in soup.find_all("div", class_ = "card-wrapper-container col-xs-12 col-sm-6 col-md-4"):
    for div1_tag in soup.find_all("div", class_ = "product-list-box card desktop-cart"):
        h2_tag = div1_tag.find("h2").find("span")
        price_tag = div1_tag.find("div", class_ = "price-info")
        l1 = [h2_tag.get_text(),price_tag.get_text()]
        urls.append(l1)

        #print(urls)


with open('xyz.csv', 'wb') as myfile:
     wr = csv.writer(myfile)
     wr.writerows(urls)

上面的代码为我提供了大约1200个产品名称和价格的列表,其中只有30到40个是唯一的,否则都是多余的。我想唯一地获取25页的数据,总共有486个独特的产品。我还使用selenium来单击下一页链接,但也没有成功。

这显示了在所有页面(包括确定页数)上以循环方式发出页面所做的请求(在“网络”选项卡中查看)<代码>结果是一个列表,您可以轻松写入csv

import requests, math, csv

page = '1'

def append_new_rows(data):
    for i in data:
        if 'name' in i:
            results.append([i['name'], i['final_price']])

with requests.Session() as s:
    r = s.get(f'https://www.nykaa.com/gludo/products/list?pro=false&filter_format=v2&app_version=null&client=react&root=nav_3&page_no={page}&category_id=8397').json()
    results_per_page = 20
    total_results = r['response']['total_found']
    num_pages = math.ceil(total_results/results_per_page)
    results = []
    append_new_rows(r['response']['products'])

    for page in range(2, num_pages + 1):
        r = s.get(f'https://www.nykaa.com/gludo/products/list?pro=false&filter_format=v2&app_version=null&client=react&root=nav_3&page_no={page}&category_id=8397').json()
        append_new_rows(r['response']['products'])

with open("data.csv", "w", encoding="utf-8-sig", newline='') as csv_file:
    w = csv.writer(csv_file, delimiter = ",", quoting=csv.QUOTE_MINIMAL)
    w.writerow(['Name','Price'])
    for row in results:
        w.writerow(row)

您可以使用
selenium

from bs4 import BeautifulSoup as soup
from selenium import webdriver
d = webdriver.Chrome('/path/to/chromedriver')
d.get('https://www.nykaa.com/skin/moisturizers/serums-essence/c/8397')
def get_products(_d):
  return [{'title':(lambda x:x if not x else x.text)(i.find('div', {'class':'m-content__product-list__title'})), 'price':(lambda x:x if not x else x.text)(i.find('span', {'class':'post-card__content-price-offer'}))} for i in _d.find_all('div', {'class':'card-wrapper-container col-xs-12 col-sm-6 col-md-4'})]

s = soup(d.page_source, 'html.parser')
r = [list(filter(None, get_products(s)))]
while 'disable-event' not in s.find('li', {'class':'next'}).attrs['class']:
  d.get(f"https://www.nykaa.com{s.find('li', {'class':'next'}).a['href']}")
  s = soup(d.page_source, 'html.parser')
  r.append(list(filter(None, get_products(s))))
示例输出(前三页):


谢谢你的回答。请你解释一下我做错了什么,这样我以后就不会再犯同样的错误了。请让我知道。内容是从不同的url动态加载的,您可以在“网络”选项卡中找到,也就是说,当使用请求时,它并不全部存在于您的url中,因为js不会运行,而且这些额外的url不会通过xhr提供内容。同样感谢您。请让我知道我做错了什么
[[{'title': 'The Face Shop Calendula Essential Moisture Serum', 'price': '₹1320 '}, {'title': 'Palmers Cocoa Butter Formula Skin Perfecting Ultra Hydrating...', 'price': '₹970 '}, {'title': "Cheryl's Cosmeceuticals Clarifi Acne Anti Blemish Serum", 'price': '₹875 '}, {'title': 'Estee Lauder Advanced Night Repair Synchronized Recovery Com...', 'price': '₹1250 '}, {'title': 'Estee Lauder Advanced Night Repair Synchronized Recovery Com...', 'price': '₹1250 '}, {'title': 'Estee Lauder Advanced Night Repair Synchronized Recovery Com...', 'price': '₹3900 '}, {'title': 'Klairs Freshly Juiced Vitamin Drop', 'price': '₹1492 '}, {'title': 'Innisfree The Green Tea Seed Serum', 'price': '₹1950 '}, {'title': "Kiehl's Midnight Recovery Concentrate", 'price': '₹2100 '}, {'title': 'The Face Shop White Seed Brightening Serum', 'price': '₹1990 '}, {'title': 'Biotique Bio Dandelion Visibly Ageless Serum', 'price': '₹230 '}, {'title': None, 'price': None}, {'title': 'St.Botanica Vitamin C 20% + Vitamin E & Hyaluronic Acid Faci...', 'price': '₹1499 '}, {'title': 'Biotique Bio Coconut Whitening & Brightening Cream', 'price': '₹199 '}, {'title': 'Neutrogena Fine Fairness Brightening Serum', 'price': '₹849 '}, {'title': "Kiehl's Clearly Corrective Dark Spot Solution", 'price': '₹4300 '}, {'title': "Kiehl's Clearly Corrective Dark Spot Solution", 'price': '₹4300 '}, {'title': 'Lakme Absolute Perfect Radiance Skin Lightening Serum', 'price': '₹960 '}, {'title': 'St.Botanica Hyaluronic Acid + Vitamin C, E Facial Serum', 'price': '₹1499 '}, {'title': 'Jeva Vitamin C Serum with Hyaluronic Acid for Anti Aging and...', 'price': '₹350 '}, {'title': 'Lotus Professional Phyto-Rx Whitening & Brightening Serum', 'price': '₹595 '}], [{'title': 'The Face Shop Chia Seed Moisture Recharge Serum', 'price': '₹1890 '}, {'title': 'Lotus Herbals WhiteGlow Skin Whitening & Brightening Gel Cre...', 'price': '₹280 '}, {'title': 'Lakme 9 to 5 Naturale Aloe Aqua Gel', 'price': '₹200 '}, {'title': 'Estee Lauder Advanced Night Repair Synchronized Recovery Com...', 'price': '₹5900 '}, {'title': 'Mixify Unloc Skin Glow Serum', 'price': '₹499 '}, {'title': 'St.Botanica Retinol 2.5% + Vitamin E & Hyaluronic Acid Profe...', 'price': '₹1499 '}, {'title': 'LANEIGE Hydration Combo Set', 'price': '₹3000 '}, {'title': 'Biotique Bio Dandelion Ageless Visiblly Serum', 'price': '₹690 '}, {'title': 'The Moms Co. Natural Vita Rich Face Serum', 'price': '₹699 '}, {'title': "It's Skin Power 10 Formula VC Effector", 'price': '₹950 '}, {'title': "Kiehl's Powerful-Strength Line-Reducing Concentrate", 'price': '₹5100 '}, {'title': 'Olay Natural White Light Instant Glowing Fairness Skin Cream', 'price': '₹99 '}, {'title': 'Plum Green Tea Skin Clarifying Concentrate', 'price': '₹881 '}, {'title': 'Olay Total Effects 7 In One Anti-Ageing Smoothing Serum', 'price': '₹764 '}, {'title': 'Elizabeth Arden Ceramide Daily Youth Restoring Serum 60 Caps...', 'price': '₹5850 '}, {'title': None, 'price': None}, {'title': 'Olay Regenerist Advanced Anti-Ageing Micro-Sculpting Serum', 'price': '₹1699 '}, {'title': 'Lakme Absolute Argan Oil Radiance Overnight Oil-in-Serum', 'price': '₹945 '}, {'title': 'The Face Shop Mango Seed Silk Moisturizing Emulsion', 'price': '₹1890 '}, {'title': 'The Face Shop Calendula Essential Good to Glow Combo', 'price': '₹2557 '}, {'title': 'Garnier Skin Naturals Light Complete Serum Cream', 'price': '₹69 '}], [{'title': 'Clinique Moisture Surge Hydrating Supercharged Concentrate', 'price': '₹2550 '}, {'title': 'LANEIGE Sleeping Mask Combo', 'price': '₹3000 '}, {'title': 'Klairs Rich Moist Soothing Serum', 'price': '₹1492 '}, {'title': 'Estee Lauder Idealist Pore Minimizing Skin Refinisher', 'price': '₹5500 '}, {'title': 'O3+ Whitening & Brightening Serum', 'price': '₹1475 '}, {'title': 'Elizabeth Arden Ceramide Daily Youth Restoring Serum 90 Caps...', 'price': '₹6900 '}, {'title': 'Olay Natural White Light Instant Glowing Fairness Skin Cream', 'price': '₹189 '}, {'title': "L'Oreal Paris White Perfect Clinical Expert Anti-Spot Whiten...", 'price': '₹1480 '}, {'title': 'belif Travel Kit', 'price': '₹1499 '}, {'title': 'Forest Essentials Advanced Soundarya Serum With 24K Gold', 'price': '₹3975 '}, {'title': "L'Occitane Immortelle Reset Serum", 'price': '₹4500 '}, {'title': 'Lakme Absolute Skin Gloss Reflection Serum 30ml', 'price': '₹990 '}, {'title': 'Neutrogena Hydro Boost Emulsion', 'price': '₹999 '}, {'title': 'Innisfree Anti-Aging Set', 'price': '₹2350 '}, {'title': 'Clinique Fresh Pressed 7-Day System With Pure Vitamin C', 'price': '₹2400 '}, {'title': 'The Face Shop The Therapy Premier Serum', 'price': '₹2490 '}, {'title': 'The Body Shop Vitamin E Overnight Serum In Oil', 'price': '₹1695 '}, {'title': 'Jeva Vitamin C Serum with Hyaluronic Acid for Anti Aging and...', 'price': '₹525 '}, {'title': 'Olay Regenerist Micro Sculpting Cream and White Radiance Hyd...', 'price': '₹2698 '}, {'title': 'The Face Shop Yehwadam Pure Brightening Serum', 'price': '₹4350 '}]]