
Python scraping problem with BeautifulSoup and Selenium

Tags: python, selenium, web-scraping, beautifulsoup


Right now I am scraping this URL.

I want to scrape all of the product's reviews, but I am getting an error. Any help is appreciated, thank you :)

My code:

import requests
from selenium import webdriver
from bs4 import BeautifulSoup as soup
import time
from selenium.webdriver.chrome.options import Options


url = 'https://www.lazada.com.my/products/xiaomi-mi-a1-4gb-ram-32gb-rom-i253761547-s336359472.html?spm=a2o4k.searchlistcategory.list.64.71546883QBZiNT&search=1'

chrome_options = Options()
#chrome_options.add_argument("--headless")
browser = webdriver.Chrome('/Users/e5/fyp/chromedriver', 
chrome_options=chrome_options)
browser.get(url)
time.sleep(0.1)


d = soup(requests.get('https://www.lazada.com.my/products/xiaomi-mi-a1-4gb-ram-32gb-rom-i253761547-s336359472.html?spm=a2o4k.searchlistcategory.list.64.71546883QBZiNT&search=1').text, 'html.parser')
results = list(map(int, filter(None, [i.text for i in d.find_all('button', {'class':'next-pagination-item'})])))
print (results)
for i in range(min(results), max(results)+1):

    browser.find_element_by_xpath('//*[@id="module_product_review"]/div/div[3]/div[2]/div/div/button[{i}]').click()
    page_soups = soup(browser.page_source, 'html.parser')
    headline = page_soups.findAll('div',attrs={"class":"item-content"})

    for item in headline:
        top = item.div
        text_headlines = top.text
        print(text_headlines)
My error:

InvalidSelectorException: Message: invalid selector: Unable to locate an element with the xpath expression //*[@id="module_product_review"]/div/div[3]/div[2]/div/div/button[{i}] because of the following error:
SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//*[@id="module_product_review"]/div/div[3]/div[2]/div/div/button[{i}]' is not a valid XPath expression.
  (Session info: chrome=69.0.3497.100)
  (Driver info: chromedriver=2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7),platform=Windows NT 10.0.17134 x86_64)
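The immediate cause of the InvalidSelectorException above appears to be that the XPath string is not an f-string, so Python never substitutes the loop variable and the literal text {i} reaches the driver as an invalid index. A minimal sketch of the difference (the XPath itself is copied from the question):

```python
# Without the f prefix, Python passes "{i}" through literally,
# which is not a valid XPath index expression.
i = 2
broken = '//*[@id="module_product_review"]/div/div[3]/div[2]/div/div/button[{i}]'
fixed = f'//*[@id="module_product_review"]/div/div[3]/div[2]/div/div/button[{i}]'

print(broken)  # ends in button[{i}] -> InvalidSelectorException
print(fixed)   # ends in button[2]   -> syntactically valid XPath
```

With the f-string, browser.find_element_by_xpath(fixed) would at least receive valid XPath, though the target button must still exist and be clickable on the page.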

Just use their JSON API; there is no need for Selenium or BeautifulSoup.

import requests

# Page numbers start at 1; each page returns up to 5 reviews
for page in range(1, 4):
    url = ('https://my.lazada.com.my/pdp/review/getReviewList?'
           'itemId=253761547&pageSize=5&filter=0&sort=0&pageNo=' + str(page))
    req = requests.get(url)
    data = req.json()
    for item in data['model']['items']:
        buyerName = item['buyerName']
        reviewContent = item['reviewContent']
        print(buyerName, reviewContent)

Did you get it? Have you tried my answer?

Wow, that's simple, thank you. But how do I capture only "reviewContent" using the JSON?

Just now I tried it and it worked, but afterwards every time I run it I get this error: ----> 9 data=req.json() JSONDecodeError: Expecting value: line 2 column 1 (char 2). Thank you, and sorry for the trouble.. I'm new to JSON :)

Don't make too many requests in a short period of time; if you do, the site will block you. Use a time delay on each request to avoid being blocked! Try again after a while, or from a different IP, and it will work!
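The two points raised in the comments (the stray JSONDecodeError and the advice to add a delay) can be combined into one pattern: sleep between requests, and skip any response whose body is not JSON, which is typically what a block or captcha page looks like. This is a sketch under assumptions: get_reviews and its session_get parameter are hypothetical names introduced here, and the 'model'/'items'/'buyerName'/'reviewContent' keys are taken from the answer's code.

```python
import time

def get_reviews(session_get, base_url, pages, delay=2.0):
    """Fetch paginated review JSON politely.

    session_get -- a callable like requests.get (injectable for testing)
    base_url    -- URL ending in 'pageNo=' so the page number can be appended
    """
    reviews = []
    for page in range(1, pages + 1):
        resp = session_get(base_url + str(page))
        try:
            data = resp.json()
        except ValueError:
            # Body was not JSON (likely an HTML block page); skip this page
            continue
        for item in data['model']['items']:
            reviews.append((item['buyerName'], item['reviewContent']))
        time.sleep(delay)  # pause between requests to avoid being blocked
    return reviews
```

Catching ValueError covers JSON parse failures across requests versions, since json.JSONDecodeError is a subclass of ValueError.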