Python: enter a query in the search bar and fetch the results (Python, web scraping, BeautifulSoup, Selenium ChromeDriver)


python web-scraping

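I have a database containing the ISBN numbers of different books. I collected them using Python and BeautifulSoup. Next, I would like to add categories to these books. There is a standard for book categories, and one website lists all the books together with categories that follow this standard.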

Start URL:

https://www.bol.com/nl/

ISBN:

9780062457738

URL after searching:

https://www.bol.com/nl/p/the-subtle-art-of-not-giving-a-f-ck/9200000053655943/

HTML of the category:

You can use selenium to locate the input box and loop over your ISBNs, entering each one:
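from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Selenium 4 expects the chromedriver path to be wrapped in a Service object
d = webdriver.Chrome(service=Service('/path/to/chromedriver'))
books = ['9780062457738']
for book in books:
    d.get('https://www.bol.com/nl/')
    # find_element_by_id was removed in Selenium 4; use find_element(By.ID, ...)
    e = d.find_element(By.ID, 'searchfor')
    e.send_keys(book)
    e.send_keys(Keys.ENTER)
    # scrape page here
d.quit()  # close the browser once all ISBNs have been processed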

Now, for each ISBN in books, this solution types the value into the search box and loads the desired page.
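
The "scrape page here" step could be sketched roughly as follows. This is only an illustration: the helper name extract_category is made up, and the selector is an assumption that the loaded page marks its breadcrumb entries with the breadcrumbs__link-label class, so check it against the real page in your browser's developer tools. The sample HTML is invented and stands in for d.page_source.

```python
from bs4 import BeautifulSoup

# ASSUMPTION: breadcrumb entries on the loaded page carry this class.
# Verify it in the browser's developer tools before relying on it.
BREADCRUMB_SELECTOR = '.breadcrumbs__link-label'


def extract_category(page_source, selector=BREADCRUMB_SELECTOR):
    """Return the text of the last element matching selector, or None."""
    soup = BeautifulSoup(page_source, 'html.parser')
    labels = soup.select(selector)
    if not labels:
        return None
    # The last breadcrumb is taken to be the most specific category
    return labels[-1].get_text(strip=True)


# Invented HTML standing in for d.page_source
sample = '''
<ul>
  <li><span class="breadcrumbs__link-label">Boeken</span></li>
  <li><span class="breadcrumbs__link-label">Psychologie</span></li>
</ul>
'''
print(extract_category(sample))
```

Inside the loop this would be called as extract_category(d.page_source) in place of the comment, ideally after waiting for the result page to finish loading.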
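You can write a function that returns the category. You can base it on the request the page makes for an actual search: just tidy up the query parameters and send a plain GET request: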
import requests
from bs4 import BeautifulSoup as bs

def get_category(isbn): 
    r = requests.get(f'https://www.bol.com/nl/rnwy/search.html?Ntt={isbn}&searchContext=books_all') 
    soup = bs(r.content,'lxml')
    category = soup.select_one('#option_block_4 > li:last-child .breadcrumbs__link-label')

    if category is None:
        return 'Not found'
    else:
        return category.text

isbns = ['9780141311357', '9780062457738', '9780141199078']

for isbn in isbns:
    print(get_category(isbn))
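
As a small variation on the f-string above, the query string can be assembled from a dict so that the values are URL-encoded for you. This is a sketch using only the standard library; the names SEARCH_URL and build_search_url are made up for illustration, while the parameter names Ntt and searchContext are the ones from the URL in the answer.

```python
from urllib.parse import urlencode

# Base address of the search endpoint used in the answer above
SEARCH_URL = 'https://www.bol.com/nl/rnwy/search.html'


def build_search_url(isbn, context='books_all'):
    """Assemble the search URL for one ISBN from its query parameters."""
    params = {'Ntt': isbn, 'searchContext': context}
    return f'{SEARCH_URL}?{urlencode(params)}'


print(build_search_url('9780062457738'))
```

With requests, passing the same dict as requests.get(SEARCH_URL, params=params) should produce an equivalent URL without building the string by hand.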

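What code have you tried? What errors did you get?

@Dev I am not getting any errors. I just don't know where to start. The code in (2) is from the internet, but I don't know how to use webdriver properly. Do you know how?

Thank you for your help. But Ajax1234's solution worked for me.

Doesn't this one work for you? I tested it and it seems to work fine.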