Web scraping: BeautifulSoup href returns an empty string
I'm sure this is simple, but somehow I've been unable to get the href link under the a tag that leads to each product detail page. I also don't see any JavaScript wrapping it. What am I missing?
import requests
from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import pandas as pd
urls = [
    'https://undefeated.com/search?type=product&q=nike'
]

final = []
with requests.Session() as s:
    for url in urls:
        driver = webdriver.Chrome('/Users/Documents/python/Selenium/bin/chromedriver')
        driver.get(url)
        products = [element for element in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='product-grid-item ']")))]
        soup = bs(driver.page_source, 'lxml')
        time.sleep(1)
        href = soup.find_all['href']
        print(href)
Output:
[]
Then I tried soup.find_all('a'), and that did spit out a whole bunch of results, including the href I'm after, but I still couldn't extract the href specifically…

You just need to find all the a tags, then print each one's href attribute.
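As an aside on why soup.find_all['href'] fails: find_all is a method, so it must be called with parentheses, and the href attribute is then read from each returned tag. A minimal standalone sketch with static HTML (the markup and paths below are illustrative, not taken from the actual site):

```python
from bs4 import BeautifulSoup

# Illustrative static HTML standing in for the rendered product grid
html = """
<div class="product-grid-item ">
  <a href="/products/nike-air-max">Nike Air Max</a>
</div>
<div class="product-grid-item ">
  <a href="/products/nike-dunk-low">Nike Dunk Low</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all is a method, so it is called with parentheses;
# soup.find_all['href'] would try to subscript the method object instead.
links = [a.get("href") for a in soup.find_all("a")]
print(links)  # → ['/products/nike-air-max', '/products/nike-dunk-low']
```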
Your requests.Session code should look like this:
with requests.Session() as s:
    for url in urls:
        driver = webdriver.Firefox()
        driver.get(url)
        products = [element for element in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='product-grid-item ']")))]
        soup = bs(driver.page_source, 'lxml')
        time.sleep(1)
        a_links = soup.find_all('a')
        for a in a_links:
            print(a.get('href'))
Then all the links will be printed.
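The hrefs printed this way will typically be relative paths. If absolute product URLs are needed, they can be joined against the site's base URL with urllib.parse.urljoin; a small sketch (the example paths are illustrative, and None entries stand in for a tags with no href):

```python
from urllib.parse import urljoin

base_url = "https://undefeated.com"

# Illustrative relative hrefs like those the loop above prints
hrefs = ["/products/nike-air-max", "/search?type=product&q=nike", None]

# Skip tags without an href, and resolve the rest to absolute URLs
absolute = [urljoin(base_url, h) for h in hrefs if h]
print(absolute)
# → ['https://undefeated.com/products/nike-air-max',
#    'https://undefeated.com/search?type=product&q=nike']
```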