Python Selenium - extract all URLs in a table and iterate until the "Next" button disappears
I am trying to extract all the URLs and iterate by pressing the Next button until there is no Next button left. If possible, I would also like to open each URL. Could you point me in the right direction? The site (you need to press the Search button first) is linked to
Here is an example of what you are looking for:
from bs4 import BeautifulSoup as Soup
from selenium import webdriver
import time

driver = webdriver.Chrome()
driver.get("https://monerobenchmarks.info/")

final_list = []

def parsh_table():
    # re-parse the current page source on every call; parsing it once
    # up front would keep reading the first page after clicking Next
    page = Soup(driver.page_source, features='html.parser')
    table = page.find('table')
    table_rows = table.find_all('tr')
    for tr in table_rows:
        td = tr.find_all('td')
        row = [i.text for i in td]
        final_list.extend(row)

def next_bu():
    next_button = driver.find_element_by_xpath('//*[@id="cpu_next"]')
    next_button.click()

# put range of pages
for _ in range(1, 2):
    parsh_table()
    time.sleep(2)
    next_bu()

print(final_list)
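Since the original goal is to collect URLs rather than cell text, the table-parsing step can be adapted to pull the `href` of every anchor instead. A minimal sketch using BeautifulSoup alone (the HTML snippet and the `extract_urls` helper are invented for illustration, not part of the target site):

```python
from bs4 import BeautifulSoup as Soup

# sample markup standing in for one page of a results table
html = """
<table id="results">
  <tr><td><a href="/app/1">App 1</a></td></tr>
  <tr><td><a href="/app/2">App 2</a></td></tr>
</table>
"""

def extract_urls(page_source):
    # collect the href attribute of every anchor inside the table
    page = Soup(page_source, features='html.parser')
    table = page.find('table')
    return [a['href'] for a in table.find_all('a', href=True)]

print(extract_urls(html))  # ['/app/1', '/app/2']
```

In the Selenium loop above, `page_source` would be `driver.page_source`, re-read after every click on Next.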
Here you go:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

driver = webdriver.Chrome(executable_path=r"C:\Users\matt_\Documents\Python Scripts\Selenium\chromedriver.exe")
driver.get("https://publicaccess.aberdeencity.gov.uk/online-applications/search.do?action=monthlyList")
driver.find_element_by_css_selector("input[value='Search']").click()

def parse():
    links = driver.find_elements_by_xpath('//*[@id="searchresults"]/li/a')
    for link in links:
        print(link.text, link.get_attribute("href"))
    try:
        # recurse while a "next" link is still present
        driver.find_element_by_class_name('next').click()
        parse()
    except NoSuchElementException:
        print('complete')

parse()
You can check whether an element exists with this simple test:
if len(driver.find_elements_by_css_selector('.next')) > 0:
Please try the code below:
driver.get('https://publicaccess.aberdeencity.gov.uk/online-applications/search.do?action=monthlyList')
search_btn = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, '.button.primary')))
search_btn.click()

condition = True
while condition:
    links = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'li.searchresult a')))
    for link in links:
        print(link.get_attribute('href'))
    if len(driver.find_elements_by_css_selector('.next')) > 0:
        driver.find_element_by_css_selector('.next').click()
    else:
        condition = False

driver.quit()
with the following imports:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
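The loop logic itself can be exercised without a browser by stubbing the two driver calls the pattern relies on: `find_elements` returning a possibly-empty list, and a click that advances to the next page. The `FakeDriver` and `FakePage` classes below are made-up stand-ins for this sketch, not Selenium APIs:

```python
class FakePage:
    """Stand-in for one results page: its link hrefs and whether a Next button exists."""
    def __init__(self, hrefs, has_next):
        self.hrefs = hrefs
        self.has_next = has_next

class FakeDriver:
    """Minimal stub mimicking the two calls the pagination loop uses."""
    def __init__(self, pages):
        self.pages = pages
        self.index = 0
    def links(self):
        return self.pages[self.index].hrefs
    def find_next(self):
        # mirrors find_elements_by_css_selector('.next'): empty list when absent
        return ['next'] if self.pages[self.index].has_next else []
    def click_next(self):
        self.index += 1

def collect_all(driver):
    # same while-loop shape as the answer above, but returning the URLs
    collected = []
    condition = True
    while condition:
        collected.extend(driver.links())
        if len(driver.find_next()) > 0:
            driver.click_next()
        else:
            condition = False
    return collected

pages = [FakePage(['/a', '/b'], True), FakePage(['/c'], False)]
print(collect_all(FakeDriver(pages)))  # ['/a', '/b', '/c']
```

Returning a list, as `collect_all` does, also answers the follow-up question in the comments: append each `href` to a list instead of printing it.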
Comments:
- Does this count as a table? — Yes; just change the Next button xpath and the link in driver.get('YOUR_WEBSITE_URL'), and adapt the parsh_table function. This link may help: https://stackoverflow.com/questions/16915856/extracting-unordered-list-for-a-particular-div-beautifulsoup
- Could you press the Search button first? Then you should be good to go. — I have been trying that but have not made any further progress.
- Hi, I tried this with the same indentation, but after running it nothing is returned. It reaches the right page, but it neither iterates through Next nor returns any URLs. I really appreciate the help so far @Dmitry. — Maybe I am misunderstanding your question, but it prints every URL on every page until it reaches the last page. If you want a list of URLs returned, just define a list variable and append them to it. Correct me if I am wrong. — It just doesn't return anything, but the idea is right. Sorry for not being clear.