How do I paginate with Selenium in Python?

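I need to paginate through a web page and save the HTML of every page in a list.

On the first page the pager HTML looks like this. The first element, with class="sc-4j28w0-1 fDeSdf", is an arrow ">".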

<li disabled="" class="sc-4j28w0-1 fDeSdf"></li>
<li data-testid="current-page-item" class="sc-4j28w0-1 sc-4j28w0-2 jDlZyl">1</li>
<li class="sc-4j28w0-1 lhEbhI"><span class="sc-4j28w0-3 jAKnhT">2</span></li>
<li class="sc-4j28w0-1 lhEbhI"><span class="sc-4j28w0-3 jAKnhT">3</span></li>
<li class="sc-4j28w0-1 lhEbhI"></li>

The problem is that the loop never stops: it keeps saving the last page.

I also tried this:

wait = WebDriverWait(driver, 10) 

search = wait.until(EC.element_to_be_clickable((By.XPATH, '/html/body/div/div/div[2]/div[2]/div/ol/li[5]')))

while search.get_property('disabled') is False:
    search.click()
    time.sleep(5)
    html = driver.page_source
    soup_news = BeautifulSoup(html)
    news_list.append(soup_news)

But I get an error:

---------------------------------------------------------------------------
StaleElementReferenceException            Traceback (most recent call last)
<ipython-input-51-49e862d6475f> in <module>
     34 
     35 
---> 36 while search.is_enabled():
     37     try:
     38         search.click()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\webelement.py in is_enabled(self)
    157     def is_enabled(self):
    158         """Returns whether the element is enabled."""
--> 159         return self._execute(Command.IS_ELEMENT_ENABLED)['value']
    160 
    161     def find_element_by_id(self, id_):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\webelement.py in _execute(self, command, params)
    631             params = {}
    632         params['id'] = self._id
--> 633         return self._parent.execute(command, params)
    634 
    635     def find_element(self, by=By.ID, value=None):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py in execute(self, driver_command, params)
    319         response = self.command_executor.execute(driver_command, params)
    320         if response:
--> 321             self.error_handler.check_response(response)
    322             response['value'] = self._unwrap_value(
    323                 response.get('value', None))

~\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\errorhandler.py in check_response(self, response)
    240                 alert_text = value['alert'].get('text')
    241             raise exception_class(message, screen, stacktrace, alert_text)
--> 242         raise exception_class(message, screen, stacktrace)
    243 
    244     def _value_or_default(self, obj, key, default):

StaleElementReferenceException: Message: The element reference of <li class="sc-4j28w0-1 lhEbhI"> is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed
---------------------------------------------------------------------------

Thanks for your help.

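    There are several ways to paginate here. I'll highlight one:

  • Get the current page number
  • Look for the link to the next page; exit if it is not found

    Code: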

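    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    while True:
        # Re-locate the current-page item on every iteration to avoid stale references.
        current_page_number = int(driver.find_element(By.CSS_SELECTOR, 'li[data-testid=current-page-item]').text)

        print(f"Processing page {current_page_number}..")

        try:
            # The link to the next page is the <li> whose <span> holds the next number.
            next_page_link = driver.find_element(By.XPATH, f'.//li[span = "{current_page_number + 1}"]')
            next_page_link.click()
        except NoSuchElementException:
            print(f"Exiting. Last page: {current_page_number}.")
            break

        # TODO: save the page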
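
To fill in the TODO, here is a minimal sketch of the same approach that also collects each page's HTML. It is not the answer author's code. The function names, the timeout and poll parameters, and the polling loop are assumptions made for illustration. The strings "css selector" and "xpath" are the values behind By.CSS_SELECTOR and By.XPATH, written out so the function has no import-time dependency on selenium. Every element is looked up again on each iteration, which is what avoids the StaleElementReferenceException from the question.

```python
import time

# String values behind selenium's By.CSS_SELECTOR and By.XPATH.
CSS = "css selector"
XPATH = "xpath"
CURRENT_PAGE = "li[data-testid=current-page-item]"


def current_page_number(driver):
    """Read the highlighted page number, locating the element afresh each call."""
    return int(driver.find_element(CSS, CURRENT_PAGE).text)


def collect_pages(driver, timeout=10.0, poll=0.2):
    """Return the HTML of every page, following numbered links until none is left."""
    pages = []
    while True:
        page = current_page_number(driver)
        pages.append(driver.page_source)  # save before navigating away

        # find_elements returns an empty list instead of raising when nothing matches.
        links = driver.find_elements(XPATH, f'.//li[span = "{page + 1}"]')
        if not links:
            return pages
        links[0].click()

        # Poll until the pager shows the new number rather than sleeping a fixed time.
        deadline = time.monotonic() + timeout
        while current_page_number(driver) != page + 1:
            if time.monotonic() > deadline:
                raise TimeoutError(f"page {page + 1} did not load within {timeout}s")
            time.sleep(poll)
```

With a real driver this would be called as news_list = collect_pages(driver), and each entry could then be parsed with BeautifulSoup. If the pager is re-rendered while it is being read, the lookup inside the polling loop can still fail, so a production version would probably want to catch and retry there.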
    

  • Did you mean else: break rather than else: pass?
  • I tried both; neither works.
  • Hi, I tried this and got an error: TypeError: int() argument must be a string, a bytes-like object or a number, not 'FirefoxWebElement'
  • @AnnaDmitrieva Oh yes, I forgot the .text there. An'ka, prover' please :)
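
The TypeError in the comments comes from passing the WebElement itself to int(); what int() needs is the element's .text string. A small sketch of that conversion follows. The helper name page_number is hypothetical, and the digit check is an added assumption so that arrow items, which have no number in them, fail with a clear message.

```python
def page_number(element):
    """Convert a pager element's visible text to an int."""
    text = element.text.strip()
    if not text.isdigit():
        raise ValueError(f"expected a page number, got {text!r}")
    return int(text)
```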