Navigating with Selenium in Python

python, selenium, selenium-webdriver

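I am scraping this website with Python and Selenium, but at the moment it only scrapes the first 10 pages for July. The code reads the page number from the preceding sibling of the "next" button, converts it to int, and clicks "next" that many times minus 1, but it stops once it reaches page 10.

URL-

Can anyone help me scrape all of the pages?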

def pagination( driver ):
    data = []
    last_element = driver.find_element_by_xpath('//a[ contains( concat( " ", normalize-space( @class ), " "), " next ") ]/preceding-sibling::a[1]')
    if last_element is None:
        number_of_pages = 1
    else:
        number_of_pages = int( last_element.text )
    # data = [ getData( driver ) ]
    data.extend(getData(driver))
    for i in range(number_of_pages - 1):
        driver.find_element_by_xpath('//a[ contains( concat( " ", normalize-space( @class ), " "), " next ") ]').click()
        data.extend( getData( driver ) )
        time.sleep(1)
    return data

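The number of pages your code reads appears to be 10.

Find another way to work out how many pages there are.

You can use a while loop that checks whether the "next page" button is available: if it is, carry on; otherwise this is the last page.

Like this:

while next_button_element.is_displayed():
    # Do the action that is currently in the for loop
    ...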
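A fuller sketch of that loop, for illustration only. It assumes the "next" link can be matched with the CSS selector a.next, that get_data stands in for the question's getData helper, and that a fixed pause is enough for the next page to load. The names scrape_all_pages, get_data and pause are hypothetical. The string "css selector" is the value of Selenium's By.CSS_SELECTOR constant, written out so the sketch does not need to import Selenium.

```python
import time

# Locator strategy string; this is the value of selenium's By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def scrape_all_pages(driver, get_data, pause=1.0):
    """Collect data from every page by following the "next" link until it is gone."""
    data = []
    while True:
        data.extend(get_data(driver))
        # find_elements returns an empty list instead of raising when nothing matches
        next_links = driver.find_elements(CSS_SELECTOR, "a.next")
        if not next_links or not next_links[0].is_displayed():
            break  # no usable "next" link: this was the last page
        next_links[0].click()
        time.sleep(pause)  # crude fixed wait for the next page to load
    return data
```

Because the "next" link is looked up again on every iteration, the loop does not depend on the page count shown in the paginator, which is what stopped the original code at page 10.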

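Code you can use:

from selenium.common.exceptions import NoSuchElementException

data = []
while True:
    data.extend(getData(driver))
    try:
        driver.find_element_by_css_selector('a.next').click()
    except NoSuchElementException:
        # No "next" link left, so this was the last page
        break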

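Look, I know you took the logic for calculating the total number of pages from my answer to one of your previous questions. It worked in that case because the last page number was directly available to us, but that is not the case here.

Solution: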

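Although the number of pages is not directly available, the total number of entries is.

For July, that number is 174. Assuming the pagination length (the number of entries on a single page) is left at its default of 10, there should be 18 pages: 17 pages of 10 entries each, plus one more page for the remaining 4 entries.

So the logic for calculating the number of pages is straightforward. If you have somehow got the total number of entries into a total_entries variable, the number of pages is the integer quotient of total_entries divided by 10, plus 1.

In Python 2, the division operator returns the floor integer when both operands are ints, so 174/10 returns 17, and adding +1 gives 18. In Python 3, / returns a float (17.4), so use the floor-division operator // instead: 174//10 returns 17. Either way, the number of pages is 18.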
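One caveat with the quotient-plus-one formula is that it counts one page too many when the total is an exact multiple of the page size (170 entries would give 18 rather than 17). A minimal sketch of a ceiling-division alternative follows. The name page_count and the per_page default of 10 are assumptions for illustration, as is the choice to report one page for an empty listing.

```python
def page_count(total_entries, per_page=10):
    """Number of pages needed to show total_entries items, per_page at a time."""
    if total_entries <= 0:
        return 1  # assumption: an empty listing still renders a single page
    # Ceiling division using only integers: -(-a // b) rounds the quotient up
    return -(-total_entries // per_page)


print(page_count(174))               # 18: 17 full pages plus one page with the last 4 entries
print(page_count(170))               # 17: exact multiple, no extra page
print(174 // 10 + 1, 170 // 10 + 1)  # 18 18: quotient plus one, for comparison
```

With an exact multiple, a loop that clicks "next" number_of_pages - 1 times would try one click too many on the last page, so the rounding matters for the click loop as well as for the count.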

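Now, to extract the total number of entries, use the locator below to find the span element that contains it:

driver.find_element_by_xpath("//span[@class='showing']")

But this element contains text like "Showing 1-10 of 174". You only need the 174 part of the whole string. To get it, first extract the string after "of", then convert it to int.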

Algorithm for extracting the total number of entries from the text as an int:

import re

showing_text = driver.find_element_by_xpath("//span[@class='showing']").text   # "Showing 1-10 of 174"
number_of_entries_text = showing_text.split("of", 1)[1]                         # " 174" as text
number_of_entries = int(re.findall(r'\d+', number_of_entries_text)[0])          # 174 as int
number_of_pages = (number_of_entries // 10) + 1                                 # 18
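The split-then-findall steps can also be folded into one regular expression. The sketch below assumes the label always has the form "Showing 1-10 of 174". The function name total_entries_from_text is hypothetical, and the handling of thousands separators such as "1,234" is an assumption about how larger totals might be rendered.

```python
import re


def total_entries_from_text(showing_text):
    """Return the total from a label such as 'Showing 1-10 of 174'."""
    # Capture the digits (optionally with commas) that follow the word "of"
    match = re.search(r"of\s+(\d[\d,]*)", showing_text)
    if match is None:
        raise ValueError("unexpected label format: %r" % showing_text)
    return int(match.group(1).replace(",", ""))


print(total_entries_from_text("Showing 1-10 of 174"))  # 174
```

Raising on an unexpected label makes a layout change on the site fail loudly instead of silently producing a wrong page count.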
Final code:

import re
import time

def pagination( driver ):
    data = []
    # find_elements_* returns an empty list instead of raising when nothing matches
    showing_elements = driver.find_elements_by_xpath("//span[@class='showing']")
    if not showing_elements:
        number_of_pages = 1
    else:
        showing_text = showing_elements[0].text                                  # "Showing 1-10 of 174"
        number_of_entries_text = showing_text.split("of", 1)[1]
        number_of_entries = int( re.findall(r'\d+', number_of_entries_text)[0] )
        # Floor division keeps this an int in Python 3; note that this
        # overcounts by one when the total is an exact multiple of 10
        number_of_pages = (number_of_entries // 10) + 1

    # Collect the first page, then click "next" and collect for each remaining page
    data.extend( getData( driver ) )
    for i in range(number_of_pages - 1):
        driver.find_element_by_xpath('//a[ contains( concat( " ", normalize-space( @class ), " "), " next ") ]').click()
        time.sleep(1)
        data.extend( getData( driver ) )
    return data
Note:


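I think my solution is better because you do not have to check repeatedly whether an element is available, and you do not have to catch any exceptions. You get the number of pages directly and then click the "next" button the required number of times.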


Can you print the number of pages before the for loop? I suspect that, because you convert the text of the last element to int, it only shows 10 pages (even though there are more).

I just tested on your site; it only converts 10 to int, and the other pages are not reached from the link you gave [URL-. I only see 10 pages.

Did you check July? If you press page 10, more pages should appear.

You mean:

next_button_element = driver.find_element_by_xpath('//a[ contains( concat( " ", normalize-space( @class ), " "), " next ") ]')
while next_button_element.is_displayed():
    driver.find_element_by_xpath('//a[ contains( concat( " ", normalize-space( @class ), " "), " next ") ]').click()
    data.extend(getData(driver))
    time.sleep(1)