Python: I'm trying to scrape a website with infinite scroll


python selenium web-scraping


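This is what I tried in R, but I could not get it to scroll infinitely.

This is my first step in understanding infinite scroll using the Selenium package in Python. I know Python reasonably well, but I still tried some of the edits from a reference article.

This is the R code: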

library(rvest)
library(xml2)   # xml_find_first(), xml_find_all(), xml_text()
library(plyr)   # llply(), ldply()

# Search result pages to scrape (URLs kept on one line each so they stay valid)
uuu_df2 <- data.frame(x = c('http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=5-Lacs',
                            'http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=10-Lacs',
                            'http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=10-Lacs'),
                      stringsAsFactors = FALSE)

    urlList <- llply(uuu_df2[,1], function(url){     

      this_pg <- read_html(url)

      results_count <- this_pg %>% 
        xml_find_first(".//span[@id='resultCount']") %>% 
        xml_text() %>%
        as.integer()

      if(!is.na(results_count) && (results_count > 0)){

        cards <- this_pg %>% 
          xml_find_all('//div[@class="SRCard"]')

        df <- ldply(cards, .fun=function(x){
          y <- data.frame(wine = x %>% xml_find_first('.//span[@class="agentNameh"]') %>% xml_text(),
                          excerpt = x %>% xml_find_first('.//div[@class="postedOn"]') %>% xml_text(),
                          locality = x %>% xml_find_first('.//span[@class="localityFirst"]') %>% xml_text(),
                          society = x %>% xml_find_first('.//div[@class="labValu"]') %>% xml_text() %>% gsub('\\n', '', .))
          return(y)
        })

      } else {
        df <- NULL
      }

      return(df)   
    }, .progress = 'text')
    names(urlList) <- uuu_df2[,1]

But it gives me an error:

execfile(filename, namespace)
  File "C:\Users\user\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "D:/Deepesh/All files/test_forCSVData.py", line 27
    self.driver.execute_script(".//span[@class="agentNameh;")

Any suggestions on what I should change in my Python/R code so that it can scroll infinitely? Any help would be appreciated.
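For the infinite-scroll part, one common pattern is to keep scrolling to the bottom of the page until its height stops growing. The following is only a minimal sketch under assumptions: `scroll_to_bottom`, `pause` and `max_rounds` are hypothetical names chosen for illustration, the page is assumed to load more cards whenever the window reaches the bottom, and `driver` is assumed to be a Selenium WebDriver (anything with an `execute_script` method works).

```python
import time

# JavaScript snippets passed to Selenium's driver.execute_script().
SCROLL_JS = "window.scrollTo(0, document.body.scrollHeight);"
HEIGHT_JS = "return document.body.scrollHeight;"


def scroll_to_bottom(driver, pause=2.0, max_rounds=50):
    """Scroll down until the page height stops growing.

    Returns the number of scrolls performed. `max_rounds` is a safety cap
    so that a page which never stops growing cannot loop forever.
    """
    last_height = driver.execute_script(HEIGHT_JS)
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script(SCROLL_JS)
        time.sleep(pause)  # give the site time to load the next batch
        rounds += 1
        new_height = driver.execute_script(HEIGHT_JS)
        if new_height == last_height:
            break  # nothing new was loaded, assume we reached the end
        last_height = new_height
    return rounds


# Hypothetical usage with a real browser (requires the selenium package):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get(url)            # one of the search URLs from the question
#   scroll_to_bottom(driver)
#   html = driver.page_source  # now contains every card that was loaded
```

A fixed `pause` is the simplest choice but also the most fragile one; on a slow connection it may need to be larger, or replaced by an explicit wait for new elements to appear.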

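self.driver.execute_script(".//span[@class="agentNameh;")

There is something wrong with the argument to this function. You have three " characters here; I suspect you are missing a backslash to escape the middle one (or should pass a raw string). Or are you trying to concatenate? In that case you are missing a concatenation operator (and another ").

What do you think the "JavaScript" ".//span[@class="agentNameh;" should do? What do you expect? That is, of course, assuming you write it correctly, as ".//span[@class='agentNameh'];" or ".//span[@class=agentNameh]" or something else...

@Andersson It should give me the agent's name, which is under the property photo.

@deepesh, if you want to get the text value of an element with JavaScript, you should use something like 'return document.querySelector("span.agentName").childNodes[0].textContent'

It gave me an error:

exec(compile(f.read(), filename, 'exec'), namespace)
  File "D:/Deepesh/All files/test_forCSVData.py", line 27
    return document.querySelector("span.agentName").childNodes[0].textContent
    ^
SyntaxError: invalid character in identifier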
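Both errors in this thread look like quoting problems on the Python side. The first string is ended early by its inner double quote, and the second error suggests the JavaScript was pasted as bare Python code rather than inside a string (possibly with an invisible character copied along from the comment). A minimal sketch of how the pieces could be quoted is below. The helper names `agent_name_via_js` and `agent_name_via_xpath` are hypothetical, and the class name `agentNameh` is taken from the question, so it is an assumption about the page's markup.

```python
# Three equivalent ways to write the XPath without ending the Python
# string early at the inner quote.
xpath_outer_single = './/span[@class="agentNameh"]'
xpath_escaped = ".//span[@class=\"agentNameh\"]"
xpath_inner_single = ".//span[@class='agentNameh']"

# An XPath expression is not JavaScript, so it does not belong in
# execute_script(). JavaScript must be handed over as a Python string:
AGENT_NAME_JS = 'return document.querySelector("span.agentNameh").textContent;'


def agent_name_via_js(driver):
    """Run the JavaScript in the browser and return the stripped text."""
    text = driver.execute_script(AGENT_NAME_JS)
    return text.strip() if text is not None else None


def agent_name_via_xpath(driver):
    """Locate the element by XPath instead and read its text.

    "xpath" is the value of selenium's By.XPATH constant; it is written
    as a plain string here so the sketch does not need the import.
    """
    element = driver.find_element("xpath", "//span[@class='agentNameh']")
    return element.text.strip()
```

Either helper would be called with the live driver, for example `agent_name_via_xpath(self.driver)` in place of the failing line 27.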
