Python: Quora crawler using selenium and phantomjs can't get the stats info


python selenium xpath


I am trying to crawl Quora with Python, using selenium and phantomjs. I use the following code to crawl a question page:

# coding=utf-8

# Created by lruoran on 17-1-29

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException, TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

driver = webdriver.PhantomJS(r'/home/lruoran/software/phantomjs/bin/phantomjs')
driver.get('https://www.quora.com/Who-is-Roger-Federer')
cnt_answers = driver.find_element_by_xpath("//div[@class='answer_count']").text.encode('utf-8').strip().split()[0]
if cnt_answers[-1].isdigit():
    cnt_answers = int(cnt_answers)
else:
    cnt_answers = int(cnt_answers[:-1])
print("problem:{:s}".format(driver.find_element_by_xpath("//h1").text.encode('utf-8')))
print('the number of answers:{:d}'.format(cnt_answers))
try:
    print("the number of follow:{:s}".format(
        driver.find_element_by_xpath(r"//a[contains(@class,'FollowerListModalLink')]").text.encode('utf-8')))
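except NoSuchElementException:
    # find_element_by_xpath raises NoSuchElementException, not TimeoutException, when nothing matches
    print("follower link not found")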

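However, with the xpath //a[contains(@class,'FollowerListModalLink')] I cannot get the number of followers, even though XPath Helper finds the element when I test the same xpath.

The XPath Helper result is shown in the screenshot.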
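When an xpath matches in a desktop browser but not in the headless one, a quick check is whether the markup the driver actually received contains the class name at all. The snippet below is only a diagnostic sketch: `markup_contains` is a hypothetical helper name, and it relies on nothing more than the driver's `page_source` attribute.

```python
def markup_contains(driver, needle):
    """Return True if `needle` occurs in the HTML the driver currently holds.

    `driver` is any selenium WebDriver (for example the PhantomJS driver
    from the question); only its `page_source` attribute is read.
    """
    return needle in driver.page_source


# Usage with the driver from the question, after driver.get(...):
#   markup_contains(driver, "FollowerListModalLink")
# A False result means the link is missing from the served page itself,
# so no xpath can find it in that session.
```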

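I can see the number of followers in Firefox, but the question stats are not displayed in Chrome. The same problem may also occur in PhantomJS.

I have solved this problem myself. Some elements are visible only when you are logged in, so log in first to get all the elements on the question page.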
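A rough sketch of the log-in-first approach is given below. The post does not show the login code, so everything about the login form here is an assumption: `LOGIN_URL`, `EMAIL_XPATH`, `PASSWORD_XPATH` and `SUBMIT_XPATH` are placeholders to be replaced after inspecting the real page, the helper names are hypothetical, and the fixed pause after submitting is a crude stand-in for a proper wait. Only the follower xpath comes from the question.

```python
import time

# Placeholder locators: the real login markup is NOT given in the post,
# so inspect the page and replace these before use.
LOGIN_URL = "https://www.quora.com/"
EMAIL_XPATH = "//input[@name='email']"
PASSWORD_XPATH = "//input[@name='password']"
SUBMIT_XPATH = "//input[@type='submit']"
# Locator taken from the question itself.
FOLLOWER_XPATH = "//a[contains(@class,'FollowerListModalLink')]"


def wait_for(driver, xpath, timeout=10.0, poll=0.5):
    """Poll until `xpath` matches an element; return it, or None on timeout."""
    deadline = time.time() + timeout
    while True:
        try:
            return driver.find_element("xpath", xpath)
        except Exception:  # selenium raises NoSuchElementException here
            if time.time() >= deadline:
                return None
            time.sleep(poll)


def follower_text_after_login(driver, email, password, question_url,
                              settle_seconds=3.0, timeout=10.0):
    """Log in, open the question page, and return the follower link text.

    Returns None if the follower link is still missing after logging in.
    """
    driver.get(LOGIN_URL)
    email_box = wait_for(driver, EMAIL_XPATH, timeout)
    if email_box is None:
        raise RuntimeError("login form not found; check the placeholder xpaths")
    email_box.send_keys(email)
    driver.find_element("xpath", PASSWORD_XPATH).send_keys(password)
    driver.find_element("xpath", SUBMIT_XPATH).click()
    # Crude pause so the login request can finish before navigating away.
    time.sleep(settle_seconds)
    driver.get(question_url)
    link = wait_for(driver, FOLLOWER_XPATH, timeout)
    return link.text if link is not None else None


# Usage with the driver created in the question:
#   text = follower_text_after_login(
#       driver, "you@example.com", "your-password",
#       "https://www.quora.com/Who-is-Roger-Federer")
```

The function takes the driver as a parameter, so the same code can be tried with PhantomJS, Firefox or Chrome to compare what each one shows after logging in.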