Python 2.7: Finding and getting elements from a site with Selenium


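I need your help. I have a website that I need to extract information from. Example site:

I need to get the data from the spans with class inputField, but I have to sort the data by label. For example: if the span with class key is Type of Work, the data from the inputField span goes into var1; if the key is Application No., the inputField data goes into var2; and if the key is Date Lodged, the inputField data goes into var3. Code: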


But I don't know how to get the data from inputField.

I tried this in Java; you can use the same approach in Python.

You can get all the span elements with class=key and class=inputField, then iterate over them and pick out the information you are interested in.

import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

// The class name is arbitrary; it only makes the snippet compilable.
public class BaysideEnquiry {
    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();

        // Open the enquiry form first, then load the search results.
        driver.get("https://ecouncil.bayside.vic.gov.au/eservice/daEnquiryInit.do?docType=5&nodeNum=1118");
        driver.get("https://ecouncil.bayside.vic.gov.au/eservice/daEnquiry.do?number=&lodgeRangeType=on&dateFrom=01%2F09%2F2017&dateTo=30%2F09%2F2017&detDateFromString=&detDateToString=&streetName=&suburb=0&unitNum=&houseNum=0%0D%0A%09%09%09%09%09&planNumber=&strataPlan=&lotNumber=&propertyName=&searchMode=A&submitButton=Search");

        // Treated as parallel lists: the j-th key is assumed to label the j-th inputField.
        List<WebElement> keys = driver.findElements(By.xpath("//span[@class='key']"));
        List<WebElement> inputFields = driver.findElements(By.xpath("//span[@class='inputField']"));

        String var1, var2, var3;

        for (int j = 0; j < keys.size(); j++) {
            WebElement key = keys.get(j);
            System.out.println("key: " + key.getText());
            System.out.println("inputField: " + inputFields.get(j).getText());
            // Route the value to a variable according to its label.
            if (key.getText().equalsIgnoreCase("Type of Work")) {
                var1 = inputFields.get(j).getText();
                System.out.println("var1: " + var1);
            } else if (key.getText().equalsIgnoreCase("Application No.")) {
                var2 = inputFields.get(j).getText();
                System.out.println("var2: " + var2);
            } else if (key.getText().equalsIgnoreCase("Date Lodged")) {
                var3 = inputFields.get(j).getText();
                System.out.println("var3: " + var3);
            }

            System.out.println("------------------" + j + "------------------");

        }

        // Close the browser when done.
        driver.quit();
    }
}
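The same pairing idea can be sketched in Python. This is only an illustration: the function name collect_records and the rule "a repeated label starts a new record" are assumptions, not part of the answer above, and it relies on the key and inputField lists lining up one-to-one, as the Java code also assumes.

```python
def collect_records(keys, values, wanted):
    """Pair each key label with its value and group them into one dict per application."""
    records = []
    current = {}
    for key, value in zip(keys, values):
        key = key.strip()
        if key not in wanted:
            continue  # ignore labels we were not asked for
        if key in current:
            # Assumption: seeing a label again means the next application has started.
            records.append(current)
            current = {}
        current[key] = value.strip()
    if current:
        records.append(current)
    return records


# With Selenium 3 the two lists could be built like this (not executed here):
# keys = [e.text for e in driver.find_elements_by_xpath("//span[@class='key']")]
# values = [e.text for e in driver.find_elements_by_xpath("//span[@class='inputField']")]
# records = collect_records(keys, values, ("Type of Work", "Application No.", "Date Lodged"))
```

Collecting one dict per application avoids overwriting var1, var2 and var3 on every pass through the loop.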
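Your current logic won't work. What you want to do is get a count of the properties and then loop over each one. As you loop, grab the three items you are interested in for that property and store them in three variables (by the way, you really should use more descriptive names).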

Something like the following should work:

import scrapy
from selenium import webdriver


class MySpider(scrapy.Spider):
    title = []
    type = []
    name = 'Spider'
    # allowed_domains takes bare domain names, not URLs.
    allowed_domains = ['ecouncil.bayside.vic.gov.au']

    driver = webdriver.Chrome('C:/TEMP/Scrapy/chromedriver')

    # Open the enquiry form first, then load the search results.
    driver.get('https://ecouncil.bayside.vic.gov.au/eservice/daEnquiryInit.do?docType=5&nodeNum=1118')
    driver.get('https://ecouncil.bayside.vic.gov.au/eservice/daEnquiry.do?number=&lodgeRangeType=on&dateFrom=01%2F09%2F2017&dateTo=30%2F09%2F2017&detDateFromString=&detDateToString=&streetName=&suburb=0&unitNum=&houseNum=0%0D%0A%09%09%09%09%09&planNumber=&strataPlan=&lotNumber=&propertyName=&searchMode=A&submitButton=Search')

    # The a.plain_header links are used to count the results on the page.
    titles = driver.find_elements_by_css_selector('a.plain_header')
    # range(len(titles)) covers every result; range(0, len(titles) - 1) would skip the last one.
    for i in range(len(titles)):
        var1 = driver.find_elements_by_xpath("//span[@class='key'][.='Type of Work']/following-sibling::span[@class='inputField']")[i].text
        var2 = driver.find_elements_by_xpath("//span[@class='key'][.='Application No.']/following-sibling::span[@class='inputField']")[i].text
        var3 = driver.find_elements_by_xpath("//span[@class='key'][.='Date Lodged']/following-sibling::span[@class='inputField']")[i].text

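For ease of maintenance (and readability), you could turn the code in the last three lines into a function that takes a field name, e.g. Date Lodged, and returns the field value, e.g. 01/09/2017. I'll leave that as an exercise for you.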
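One possible take on that exercise is sketched below. The names field_xpath and get_field_values are made up for illustration, and the driver argument is assumed to be a Selenium 3 WebDriver like the one in the answer (on Selenium 4, find_elements(By.XPATH, ...) would take the place of find_elements_by_xpath).

```python
def field_xpath(label):
    """Build the XPath for the inputField span that follows a given key span."""
    # Assumption: the label contains no single quote, as with the three labels used above.
    return ("//span[@class='key'][.='%s']"
            "/following-sibling::span[@class='inputField']" % label)


def get_field_values(driver, label):
    """Return the text of every inputField whose key matches label, in page order."""
    elements = driver.find_elements_by_xpath(field_xpath(label))
    return [element.text for element in elements]
```

With this, each field needs only one query, for example dates = get_field_values(driver, 'Date Lodged'), after which dates[i] is the value for the i-th result.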

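Please read why. Paste the code and format it properly.
Sorry, I get TypeError: unsupported operand type(s) for -: 'builtin_function_or_method' and 'int' on for i in range(0, titles.count - 1).
I was using .count instead of len()... wrong language. Try it now.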
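For reference, that error arises because list.count is a method in Python, so titles.count - 1 tries to subtract an integer from a function object. A small sketch with a stand-in list (not real page data):

```python
titles = ["first", "second", "third"]  # stand-in for the list of WebElements

try:
    # titles.count is a bound method, not a number, so the subtraction fails.
    indices = range(0, titles.count - 1)
except TypeError as exc:
    message = str(exc)

# len() gives the number of items; range(len(titles)) visits every index.
indices = list(range(len(titles)))
```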