Java: How do I extract an HTML link from a table using Selenium?
I am using Java and Selenium with the following code:
public static void main(String[] args) {
    WebDriver driver;
    DesiredCapabilities caps;
    caps = new DesiredCapabilities();
    caps.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY,
            "lib/phantomjs.exe");
    caps.setBrowserName(DesiredCapabilities.phantomjs().getBrowserName());
    driver = new PhantomJSDriver(caps);
    driver.manage().timeouts().implicitlyWait(20, TimeUnit.SECONDS);
    driver.get("https://www.cdp.net/en-US/Pages/CDPAdvancedSearchResults.aspx?k=microsoft");
    WebElement element = driver.findElement(By.className("ms-vb2"));
    String text = element.getText();
    String href = element.getAttribute("href");
    driver.manage().deleteAllCookies();
    driver.quit();
    System.out.println(text + " " + href);
}
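The part of the page I am targeting with this code contains the markup below. I am trying to extract the href, https://www.cdp.net/en-US/Results/Pages/Company-Responses.aspx?company=11930, from the element with class ms-vb2: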
<td class="ms-vb2"><a href="https://www.cdp.net/en-US/Results/Pages/Company-Responses.aspx?company=11930">Microsoft Corporation</a><br/>USA</td>
I get the text, but not the href. How can I extract it?
driver.findElement(By.className("ms-vb2")) actually matches the td element:
<td class="ms-vb2"><a href="https://www.cdp.net/en-US/Results/Pages/Company-Responses.aspx?company=11930">Microsoft Corporation</a><br>USA</td>
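A td element has no href attribute, so getAttribute("href") returns null for it. The link you need is the a element inside the cell. Here we search for an a element that is a direct child of the element with the ms-vb2 class: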
driver.findElement(By.cssSelector(".ms-vb2 > a"))
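The difference between the two lookups can be checked without launching a browser. The following is a minimal sketch, not Selenium code: it parses the cell from the question with the JDK's built-in XML and XPath classes, used here only as a stand-in so the example runs anywhere. The class name HrefLookupSketch and the helper hrefOf are hypothetical, and the XPath //td[@class='ms-vb2']/a plays the role of the CSS selector .ms-vb2 > a.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class HrefLookupSketch {
    // The table cell from the question; it is well-formed, so an XML parser accepts it.
    static final String CELL =
        "<td class=\"ms-vb2\"><a href=\"https://www.cdp.net/en-US/Results/Pages/"
        + "Company-Responses.aspx?company=11930\">Microsoft Corporation</a><br/>USA</td>";

    // Hypothetical helper: returns the href attribute of the first element matched
    // by the XPath expression, or null when nothing matches or href is absent.
    static String hrefOf(String markup, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(markup)));
            Element el = (Element) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODE);
            return (el != null && el.hasAttribute("href")) ? el.getAttribute("href") : null;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse or query markup", e);
        }
    }

    public static void main(String[] args) {
        // Matching the cell itself: td carries no href, so this prints "null".
        System.out.println(hrefOf(CELL, "//td[@class='ms-vb2']"));
        // Matching the a element that is a direct child of the cell prints the URL.
        System.out.println(hrefOf(CELL, "//td[@class='ms-vb2']/a"));
    }
}
```

In the Selenium code from the question, the corresponding change would be to locate the element with By.cssSelector(".ms-vb2 > a") instead of By.className("ms-vb2") and then call getAttribute("href") on the result as before.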