Clicking a link by regexp in Python Selenium


I have been looking around for a way to click a link in Selenium that matches a regexp.

Here is the code that works:

from selenium import selenium
sel = selenium("localhost", 4444, "*chrome", "http://www.ncbi.nlm.nih.gov/")
sel.start()
sel.open('/pubmed')
sel.type("search_term", "20032207[uid]")
sel.click("search")
sel.click("linkout-icon-unknown-vir_full")

However, if I search for a different ID, the link text will be different, although it always matches the regexp

linkout-icon[\w-_]*
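To make the intent of that pattern concrete, here is a minimal sketch using Python's own re module rather than Selenium. The second and third ids are made up for contrast; only the first comes from the page. I write the character class as [\w-] because \w already includes the underscore.

```python
import re

# The first id is from the page in question; the other two are hypothetical.
ids = [
    "linkout-icon-unknown-vir_full",
    "linkout-icon-example_journal",  # hypothetical id for another search result
    "search",                        # unrelated id, should not match
]

# \w covers letters, digits and "_", so [\w-] adds only the hyphen.
pattern = re.compile(r"linkout-icon[\w-]*")

# fullmatch() requires the whole id to match, not just a prefix of it.
matching = [i for i in ids if pattern.fullmatch(i)]
print(matching)
```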

But I can't seem to find the right command to click a link that matches the regexp. I tried:

sel.click('link=regex:linkout-icon[\w-_]*')
sel.click('regex:linkout-icon[\w-_]*')
sel.click('link=regexp:linkout-icon[\w-_]*')
sel.click('regexp:linkout-icon[\w-_]*')

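But none of them seem to work. Any suggestions?

EDIT:

So, following the comments on the answer below: the item being clicked is actually an image with id=linkout-icon-unknown-vir_full. The full markup is: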

<a href="http://vir.sgmjournals.org/cgi/pmidlookup?view=long&amp;pmid=20032207" ref="PrId=3051&amp;itool=Abstract-def&amp;uid=20032207&amp;nlmid=0077340&amp;db=pubmed&amp;log$=linkouticon" target="_blank"><img alt="Click here to read" id="linkout-icon-unknown-vir_full" border="0" src="http://www.ncbi.nlm.nih.gov/corehtml/query/egifs/http:--highwire.stanford.edu-icons-externalservices-pubmed-standard-vir_full.gif" /></a>

In case you are wondering, I got the code from the Selenium IDE recorder.

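I think you're close. First, regexp: is the correct text pattern for saying that you want to use a regular expression.

The other thing that is probably not quite right is link=, because that refers to the text of the link, i.e.:

<a href="path/to/mylink">Text of the link, this is what will be searched</a>

So, which part of the anchor do you want to apply the regular expression to? The href?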
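To illustrate why link= has nothing to match here, this is a small sketch using only Python's standard html.parser on a trimmed copy of the anchor from the question (the ref and src attributes are omitted). The class name AnchorInspector is my own, purely for illustration.

```python
from html.parser import HTMLParser

# Trimmed copy of the anchor from the question (ref and src attributes omitted).
SNIPPET = (
    '<a href="http://vir.sgmjournals.org/cgi/pmidlookup?view=long&amp;pmid=20032207" '
    'target="_blank">'
    '<img alt="Click here to read" id="linkout-icon-unknown-vir_full" border="0" />'
    '</a>'
)


class AnchorInspector(HTMLParser):
    """Record the visible text of <a> elements and the ids of images inside them."""

    def __init__(self):
        super().__init__()
        self.in_anchor = False
        self.link_text = ""
        self.img_ids = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_anchor = True
        elif tag == "img" and self.in_anchor:
            self.img_ids.append(dict(attrs).get("id"))

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        if self.in_anchor:
            self.link_text += data


inspector = AnchorInspector()
inspector.feed(SNIPPET)
inspector.close()

print(repr(inspector.link_text))  # the anchor has no text of its own
print(inspector.img_ids)          # the id lives on the <img>, not on the <a>
```

Since the anchor wraps only an image, a locator based on the link text has nothing to search, while the id on the image is available.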

What might lead to the right answer: perhaps re-use a get function to search all a.href attributes for the regexp, return the XPath of each match, and then feed that to click().
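A rough sketch of that idea, under some assumptions: the page source is already available as a string (for example from sel.get_html_source()), hrefs contain no double quotes, and the helper names HrefCollector and xpaths_for_matching_hrefs are invented for this example. It uses only the standard library, so it is not tied to any Selenium call.

```python
import re
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def xpaths_for_matching_hrefs(html, pattern):
    """Return one XPath per <a> whose href matches the regular expression."""
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    regex = re.compile(pattern)
    # Assumption: hrefs with a double quote are skipped, since they would
    # break the simple "..." quoting used in the XPath below.
    return [
        '//a[@href="%s"]' % href
        for href in collector.hrefs
        if regex.search(href) and '"' not in href
    ]


# Small stand-in for the real page source.
SAMPLE = (
    '<a href="http://vir.sgmjournals.org/cgi/pmidlookup?view=long&amp;pmid=20032207">x</a>'
    '<a href="/pubmed/help">Help</a>'
)

result = xpaths_for_matching_hrefs(SAMPLE, r"pmidlookup\?.*pmid=\d+")
print(result)
```

Each returned string could then be tried as the argument to sel.click(), which accepts XPath locators.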

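sel.click can take an XPath as an argument. Using the XPath that I found (I believe) points to the "linkout-icon-unknown-vir_full" link:

Using the above command takes me to

I couldn't get matches to work properly (I don't know why), but this seems to work: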

After some hacking around, I came up with what is probably the dumbest way to do it, but it works until someone can give me a better answer:

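import re
# Take the first id-like string in the page source that matches the pattern.
# Raw string so that \w reaches the regex engine unchanged.
val = re.findall(r'linkout-icon-unknown[\w-]*', sel.get_html_source())[0]
sel.click(val)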

It requires me to search the entire HTML, which could cause problems if the page design changes.

I would love to see a more robust method.
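One possible way to make this a little less fragile, sketched with the standard library only: match the pattern against actual id attributes instead of the raw source, and return None rather than raising IndexError when nothing matches. IdCollector and first_matching_id are hypothetical names for this example.

```python
import re
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect the id attribute of every element, in document order."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def first_matching_id(html, pattern):
    """Return the first element id that fully matches pattern, or None."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    regex = re.compile(pattern)
    for element_id in collector.ids:
        if regex.fullmatch(element_id):
            return element_id
    return None


# Stand-in page: the prose mentions a similar string before the real id appears.
PAGE = (
    '<p>see linkout-icon-unknown-text in prose</p>'
    '<img id="linkout-icon-unknown-vir_full" alt="Click here to read" />'
)

found = first_matching_id(PAGE, r"linkout-icon-unknown[\w-]*")
print(found)
```

With a live session the result could be passed on as sel.click("id=" + found) after checking that it is not None. A plain re.findall over the raw source would have picked up the string in the prose first.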

Good idea, but I need a regexp, because I will be getting these from a list of searches. For different searches, I need to match different links.

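import selenium  # Selenium RC Python client

sel = selenium.selenium("localhost", 4444, "*firefox", "http://www.ncbi.nlm.nih.gov/")
sel.start()
sel.open('/pubmed')
sel.type("search_term", "20032207[uid]")
sel.click("search")
# Wait for the results page before looking for the icon.
sel.wait_for_page_to_load(30000)
# Click the first element whose id contains "linkout".
sel.click("//*[contains(@id,'linkout')]")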
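XPath 1.0 has no regular-expression functions, so contains() and starts-with() are the closest approximations to the pattern. As a small sketch, a helper like the following (xpath_for_id_prefix is a made-up name) builds a slightly stricter locator that anchors on the start of the id rather than any substring.

```python
def xpath_for_id_prefix(prefix, tag="*"):
    """Build an XPath matching elements whose id starts with prefix."""
    # Assumption: the prefix contains no single quote, because the
    # expression below wraps it in '...'.
    if "'" in prefix:
        raise ValueError("prefix must not contain a single quote")
    return "//%s[starts-with(@id,'%s')]" % (tag, prefix)


any_element = xpath_for_id_prefix("linkout-icon")
images_only = xpath_for_id_prefix("linkout-icon", tag="img")
print(any_element)
print(images_only)
```

Either string could be passed to sel.click() in place of the contains() expression.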
