Python: how do I get the link text with XPath when <a> wraps another element?
The example below mirrors data similar to what I am working with (company policy prevents me from showing live data). The snippets are taken from two existing answers. My goal is to extract the text of each <a> element as well as the link itself.
from lxml import html
post1 = """<p><code>Integer.parseInt</code> <em>couldn't</em> do the job, unless you were happy to lose data. Think about what you're asking for here.</p>

<p>Try <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#parseLong%28java.lang.String%29"><code>Long.parseLong(String)</code></a> or <a href="http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#BigInteger%28java.lang.String%29"><code>new BigInteger(String)</code></a> for really big integers.</p>

"""
post2 = """
<p><code>Integer.parseInt</code> <em>couldn't</em> do the job, unless you were happy to lose data. Think about what you're asking for here.</p>

<p>Try <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#parseLong%28java.lang.String%29"><code>Long.parseLong(String)</code></a> or <a href="http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#BigInteger%28java.lang.String%29"><code>new BigInteger(String)</code></a> for really big integers.</p>

"""
doc = html.fromstring(post1)
for link in doc.xpath('//a'):
    print(link.text, link.get('href'))
Unfortunately, this returns the following:
None http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#parseLong%28java.lang.String%29
None http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#BigInteger%28java.lang.String%29
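Note that link.text is None for both links. This is because each link wraps a <code> block, so the <a> element has no text of its own.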
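A minimal sketch of why this happens, assuming lxml's usual text model: an element's .text holds only the text that appears before its first child, so the visible label ends up on the nested <code> element. The one-link snippets and the example.com URLs here are hypothetical, not the original posts.

```python
from lxml import html

# Hypothetical one-link snippets for illustration (not the original posts).
wrapped = html.fromstring(
    '<p>Try <a href="http://example.com/a"><code>Long.parseLong(String)</code></a></p>'
)
plain = html.fromstring('<p>See <a href="http://example.com/b">PROJ.4</a></p>')

wrapped_link = wrapped.xpath('//a')[0]
plain_link = plain.xpath('//a')[0]

# Nothing precedes the first child (<code>), so .text is None.
print(wrapped_link.text)
# The label is stored on the <code> child instead.
print(wrapped_link[0].text)
# A plain link keeps its label directly in .text.
print(plain_link.text)
```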
If I use post2, it returns the correct result:
PROJ.4 http://trac.osgeo.org/proj/
OpenSceneGraph http://www.openscenegraph.org/
How can I modify the loop so that it handles both plain links (post2) and links that wrap another element (post1)?
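Change the print line in the loop so that it reads the text of the whole <a> subtree instead of only link.text, for example with lxml's text_content() method.
The output then becomes: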
Long.parseLong(String) http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#parseLong%28java.lang.String%29
new BigInteger(String) http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#BigInteger%28java.lang.String%29
This gives the required result for both post1 and post2:
Long.parseLong(String) http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#parseLong%28java.lang.String%29
new BigInteger(String) http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#BigInteger%28java.lang.String%29
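One way to write such a loop is sketched below, assuming lxml.html's text_content() method, which concatenates the text of an element and all of its descendants. The helper name extract_links, the sample fragment, and the example.com URLs are hypothetical and only there to make the sketch runnable.

```python
from lxml import html


def extract_links(fragment):
    """Return (text, href) pairs for every <a> in an HTML fragment."""
    doc = html.fromstring(fragment)
    # text_content() gathers text from the element and all descendants,
    # so it works whether or not the link wraps another element.
    return [(link.text_content(), link.get('href')) for link in doc.xpath('//a')]


# Hypothetical fragment mixing a wrapped link and a plain link.
sample = (
    '<p>Try <a href="http://example.com/a"><code>Long.parseLong(String)</code></a> '
    'or <a href="http://example.com/b">PROJ.4</a>.</p>'
)

for text, href in extract_links(sample):
    print(text, href)
```

If only the XPath side should change, link.xpath('string()') is a comparable option, since the XPath string() function also returns the concatenated text of a node's descendants.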