Python: using XPath when an <a> wraps another element?


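The example below reflects data similar to what I am working with (company policy prevents me from showing live data). It was taken from two existing answers.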

My goal is to extract the text of each <a> element as well as the link itself.

from lxml import html

post1 = """<p><code>Integer.parseInt</code> <em>couldn't</em> do the job, unless you were happy to lose data. Think about what you're asking for here.</p>&#xA;&#xA;<p>Try <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#parseLong%28java.lang.String%29"><code>Long.parseLong(String)</code></a> or <a href="http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#BigInteger%28java.lang.String%29"><code>new BigInteger(String)</code></a> for really big integers.</p>&#xA;
"""

post2 = """
<p><code>Integer.parseInt</code> <em>couldn't</em> do the job, unless you were happy to lose data. Think about what you're asking for here.</p>&#xA;&#xA;<p>Try <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#parseLong%28java.lang.String%29"><code>Long.parseLong(String)</code></a> or <a href="http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#BigInteger%28java.lang.String%29"><code>new BigInteger(String)</code></a> for really big integers.</p>&#xA;
"""
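
# Parse the post body and print each link's text and target
doc = html.fromstring(post1)
for link in doc.xpath('//a'):
    # link.text holds only the text that precedes the element's first child
    print(link.text, link.get('href'))

Unfortunately, this returns the following result: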

None http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#parseLong%28java.lang.String%29
None http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#BigInteger%28java.lang.String%29
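Note that link.text is None. This is because each link wraps a <code> element, so the visible text belongs to the <code> child rather than to the <a> itself.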
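
To see why link.text comes back as None, here is a minimal sketch of how lxml stores text. The two fragments and their example.com URLs are shortened, hypothetical stand-ins for the links in post1 and post2, not the real data.

```python
from lxml import html

# Hypothetical, shortened fragments: one link wrapping <code>, one plain link.
wrapped = html.fromstring(
    '<a href="http://example.com/wrapped"><code>Long.parseLong(String)</code></a>'
)
plain = html.fromstring('<a href="http://example.com/plain">PROJ.4</a>')

# .text is only the text between the opening tag and the first child element.
print(wrapped.text)     # None: nothing precedes the <code> child
print(wrapped[0].text)  # Long.parseLong(String): the text lives on the child
print(plain.text)       # PROJ.4: no child element, so the text is on the <a>
```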

If I use post2 instead, it returns the correct result:

PROJ.4 http://trac.osgeo.org/proj/
OpenSceneGraph http://www.openscenegraph.org/
How can I modify the loop so that it handles both standard links (post2) and links that wrap another element (post1)?

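One option is to call text_content() instead of reading link.text. It returns the text of the element together with the text of all its descendants:

print(link.text_content(), link.get('href'))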

Your output will then be:

Long.parseLong(String) http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#parseLong%28java.lang.String%29
new BigInteger(String) http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#BigInteger%28java.lang.String%29
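
As a rough end-to-end sketch, the loop can be wrapped in a small helper and run against both kinds of link. The helper name extract_links, the shortened fragments, and the example.com URLs are assumptions made for illustration. Only the PROJ.4 and OpenSceneGraph addresses come from the output shown in the question.

```python
from lxml import html


def extract_links(fragment):
    """Return (text, href) pairs for every <a> in an HTML fragment."""
    doc = html.fromstring(fragment)
    # text_content() gathers the text of the element and all of its
    # descendants, so wrapped and plain links are handled the same way.
    return [(link.text_content(), link.get('href')) for link in doc.xpath('//a')]


# Hypothetical stand-in for post1: links that wrap a <code> element.
wrapped_links = (
    '<p>Try <a href="http://example.com/long"><code>Long.parseLong(String)</code></a> '
    'or <a href="http://example.com/big"><code>new BigInteger(String)</code></a>.</p>'
)
# Hypothetical stand-in for post2: links that contain plain text only.
plain_links = (
    '<p>See <a href="http://trac.osgeo.org/proj/">PROJ.4</a> and '
    '<a href="http://www.openscenegraph.org/">OpenSceneGraph</a>.</p>'
)

for text, href in extract_links(wrapped_links) + extract_links(plain_links):
    print(text, href)
```

Returning a list of pairs rather than printing inside the helper keeps the extraction easy to reuse and to check.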
As requested, this works for both post1 and post2:

Long.parseLong(String) http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#parseLong%28java.lang.String%29
new BigInteger(String) http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#BigInteger%28java.lang.String%29
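
If you would rather stay within XPath, a possible alternative is the XPath string() function, which evaluates to the concatenated text of an element and its descendants. This is only a sketch and not necessarily what the answers above used. The fragment and its example.com URLs are hypothetical.

```python
from lxml import html

# Hypothetical fragment with one wrapped link and one plain link.
doc = html.fromstring(
    '<p>Try <a href="http://example.com/long"><code>Long.parseLong(String)</code></a> '
    'or <a href="http://example.com/plain">a plain link</a>.</p>'
)

pairs = []
for link in doc.xpath('//a'):
    # string() is evaluated relative to the current <a> element.
    pairs.append((link.xpath('string()'), link.get('href')))

for text, href in pairs:
    print(text, href)
```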