Selecting multiple values with Python and XPath (Python, XPath, lxml) - Fatal编程技术网

I can select single values with XPath in Python without any problem, but how can I join several single XPath queries to get one combined value? Here is a sample snippet of the HTML source, r.content:

First XPath: acc = webContent.xpath('//span/a[contains(@href,"account/view-profile")]/text()') Result: ['konterA', 'mardok']

Second XPath: twitch = webContent.xpath('//span/@data-twitch-user') Result: ['mardok_tv']

Third XPath: lastOnline = webContent.xpath('//span/@data-time') Result: ['2017-02-20T22:37:42Z', '2017-02-19T11:28:20Z']

How can I combine these three to get a result like this: ['konterA', '2017-02-20T22:37:42Z'], ['mardok', 'mardok_tv', '2017-02-19T11:28:20Z'].

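Let's call them first_list, second_list and third_list. Modify the second list so that it has the same length as the other two:

After that, do the following:

list(zip(first_list, second_list, third_list))

This gives you a list of tuples in the same order:

[('konterA','','2017-02-20T22:37:42Z'),('mardok','mardok_tv','2017-02-19T11:28:20Z')]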
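
To see why the second list has to be adjusted first, here is a small sketch using the three result lists from the question. The padded list padded_second is written by hand and is only an assumption for illustration (it presumes we already know that konterA has no Twitch user); it is not necessarily how the answer built it.

```python
# Result lists exactly as reported in the question
first_list = ['konterA', 'mardok']
second_list = ['mardok_tv']
third_list = ['2017-02-20T22:37:42Z', '2017-02-19T11:28:20Z']

# zip() stops at the shortest input, so the unpadded lists produce a single
# tuple, and it pairs mardok_tv with the wrong account
misaligned = list(zip(first_list, second_list, third_list))
print(misaligned)
# [('konterA', 'mardok_tv', '2017-02-20T22:37:42Z')]

# ASSUMPTION: hand-written padding, one entry per account,
# with '' where no Twitch user exists
padded_second = ['', 'mardok_tv']
aligned = list(zip(first_list, padded_second, third_list))
print(aligned)
# [('konterA', '', '2017-02-20T22:37:42Z'), ('mardok', 'mardok_tv', '2017-02-19T11:28:20Z')]
```

The flat lists alone do not record which account a Twitch name belongs to, so the padding position has to come from somewhere else, such as the document structure itself.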

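Consider parsing all the items together under the same parent by iterating over a top-level XPath. If an attribute or element value does not exist, use XPath's concat() to return a zero-length string. The code below also uses XPath's normalize-space() to strip line breaks and carriage returns from the values.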

from lxml import html

# PARSING POSTED SNIPPET AS STRING
# (htmlstr holds the HTML source from the question, e.g. r.content)
webContent = html.fromstring(htmlstr)

# INITIALIZING LISTS
acc = []; twitch = []; lastOnline = []

# ITERATING THROUGH THE NESTED <SPAN> (FIRST <SPAN> CHILD OF EACH PARENT <SPAN>)
for i in webContent.xpath("//span/span[1]"):
    # concat(..., '') yields '' instead of an empty result when the node is missing
    acc.append(i.xpath("concat(normalize-space(a[contains(@href,'account/view-profile')]),'')"))
    twitch.append(i.xpath("concat(@data-twitch-user, '')"))
    lastOnline.append(i.xpath("concat(../@data-time, '')"))

# ZIP EQUAL LENGTH LISTS
xpath_list = list(zip(acc, twitch, lastOnline))

print(xpath_list)
# [('konterA', '', '2017-02-20T22:37:42Z'), ('mardok', 'mardok_tv', '2017-02-19T11:28:20Z')]
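
For readers without lxml, the same per-parent idea can be sketched with the standard library's xml.etree.ElementTree. This is a swapped-in technique, not the answer's method: ElementTree's limited XPath subset has no concat(), normalize-space() or contains(), so the empty-string defaults and the href filter are handled in Python instead. The SAMPLE markup and the helper name extract_rows are assumptions made for illustration; the markup is reconstructed from the XPath expressions in this thread and has to be well-formed XML for ET.fromstring to accept it.

```python
import xml.etree.ElementTree as ET

# ASSUMED markup, reconstructed from the XPath expressions in the thread;
# the real page may be structured differently
SAMPLE = """
<div>
  <span data-time="2017-02-20T22:37:42Z">
    <span><a href="/account/view-profile/konterA">konterA</a></span>
  </span>
  <span data-time="2017-02-19T11:28:20Z">
    <span data-twitch-user="mardok_tv">
      <a href="/account/view-profile/mardok">
        mardok
      </a>
    </span>
  </span>
</div>
"""


def extract_rows(markup):
    """Return one (account, twitch_user, last_online) tuple per outer <span>."""
    root = ET.fromstring(markup)
    rows = []
    for outer in root.iterfind("span"):       # each record's parent <span>
        inner = outer.find("span")            # first nested <span>, if any
        if inner is None:
            continue
        name = ""
        for link in inner.iterfind("a"):
            # ElementTree has no contains(), so filter the href in Python
            if "account/view-profile" in link.get("href", ""):
                # collapse whitespace, similar in effect to normalize-space()
                name = " ".join((link.text or "").split())
                break
        # .get(..., "") plays the role of concat(@attr, '') for missing attributes
        rows.append((name,
                     inner.get("data-twitch-user", ""),
                     outer.get("data-time", "")))
    return rows


rows = extract_rows(SAMPLE)
print(rows)
# [('konterA', '', '2017-02-20T22:37:42Z'), ('mardok', 'mardok_tv', '2017-02-19T11:28:20Z')]
```

Because every tuple is built while visiting one parent element, a missing attribute becomes an empty string in the right position and no separate padding step is needed.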

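I can't simply join the lists together, because the values in this list are completely different, not just the "_tv" part.

@mastaBot, then how do you know where the word should go? For example, if there were foo instead of mardok_tv, what should the output be?

I don't know; in that case I need to do it as in @Parfait's example.

Thank you very much, this is exactly what I needed. I tried to get it in many ways, but I never considered iterfind. Thanks!

Actually, you don't need iterfind, because XPath will do. Please see the update.

Of course... one of those moments. iterfind comes from the built-in etree, but it iterates over elements just like lxml's xpath.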