
No nested nodes. How to get one piece of information and then get the additional information that belongs to it?

Tags: python, xpath, lxml

For the markup below, I need to get each date and, for that date, its times + hrefs + formats + ... (not shown).

<div class="showtimes">
    <h2>The Little Prince</h2>

    <div class="poster" data-poster-url="http://www.test.com">
        <img src="http://www.test.com">
    </div>

    <div class="showstimes">

        <div class="date">9 December, Wednesday</div>
        <span class="show-time techno-3d">
            <a href="http://www.test.com" class="link">12:30</a>
            <span class="show-format">3D</span>
        </span>

        <span class="show-time techno-3d">
            <a href="http://www.test.com" class="link">15:30</a>
            <span class="show-format">3D</span>
        </span>

        <span class="show-time techno-3d">
            <a href="http://www.test.com" class="link">18:30</a>
            <span class="show-format">3D</span>
        </span>


        <div class="date">10 December, Thursday</div>
        <span class="show-time techno-2d">
            <a href="http://www.test.com" class="link">12:30</a>
            <span class="show-format">2D</span>         
        </span>

        <span class="show-time techno-3d">
            <a href="http://www.test.com" class="link">15:30</a>
            <span class="show-format">3D</span>
        </span>
    </div>
</div>
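Getting the dates is not a problem, but I can't work out how to get the rest of the information for each specific date. I tried many different approaches with no luck (some are in comments). I can't find a way to handle the case where the nodes I need follow one another (at the same level?) instead of being nested. In this case: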

-> div Date1
-> span Time1
-> span href1
-> span Format1

-> span Time2
-> span href2
-> span Format2

-> span Time3
-> span href3
-> span Format3

-> div Date2
-> span Time1
-> span href1
-> span Format1
# etc etc

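It turns out that lxml supports referencing Python variables in XPath expressions, which is very useful in this case. For each date div, you can select the following-sibling span elements whose nearest preceding-sibling div is the current date div. The reference to the current date div is held in the Python variable dates and passed to the expression as $current: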

# `movie` is assumed to be the parsed <div class="showtimes"> element
for dates in movie.xpath('.//div[@class="showstimes"]/div[@class="date"]'):
    date = dates.xpath('normalize-space()')
    # $current is bound to the element passed as the `current` keyword argument.
    # XPath 1.0 compares node-sets with `=` by string value, so this relies on
    # each date div having distinct text.
    for times in dates.xpath('following-sibling::span[preceding-sibling::div[1]=$current]', current=dates):
        time = times.xpath('a/text()')[0]
        url = times.xpath('a/@href')[0]
        format_type = times.xpath('span/text()')[0]
        print(date, time, url, format_type)
Output:

9 December, Wednesday 12:30 http://www.test.com 3D
9 December, Wednesday 15:30 http://www.test.com 3D
9 December, Wednesday 18:30 http://www.test.com 3D
10 December, Thursday 12:30 http://www.test.com 2D
10 December, Thursday 15:30 http://www.test.com 3D
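
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The shortened PAGE string and the helper name showtimes_by_date are assumptions made for illustration; only the XPath expression and the variable-binding mechanism come from the answer above.

```python
from lxml import html

# Shortened copy of the markup from the question (assumption: poster block omitted).
PAGE = """
<div class="showtimes">
    <h2>The Little Prince</h2>
    <div class="showstimes">
        <div class="date">9 December, Wednesday</div>
        <span class="show-time techno-3d">
            <a href="http://www.test.com" class="link">12:30</a>
            <span class="show-format">3D</span>
        </span>
        <span class="show-time techno-3d">
            <a href="http://www.test.com" class="link">15:30</a>
            <span class="show-format">3D</span>
        </span>
        <div class="date">10 December, Thursday</div>
        <span class="show-time techno-2d">
            <a href="http://www.test.com" class="link">12:30</a>
            <span class="show-format">2D</span>
        </span>
    </div>
</div>
"""


def showtimes_by_date(movie):
    """Return (date, time, url, format) tuples, one per show-time span."""
    rows = []
    for date_div in movie.xpath('.//div[@class="showstimes"]/div[@class="date"]'):
        date = date_div.xpath('normalize-space()')
        # Bind the current element to the XPath variable $current.
        spans = date_div.xpath(
            'following-sibling::span[preceding-sibling::div[1] = $current]',
            current=date_div,
        )
        for span in spans:
            rows.append((
                date,
                span.xpath('a/text()')[0],
                span.xpath('a/@href')[0],
                span.xpath('span/text()')[0],
            ))
    return rows


movie = html.fromstring(PAGE)
rows = showtimes_by_date(movie)
for row in rows:
    print(*row)
```

Returning tuples instead of printing inside the loop makes it easy to feed the rows into a CSV writer or group them into a dict keyed by date.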

Comments:
Didn't know about the variables feature in lxml. Thanks!
Neither did I, I just got lucky and stumbled on it. You're welcome :)
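
One caveat worth sketching: in XPath 1.0, `=` between two node-sets compares string values, so two date divs with identical text would not be told apart by the expression above. A possible alternative, assuming the same flat layout, is to walk the siblings with lxml's itersiblings() and stop at the next date div. The helper name spans_until_next_date and the SNIPPET markup are hypothetical and not part of the original answer.

```python
from lxml import html

# Hypothetical markup with two date divs that share the same text.
SNIPPET = (
    '<div class="showstimes">'
    '<div class="date">Friday</div>'
    '<span class="show-time"><a href="http://www.test.com">12:30</a></span>'
    '<div class="date">Friday</div>'
    '<span class="show-time"><a href="http://www.test.com">18:30</a></span>'
    '</div>'
)


def spans_until_next_date(date_div):
    """Collect the span siblings after date_div, stopping at the next date div."""
    spans = []
    for sibling in date_div.itersiblings():
        if sibling.tag == "div" and sibling.get("class") == "date":
            break  # the next date block starts here
        if sibling.tag == "span":
            spans.append(sibling)
    return spans


root = html.fromstring(SNIPPET)
times = [
    [span.xpath('a/text()')[0] for span in spans_until_next_date(date_div)]
    for date_div in root.xpath('div[@class="date"]')
]
print(times)
```

This works on element identity and position rather than text, at the cost of doing the grouping in Python instead of in a single XPath expression.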