Python 2.7: using lxml XPath to extract content

Tags: python-2.7, lxml, lxml.html

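The code below extracts the P/E ratio from the Reuters link below. However, my approach is not robust: the page for another stock has two fewer rows, which shifts the data. How can I get around this problem? I would like to point directly at the P/E part of the page and extract the data from there, but I don't know how. Link 1: Link 2: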

This is the only part I want the code to extract, so that changes elsewhere on the page do not affect it:

    <tr class="stripe">
        <td>P/E Ratio (TTM)</td>
        <td class="data">36.79</td>
        <td class="data">25.99</td>
        <td class="data">21.70</td>
    </tr>

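Find the first td by its text, then extract the td siblings that follow it:

    treea.xpath('//td[contains(.,"P/E Ratio")]/following-sibling::td/text()')

Either way, this works for both pages: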

# Assumes earlier cells (not shown) ran: import requests; from lxml import html
In [8]: page2 = requests.get('http://www.reuters.com/finance/stocks/financialHighlights?symbol=MYEG.KL')

In [9]: treea = html.fromstring(page2.content)

In [10]: tree4 = treea.xpath('//td[contains(.,"P/E Ratio")]/following-sibling::td/text()')

In [11]: print(tree4)
['36.79', '25.99', '21.41']

In [12]: page2 = requests.get('http://www.reuters.com/finance/stocks/financialHighlights?symbol=ANNJ.KL')
In [13]: treea = html.fromstring(page2.content)

In [14]: tree4 = treea.xpath('//td[contains(.,"P/E Ratio")]/following-sibling::td/text()')

In [15]: print(tree4)
['--', '25.49', '17.30']
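
To try the expression without hitting the network, here is a minimal sketch that runs the same XPath against the HTML fragment from the question. The wrapping `<table>`, the extra "Some other row" row, and the helper name `extract_row` are assumptions added for illustration; they are not part of the Reuters page or the original answer. The label is passed as an XPath variable (`$label`), which lxml binds from a keyword argument.

```python
from lxml import html

# Fragment from the question. The <table> wrapper and the first row are
# hypothetical additions so the example has something to skip over.
SNIPPET = """
<table>
  <tr class="stripe">
    <td>Some other row</td>
    <td class="data">1.00</td>
  </tr>
  <tr class="stripe">
    <td>P/E Ratio (TTM)</td>
    <td class="data">36.79</td>
    <td class="data">25.99</td>
    <td class="data">21.70</td>
  </tr>
</table>
"""


def extract_row(tree, label):
    """Return the text of every <td> that follows the <td> containing `label`."""
    # $label is an XPath variable; lxml fills it in from the keyword argument.
    return tree.xpath(
        '//td[contains(., $label)]/following-sibling::td/text()',
        label=label,
    )


tree = html.fromstring(SNIPPET)
pe_values = extract_row(tree, "P/E Ratio")
print(pe_values)  # ['36.79', '25.99', '21.70']
```

Because the row is located by its label rather than by position, rows added or removed elsewhere in the table do not shift the result. A label that is not present simply yields an empty list.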
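
The XPath returns strings, and the ANNJ.KL output contains '--' where no value is available. As a rough sketch, the strings could be converted to numbers like this. The helper name `to_float` is hypothetical, and stripping commas assumes that larger figures might use thousands separators, which the pages shown here do not confirm.

```python
def to_float(cell):
    """Convert one table cell to a float, or None when it is not a number."""
    # Assumption: commas, if any, are thousands separators.
    text = cell.strip().replace(",", "")
    try:
        return float(text)
    except ValueError:
        # Placeholders such as '--' end up here.
        return None


values = ['--', '25.49', '17.30']
parsed = [to_float(v) for v in values]
print(parsed)  # [None, 25.49, 17.3]
```

Mapping the placeholder to None keeps the list positions aligned with the table columns, so the caller can still tell which column was missing.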