Python, lxml: retrieving all elements in a list


I am trying to get all the elements of a list from a website.

Given the following HTML snippet:

<ul>
    <li class="name"> James </li>
    <li> Male </li>
    <li> 5'8" </li>
</ul>
Evaluating the XPath below against this snippet prints

[' James ', ' Male ', ' 5\'8" ']

The XPath

//li[../li[@class="name" and position()=1]]/text()

means

//li               # all li elements
[                  # whose
  ..               # parent
  /                # has a child
  li               # li element
  [                # whose
    @class="name"  # class attribute equals "name"
    and            # and
    position()=1   # which is the first li child of that parent
  ]
]
/text()            # return the text nodes of those li elements
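
The expression explained above can be tried out directly. This is a minimal sketch, assuming the snippet is available as a string; the variable names `snippet` and `items` are illustrative and not taken from the original answer.

```python
from lxml import html

# The HTML snippet from the question, held in a string (assumption: in
# practice it would come from a downloaded page).
snippet = '''<ul>
    <li class="name"> James </li>
    <li> Male </li>
    <li> 5'8" </li>
</ul>'''

tree = html.fromstring(snippet)

# Select every li whose parent's first li child has class="name",
# then return the text node of each selected li.
items = tree.xpath('//li[../li[@class="name" and position()=1]]/text()')
print(items)
```

The text nodes come back with their surrounding whitespace intact; calling `str.strip()` on each item is one way to clean them up if needed.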
Alternatively, loop over each matching ul and collect the text of its li children:

from lxml import html

text = '''<ul>
    <li class="name"> James </li>
    <li> Male </li>
    <li> 5'8" </li>
</ul>
<ul>
    <li class="name"> James </li>
    <li> Male </li>
    <li> 5'8" </li>
</ul>
<ul>
    <li class="name"> James </li>
    <li> Male </li>
    <li> 5'8" </li>
</ul>'''

tree = html.fromstring(text)
# Loop over every ul that has an li child with class="name"
for ul in tree.xpath('//ul[li[@class="name"]]'):
    # Collect the text of all li children of this ul
    print(ul.xpath("li/text()"))

Prints

[' James ', ' Male ', ' 5\'8" ']
[' James ', ' Male ', ' 5\'8" ']
[' James ', ' Male ', ' 5\'8" ']
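
The two approaches differ once a page holds several lists: the single XPath returns one flat list, while the per-ul loop keeps each list's items together. The sketch below illustrates this under stated assumptions: the second person ("Anna") and the unrelated third list are made-up sample data, and the names `page`, `flat` and `grouped` are hypothetical.

```python
from lxml import html

# Hypothetical page: two lists that start with a class="name" item and one
# unrelated list that should be ignored. The values are invented for
# illustration only.
page = '''<div>
<ul>
    <li class="name"> James </li>
    <li> Male </li>
</ul>
<ul>
    <li class="name"> Anna </li>
    <li> Female </li>
</ul>
<ul>
    <li> unrelated </li>
</ul>
</div>'''

tree = html.fromstring(page)

# Single expression: every matching li text in document order, in one list.
flat = tree.xpath('//li[../li[@class="name" and position()=1]]/text()')

# Per-ul loop: one inner list per ul that contains a class="name" item.
grouped = [ul.xpath('li/text()') for ul in tree.xpath('//ul[li[@class="name"]]')]

print(flat)
print(grouped)
```

If each list represents one record, the grouped form is probably the more convenient starting point, since the boundaries between records are preserved.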