Python 从具有属性的特定嵌套节点提取文本
我正在尝试编写XPath,它将在Python 从具有属性的特定嵌套节点提取文本,python,xpath,Python,Xpath,我正在尝试编写XPath,它将在div[@class=“content”]下选择、和标记,但使用p[position()>1和position()1和position()
div[@class=“content”]
下选择、和
标记,但使用p[position()>1和position()
到目前为止,我有这个
//div[@class="content"]/*[self::h3 or self::ul or self::p[position() > 1 and position() < last() - 1]]//text()
//div[@class=“content”]/*[self::h3或self::ul或self::p[position()>1和position()
但它不起作用
这是HTML:好的,您的XML格式不正确,所以我先修复了这个问题
<?xml version="1.0" encoding="UTF-8"?>
<div class="content">
<h1/>
<h2>
<p>Certified Nursing Assistant - Full Time</p>
Job Summary</h2>
<p>Responsible for providing personal care and assistance for residents in long
term care facility.</p>
<h2>
</h2>
<h3>Essential Functions:</h3>
<ul>
<li>
<span style="line-height: 1.5;">Responsible</span> for providing
personal care and assistance to residents </li>
<li>Assist residents in and out of bed, dressing, feeding, grooming and
personal hygiene. </li>
<li>Provide basic treatments as required and directed by nursing staff.
</li>
<li>Responsible for observing and reporting changes in residents' physical
and emotional conditions to charge nurse. </li>
</ul>
<h3>Qualifications: </h3>
<p>Education:</p>
<ul>
<li>High school diploma or equivalent </li>
<li>Successful completion of state approved certified nursing assistance
course </li>
</ul>
<p>Experience:</p>
<ul>
<li>Previous health care related experience preferred </li>
</ul>
<a id="ctl00_ctl01_namelink" class="btn" href="employment-application.aspx?
positionid=34">Apply Online</a>
<br/>
<br/>
<h2>
Apply in Person</h2>
<p>
To apply in persion please stop by Shenandoah Medical Center to pick up a job
application.</p>
<h2>
Apply by Mail</h2>
<p>
To apply by mail, download and print <a target="_blank" href="/filesimages/Careers/SMC
Employment Application.pdf">
this form</a>. Please fill out the application and then mail to:<br/>
<br/>
<strong>Shenandoah Medical Center, Human Resources<br/>
</strong>300 Pershing Avenue<br/>
Shenandoah, IA 51601</p>
</div>
注册护士助理-全职
工作总结
负责为长期居住的居民提供个人护理和帮助
长期护理设施
基本功能:
-
负责提供
对居民的个人护理和援助
- 协助住院医师上下床、穿衣、喂食、梳洗和护理
个人卫生李>
- 按照护理人员的要求和指示提供基本治疗。
- 负责观察和报告居民身体状况的变化
以及护士长的情绪状况李>
资格:
教育:
- 高中文凭或同等学历
- 成功完成国家批准的认证护理援助
课程
经验:
- 有医疗保健相关经验者优先
亲自申请
如需在波斯申请,请前往Shenandoah医疗中心领取工作
应用程序
邮寄申请
如需邮寄申请,请下载并打印。请填写申请表,然后邮寄至:
Shenandoah医疗中心,人力资源部
300潘兴大道
伊利诺伊州谢南多阿市,邮编:51601
现在,如果我正确理解您的问题,您将查找所有h3、ul和p标记,它们是div[@class=“content”]的子节点,并且每个选定的子节点必须满足条件[position()>1和position()//div[@class="content"]/h3[position() > 1 and position() < last() - 1] |
//div[@class="content"]/p[position() > 1 and position() < last() - 1] |
//div[@class="content"]/ul[position() > 1 and position() < last() - 1]
//div[@class=“content”]/h3[position()>1和position()1和position()1和position()
I在firebug+firetath
中工作。您是否尝试导入lxml.html