Accessing alternative attributes in a node from ElementTree in Python
I have the following XML file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE MedlineCitationSet PUBLIC "-//NLM//DTD Medline Citation, 1st January, 2014//EN"
"http://www.nlm.nih.gov/databases/dtd/nlmmedlinecitationset_140101.dtd">
<MedlineCitationSet>
<MedlineCitation Owner="NLM" Status="MEDLINE">
<PMID Version="1">15326085</PMID>
<Article PubModel="Print-Electronic">
<Journal>
<JournalIssue CitedMedium="Internet">
<Volume>44</Volume>
<Issue>4</Issue>
<PubDate>
<Year>2004</Year>
<Month>Oct</Month>
</PubDate>
</JournalIssue>
<Title>Hypertension</Title>
<ISOAbbreviation>Hypertension</ISOAbbreviation>
</Journal>
<ArticleTitle>Arterial pressure lowering effect of chronic atenolol therapy in hypertension and vasoconstrictor sympathetic drive.</ArticleTitle>
<Pagination>
<MedlinePgn>454-8</MedlinePgn>
</Pagination>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Burns</LastName>
<ForeName>Joanna</ForeName>
<Initials>J</Initials>
<Affiliation>Department of Cardiology, Leeds Teaching Hospitals NHS Trust, Leeds, UK. burnsjoanna1@hotmail.com</Affiliation>
</Author>
<Author ValidYN="Y">
<LastName>Mary</LastName>
<ForeName>David A S G</ForeName>
<Initials>DA</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Mackintosh</LastName>
<ForeName>Alan F</ForeName>
<Initials>AF</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Ball</LastName>
<ForeName>Stephen G</ForeName>
<Initials>SG</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Greenwood</LastName>
<ForeName>John P</ForeName>
<Initials>JP</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<ArticleDate DateType="Electronic">
<Year>2004</Year>
<Month>08</Month>
<Day>23</Day>
</ArticleDate>
</Article>
</MedlineCitation>
<MedlineCitation Owner="NLM" Status="In-Data-Review">
<PMID Version="1">24096967</PMID>
<Article PubModel="Print-Electronic">
<Journal>
<JournalIssue CitedMedium="Internet">
<Volume>31</Volume>
<Issue>3</Issue>
<PubDate>
<Year>2014</Year>
<Month>Mar</Month>
</PubDate>
</JournalIssue>
<Title>Pharmaceutical research</Title>
<ISOAbbreviation>Pharm. Res.</ISOAbbreviation>
</Journal>
<ArticleTitle>Semi-mechanistic Modelling of the Analgesic Effect of Gabapentin in the Formalin-Induced Rat Model of Experimental Pain.</ArticleTitle>
<Pagination>
<MedlinePgn>593-606</MedlinePgn>
</Pagination>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Taneja</LastName>
<ForeName>A</ForeName>
<Initials>A</Initials>
<Affiliation>Division of Pharmacology, Leiden Academic Centre for Drug Research, POBox 9502, 2300 RA, Leiden, The Netherlands.</Affiliation>
</Author>
<Author ValidYN="Y">
<LastName>Troconiz</LastName>
<ForeName>I F</ForeName>
<Initials>IF</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Danhof</LastName>
<ForeName>M</ForeName>
<Initials>M</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Della Pasqua</LastName>
<ForeName>O</ForeName>
<Initials>O</Initials>
</Author>
<Author ValidYN="Y">
<CollectiveName>neuropathic pain project of the PKPD modelling platform</CollectiveName>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType>Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2013</Year>
<Month>10</Month>
<Day>05</Day>
</ArticleDate>
</Article>
</MedlineCitation>
</MedlineCitationSet>
But why does this code fail to capture the CollectiveName in the second entry?
#!/usr/bin/env python
import xml.etree.ElementTree as ET

def parse_xml(xmlfile):
    """docstring for parse_xml"""
    tree = ET.parse(xmlfile)
    root = tree.getroot()
    for medcit in root.findall('MedlineCitation'):
        pmid = medcit.find('PMID').text
        authors = medcit.find('Article/AuthorList/')
        lnlist = []
        for auth in authors:
            lastname = auth.find('LastName').text.encode('utf8')
            colcname = auth.find('CollectiveName').text
            if lastname is not None:
                lnlist.append(lastname)
            elif colcname is not None:
                lnlist.append(colcname)
        print pmid, ",".join(lnlist)

parse_xml('myfile.xml')
The code above produces the following output:
Traceback (most recent call last):
  File "test.py", line 70, in <module>
    parse_xml(fvar)
  File "test.py", line 49, in parse_xml
    colcname = auth.find('CollectiveName').text
AttributeError: 'NoneType' object has no attribute 'text'
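Only grab the text if the node was actually found. `find()` returns `None` when an element has no matching child, so calling `.text` on the result unconditionally raises the `AttributeError` shown above: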
for auth in authors:
    lastname = auth.find('LastName')
    if lastname is not None:
        lnlist.append(lastname.text.encode('utf8'))
    else:
        colcname = auth.find('CollectiveName')
        if colcname is not None:
            lnlist.append(colcname.text)
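For reference, here is a minimal self-contained sketch of the same idea for Python 3. It is not the original poster's code: it uses `findtext()`, which returns `None` rather than raising when the child element is missing. The `SAMPLE` string is a trimmed-down stand-in for the file in the question, and the `author_names` helper is a hypothetical name introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample in the same shape as the question's file.
SAMPLE = """<MedlineCitationSet>
  <MedlineCitation>
    <PMID Version="1">15326085</PMID>
    <Article>
      <AuthorList CompleteYN="Y">
        <Author ValidYN="Y"><LastName>Burns</LastName></Author>
        <Author ValidYN="Y"><LastName>Mary</LastName></Author>
      </AuthorList>
    </Article>
  </MedlineCitation>
  <MedlineCitation>
    <PMID Version="1">24096967</PMID>
    <Article>
      <AuthorList CompleteYN="Y">
        <Author ValidYN="Y"><LastName>Taneja</LastName></Author>
        <Author ValidYN="Y"><CollectiveName>neuropathic pain project of the PKPD modelling platform</CollectiveName></Author>
      </AuthorList>
    </Article>
  </MedlineCitation>
</MedlineCitationSet>"""


def author_names(root):
    """Return a list of (pmid, [names]) pairs, one per MedlineCitation."""
    results = []
    for medcit in root.findall('MedlineCitation'):
        pmid = medcit.findtext('PMID')
        names = []
        for auth in medcit.findall('Article/AuthorList/Author'):
            # findtext() gives None instead of raising when the child is absent.
            name = auth.findtext('LastName')
            if name is None:
                # Fall back to the collective name for group authors.
                name = auth.findtext('CollectiveName')
            if name is not None:
                names.append(name)
        results.append((pmid, names))
    return results


if __name__ == '__main__':
    for pmid, names in author_names(ET.fromstring(SAMPLE)):
        print(pmid, ",".join(names))
```

To run this against a real file instead of the inline sample, pass `ET.parse('myfile.xml').getroot()` to `author_names`. The `.encode('utf8')` call is dropped here because Python 3 strings are already Unicode, and joining `bytes` with a `str` separator would fail.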