Creating an array of values from a specific element in XML using Python

Tags: python, xml, anaconda

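I have an XML file containing many elements. I want to create a list/array containing all the values of the elements with a specific name, in my example pair:ApplicationNumber.

I have looked at many other questions but could not find an answer. I know I could do this by loading the file as text and inspecting it with pandas, but I am sure there is a better way.

I have tried ElementTree and xml.dom (via minidom) without success.

My code currently looks like this: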

import os
from xml.dom import minidom
WindowsUser = os.getenv('username')
XMLPath = os.path.join('C:\\Users', WindowsUser, 'Downloads', 'ApplicationsByCustomerNumber.xml')
xmldoc = minidom.parse(XMLPath)
itemlist = xmldoc.getElementsByTagName('pair:ApplicationNumber')
for s in itemlist:
    print(s.attributes['pair:ApplicationNumber'].value)

The sample XML file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<pair:PatentApplicationList xsi:schemaLocation="urn:us:gov:uspto:pair PatentApplicationList.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:pair="urn:us:gov:uspto:pair">
    <pair:FileHeader>
            <pair:FileCreationTimeStamp>2017-07-10T10:52:12.12</pair:FileCreationTimeStamp>
    </pair:FileHeader>
    <pair:ApplicationStatusData>
        <pair:ApplicationNumber>62383607</pair:ApplicationNumber>
        <pair:ApplicationStatusCode>20</pair:ApplicationStatusCode>
        <pair:ApplicationStatusText>Application Dispatched from Preexam, Not Yet Docketed</pair:ApplicationStatusText>
        <pair:ApplicationStatusDate>2016-09-16</pair:ApplicationStatusDate>
        <pair:AttorneyDocketNumber>1354-T-02-US</pair:AttorneyDocketNumber>
        <pair:FilingDate>2016-09-06</pair:FilingDate>
        <pair:LastModifiedTimestamp>2017-05-30T21:40:37.37</pair:LastModifiedTimestamp>
        <pair:CustomerNumber>122761</pair:CustomerNumber><pair:LastFileHistoryTransaction>
            <pair:LastTransactionDate>2017-05-30</pair:LastTransactionDate>
            <pair:LastTransactionDescription>Email Notification</pair:LastTransactionDescription> </pair:LastFileHistoryTransaction> 
        <pair:ImageAvailabilityIndicator>true</pair:ImageAvailabilityIndicator> 
    </pair:ApplicationStatusData>
    <pair:ApplicationStatusData>
        <pair:ApplicationNumber>62292372</pair:ApplicationNumber>
        <pair:ApplicationStatusCode>160</pair:ApplicationStatusCode>
        <pair:ApplicationStatusText>Abandoned  --  Incomplete Application (Pre-examination)</pair:ApplicationStatusText>
        <pair:ApplicationStatusDate>2016-11-01</pair:ApplicationStatusDate>
        <pair:AttorneyDocketNumber>681-S-23-US</pair:AttorneyDocketNumber>
        <pair:FilingDate>2016-02-08</pair:FilingDate>
        <pair:LastModifiedTimestamp>2017-06-20T21:59:26.26</pair:LastModifiedTimestamp>
        <pair:CustomerNumber>122761</pair:CustomerNumber><pair:LastFileHistoryTransaction>
            <pair:LastTransactionDate>2017-06-20</pair:LastTransactionDate>
            <pair:LastTransactionDescription>Petition Entered</pair:LastTransactionDescription> </pair:LastFileHistoryTransaction> 
        <pair:ImageAvailabilityIndicator>true</pair:ImageAvailabilityIndicator> 
    </pair:ApplicationStatusData>
    <pair:ApplicationStatusData>
        <pair:ApplicationNumber>62289245</pair:ApplicationNumber>
        <pair:ApplicationStatusCode>160</pair:ApplicationStatusCode>
        <pair:ApplicationStatusText>Abandoned  --  Incomplete Application (Pre-examination)</pair:ApplicationStatusText>
        <pair:ApplicationStatusDate>2016-10-26</pair:ApplicationStatusDate>
        <pair:AttorneyDocketNumber>1526-P-01-US</pair:AttorneyDocketNumber>
        <pair:FilingDate>2016-01-31</pair:FilingDate>
        <pair:LastModifiedTimestamp>2017-06-15T21:24:13.13</pair:LastModifiedTimestamp>
        <pair:CustomerNumber>122761</pair:CustomerNumber><pair:LastFileHistoryTransaction>
            <pair:LastTransactionDate>2017-06-15</pair:LastTransactionDate>
            <pair:LastTransactionDescription>Petition Entered</pair:LastTransactionDescription> </pair:LastFileHistoryTransaction> 
        <pair:ImageAvailabilityIndicator>true</pair:ImageAvailabilityIndicator> 
    </pair:ApplicationStatusData>
</pair:PatentApplicationList>

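The XML in your example expands the pair: part of each tag according to the namespace you declared (xmlns:pair="urn:us:gov:uspto:pair"), so in ElementTree a tag does not match 'pair:ApplicationNumber', even though it looks like it should.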
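
To make the expansion visible, here is a minimal sketch that prints the tag names as ElementTree stores them. The SAMPLE string is a trimmed-down, hypothetical stand-in for the file in the question; only the namespace URI and element names are taken from it.

```python
from xml.etree import ElementTree

# Hypothetical, trimmed-down sample using the same namespace as the question's file.
SAMPLE = (
    '<pair:PatentApplicationList xmlns:pair="urn:us:gov:uspto:pair">'
    '<pair:ApplicationStatusData>'
    '<pair:ApplicationNumber>62383607</pair:ApplicationNumber>'
    '</pair:ApplicationStatusData>'
    '</pair:PatentApplicationList>'
)

root = ElementTree.fromstring(SAMPLE)

# Collect every tag name as ElementTree stores it: "{namespace-uri}localname".
tags = [element.tag for element in root.iter()]
print(tags)

# Iterating with the literal prefix finds nothing, because no stored tag
# contains the text "pair:".
prefixed_matches = list(root.iter('pair:ApplicationNumber'))
print(len(prefixed_matches))
```

Each stored tag starts with {urn:us:gov:uspto:pair}, which is why a comparison against the prefixed name comes back empty.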

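I extracted the application numbers using ElementTree as shown below. The examples use a local copy of the XML file rather than the full path from your code.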

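Example 1:

from xml.etree import ElementTree

# Parse the local copy of the file and get the root element
tree = ElementTree.parse('ApplicationsByCustomerNumber.xml')
root = tree.getroot()

# Tags carry the expanded namespace, so match on a substring of the tag name
for item in root:
    if 'ApplicationStatusData' in item.tag:
        for child in item:
            if 'ApplicationNumber' in child.tag:
                print(child.text)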
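
The substring test above would also match any other tag whose name merely contains "ApplicationNumber". As a possible refinement, the sketch below compares the local name exactly. The local_name helper and the pair:ParentApplicationNumber element are hypothetical and are not part of the original file; they exist only to illustrate the difference.

```python
from xml.etree import ElementTree

# Hypothetical sample: the ParentApplicationNumber element is invented here
# to show a tag that a substring test would wrongly pick up.
SAMPLE = (
    '<pair:PatentApplicationList xmlns:pair="urn:us:gov:uspto:pair">'
    '<pair:ApplicationStatusData>'
    '<pair:ApplicationNumber>62383607</pair:ApplicationNumber>'
    '<pair:ParentApplicationNumber>not-a-match</pair:ParentApplicationNumber>'
    '</pair:ApplicationStatusData>'
    '<pair:ApplicationStatusData>'
    '<pair:ApplicationNumber>62292372</pair:ApplicationNumber>'
    '</pair:ApplicationStatusData>'
    '</pair:PatentApplicationList>'
)


def local_name(tag):
    """Return the tag without its '{namespace-uri}' part, if it has one."""
    return tag.rsplit('}', 1)[-1]


root = ElementTree.fromstring(SAMPLE)

numbers = []
for item in root:
    if local_name(item.tag) == 'ApplicationStatusData':
        for child in item:
            # Exact comparison, so ParentApplicationNumber is skipped.
            if local_name(child.tag) == 'ApplicationNumber':
                numbers.append(child.text)

print(numbers)
```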

Example 2:

from xml.etree import ElementTree

tree = ElementTree.parse('ApplicationsByCustomerNumber.xml')
root = tree.getroot()

# Match on the fully qualified tag name: {namespace-uri}localname
for item in root.iter('{urn:us:gov:uspto:pair}ApplicationStatusData'):
    for child in item.iter('{urn:us:gov:uspto:pair}ApplicationNumber'):
        print(child.text)
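
Since the question asks for a list rather than printed output, one way to collect the values is findall with a prefix-to-URI mapping. This is a sketch under assumptions: the application_numbers function and the NAMESPACES name are hypothetical, and the inline SAMPLE stands in for the real file.

```python
from xml.etree import ElementTree

# Maps the prefix used in the search path to the namespace URI from the file.
NAMESPACES = {'pair': 'urn:us:gov:uspto:pair'}

# Trimmed-down stand-in for ApplicationsByCustomerNumber.xml.
SAMPLE = (
    '<pair:PatentApplicationList xmlns:pair="urn:us:gov:uspto:pair">'
    '<pair:ApplicationStatusData>'
    '<pair:ApplicationNumber>62383607</pair:ApplicationNumber>'
    '</pair:ApplicationStatusData>'
    '<pair:ApplicationStatusData>'
    '<pair:ApplicationNumber>62292372</pair:ApplicationNumber>'
    '</pair:ApplicationStatusData>'
    '<pair:ApplicationStatusData>'
    '<pair:ApplicationNumber>62289245</pair:ApplicationNumber>'
    '</pair:ApplicationStatusData>'
    '</pair:PatentApplicationList>'
)


def application_numbers(root):
    """Return the text of every pair:ApplicationNumber element under root."""
    found = root.findall('.//pair:ApplicationNumber', NAMESPACES)
    return [element.text for element in found]


root = ElementTree.fromstring(SAMPLE)
numbers = application_numbers(root)
print(numbers)
```

For the real file, the root would come from ElementTree.parse('ApplicationsByCustomerNumber.xml').getroot() instead of fromstring.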

Hope this helps.