Parsing a complex XML file in PHP using DOM


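I want to parse an XML file and extract information from it. For example, I want to extract the following:

  • uio, batchId and creationDate from the header
  • all the accountToken, Id, setId, Amount, etc. from the body
  • batchCount and totalAmount from the footer

This is my XML file: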

<?xml version="1.0" encoding="UTF-8"?>
<c:Instructions xmlns:c="http://www.localhost.com/platform">
  <c:Header uio="a881-aa05-1231391408a2" batchId="c7-8ef6-eb81b345e736" creationDate="2014-08-10T00:00:00.000Z" />
  <c:Instructions accountToken="0001578066518896635248066746078163233357907196" Id="4178- a6dd-d1459cda71c3" setId="132530196846" Amount="27.00" Description="GoulSalons and Spas" Timestamp="2014-08-10T05:37:56.000Z" TransactionId="1324300196883" TransactionTimestamp="2014-08-07T18:32:30.000Z" merchant="1307" consumer="1_4f13eb-4efb-b450- ca747763fbc4" store="363" campaign="Partner, Parnd Spas, Partner, Pilot, 5/30/14" />
  <c:Instructions accountToken="000227229359641325887385737985006" Id="-08eb-43dd-884b-ccae980372f8" setId="2271109667569" Amount="12.24" Description="Pyro's Pi" Timestamp="2014-08-10T03:00:05.000Z" TransactionId="291153267592" TransactionTimestamp="2014-08-07T00:00:00.000Z" merchant="13" consumer="0d3-4ef3-8922-932f0d860012" store="31" campaign=" Challenge Pyro&amp;#39;s Partner, Pilot, 4/4/14" />
  <c:Instructions accountToken="0002108430726669005078952425" Id="bf48-4f86-84f6-df69432ef65b" setId="1211100232621" Amount="26.95" Description="Blue" Timestamp="2014-08-10T05:37:20.000Z" TransactionId="121030232642" TransactionTimestamp="2014-08-07T17:48:29.000Z" merchant="104880" consumer="2-4d32-a2b4-f0b54a8e50b5" store="39" campaign="Partner Challenge Blue Fin, Pilot, 5/30/14" />
  <c:Instructions accountToken="000341863769868297728447318744937673" Id="bf48-4f86-84f6-df69432ef65b" setId="1260320211819" Amount="52.00" Description="Fin" Timestamp="2014-08-10T05:37:41.000Z" TransactionId="1259211836" TransactionTimestamp="2014-08-08T02:41:47.000Z" merchant="180" consumer="6be4-46cd-95b8-244ab78c50ce" store="52" campaign="Partner Challenge Blue Fin, Partner, Pilot, 5/30/14" />
  <c:Instructions accountToken="000521692104031759552776822005" Id="42f0-4850-9e33-54e7d79927d9" setId="29126329667269" Amount="17.00" Description=" Bear" Timestamp="2014-08-10T03:00:05.000Z" TransactionId="291259667289" TransactionTimestamp="2014-08-08T00:00:00.000Z" merchant="137" consumer="71bb-46d2-8e42-c9798d7dd0d7" store="39" campaign="Partner Challenge Blind Bear, Partner, Pilot, 5/22/14" />
  <c:Instructions accountToken="0005216177101271759552776822005" Id="42f0-4850-9e33-54e7d79927d9" setId="29134327117182" Amount="9.00" Description="Bear" Timestamp="2014-08-10T03:00:05.000Z" TransactionId="29124667297" TransactionTimestamp="2014-08-08T00:00:00.000Z" merchant="132" consumer="71bb-46d2-8e42-c9798d7dd0d7" store="398" campaign="   Bear, Partner, Pilot, 5" />
  <c:Footer batchCount="6" totalAmount="144" />
</c:Instructions>

However, I cannot get any information out of the XML file.

Shouldn't that be Instructions rather than Instruction in the getElementsByTagName calls?

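The XML document you have is a bit odd, because the top-level element has the same name as its direct children. For that reason, I would use DOMXPath to retrieve the elements you want from the document (OK, and XPath is awesome!). To use DOMXPath, you create a new DOMXPath object and register the namespace, http://www.localhost.com/platform, so that it can search for elements in that namespace.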

Note: your script doesn't work because there are no Instruction elements in the document; they are all Instructions :)
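To make the note above concrete, here is a minimal sketch of what each lookup finds. The XML is a shortened, made-up sample with the same shape as the posted file, and the element name Instruction is an assumption about what the original script searched for (the script itself is not shown). It also illustrates why XPath is convenient here: a tag-name search for Instructions matches the root element as well as its children.

```php
<?php
# Hypothetical, shortened sample with the same structure as the posted document.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<c:Instructions xmlns:c="http://www.localhost.com/platform">
  <c:Header uio="u-1" batchId="b-1" creationDate="2014-08-10T00:00:00.000Z" />
  <c:Instructions accountToken="0001" Id="id-1" setId="100" Amount="27.00" />
  <c:Instructions accountToken="0002" Id="id-2" setId="200" Amount="12.24" />
  <c:Footer batchCount="2" totalAmount="39" />
</c:Instructions>
XML;

$ns = 'http://www.localhost.com/platform';

$doc = new DOMDocument;
$doc->loadXML($xml);

# Assumed name from the failing script: nothing is called "Instruction",
# so this node list is empty.
$missing = $doc->getElementsByTagNameNS($ns, 'Instruction');

# "Instructions" matches the root element AND its same-named children.
$all = $doc->getElementsByTagNameNS($ns, 'Instructions');

# An XPath expression can select only the children of the root.
$xp = new DOMXPath($doc);
$xp->registerNamespace('c', $ns);
$children = $xp->query('/c:Instructions/c:Instructions');

print "Instruction: " . $missing->length . PHP_EOL;
print "Instructions (tag-name search): " . $all->length . PHP_EOL;
print "Instructions (XPath, children only): " . $children->length . PHP_EOL;
```

With this sample the three counts differ: the misspelled name finds nothing, the tag-name search includes the root, and the XPath query returns just the two child records.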

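Here is a simple, easily extended script that extracts the data from the document you posted. It just prints the data, but you will probably want to do something fancier with it.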

$doc = new DOMDocument;
# load() expects a file path or URL; use loadXML() instead if you have the XML as a string
$doc->load( $your_xml_here );

# create the DOMXPath object
$xp = new DOMXPath($doc);
# registers the namespace; to search for nodes in this namespace, prefix them with "c"
$xp->registerNamespace("c", 'http://www.localhost.com/platform');

# search for all c:Header nodes under the top node, c:Instructions
foreach ($xp->query("/c:Instructions/c:Header") as $h) {
    # array of attributes to retrieve...
    foreach (array('uio', 'batchId', 'creationDate') as $ha) {
        print "header attribute $ha: " . $h->getAttribute($ha) . PHP_EOL;
    }
}

# retrieves c:Instructions nodes that are under the c:Instructions node
foreach ($xp->query("/c:Instructions/c:Instructions") as $i) {
    # you can expand the list of attributes here
    foreach (array("Id", "accountToken") as $ia) {
        print "Instruction attrib $ia: " . $i->getAttribute($ia) . PHP_EOL;
    }
}

# footer information    
foreach ($xp->query("/c:Instructions/c:Footer") as $f) {
    foreach (array("batchCount", "totalAmount") as $fa) {
        print "footer attribute $fa: " . $f->getAttribute($fa) . PHP_EOL;
    }
}
Output for the XML you posted:

header attribute uio: a881-aa05-1231391408a2
header attribute batchId: c7-8ef6-eb81b345e736
header attribute creationDate: 2014-08-10T00:00:00.000Z
Instruction attrib Id: 4178- a6dd-d1459cda71c3
Instruction attrib accountToken: 0001578066518896635248066746078163233357907196
Instruction attrib Id: -08eb-43dd-884b-ccae980372f8
Instruction attrib accountToken: 000227229359641325887385737985006
Instruction attrib Id: bf48-4f86-84f6-df69432ef65b
Instruction attrib accountToken: 0002108430726669005078952425
Instruction attrib Id: bf48-4f86-84f6-df69432ef65b
Instruction attrib accountToken: 000341863769868297728447318744937673
Instruction attrib Id: 42f0-4850-9e33-54e7d79927d9
Instruction attrib accountToken: 000521692104031759552776822005
Instruction attrib Id: 42f0-4850-9e33-54e7d79927d9
Instruction attrib accountToken: 0005216177101271759552776822005
footer attribute batchCount: 6
footer attribute totalAmount: 144
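If printing is not enough, one possible way to "do something fancier" is to collect the attributes into arrays and compare them with the footer. The following is only a sketch under assumptions: the XML is a shortened, made-up sample in the same shape as the posted file, the variable names are illustrative, and whether the footer's totalAmount is meant to equal the sum of the Amount attributes is not stated in the question, so the sum is printed rather than checked.

```php
<?php
# Hypothetical, shortened sample with the same structure as the posted document.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<c:Instructions xmlns:c="http://www.localhost.com/platform">
  <c:Header uio="u-1" batchId="b-1" creationDate="2014-08-10T00:00:00.000Z" />
  <c:Instructions accountToken="0001" Id="id-1" setId="100" Amount="27.00" />
  <c:Instructions accountToken="0002" Id="id-2" setId="200" Amount="12.24" />
  <c:Footer batchCount="2" totalAmount="39" />
</c:Instructions>
XML;

$doc = new DOMDocument;
$doc->loadXML($xml);

$xp = new DOMXPath($doc);
$xp->registerNamespace("c", 'http://www.localhost.com/platform');

# one associative array per c:Instructions child of the root
$instructions = array();
foreach ($xp->query("/c:Instructions/c:Instructions") as $i) {
    $instructions[] = array(
        'accountToken' => $i->getAttribute('accountToken'),
        'Id'           => $i->getAttribute('Id'),
        'setId'        => $i->getAttribute('setId'),
        'Amount'       => (float) $i->getAttribute('Amount'),
    );
}

# the footer is expected to occur once, so take the first match
$footer = $xp->query("/c:Instructions/c:Footer")->item(0);
$batchCount = (int) $footer->getAttribute('batchCount');

print count($instructions) . " instructions found, footer batchCount is $batchCount" . PHP_EOL;
print "sum of Amount: " . array_sum(array_column($instructions, 'Amount')) . PHP_EOL;
print "footer totalAmount: " . $footer->getAttribute('totalAmount') . PHP_EOL;
```

Keeping accountToken and setId as strings is deliberate: they are long digit sequences with leading zeros, which would be lost or overflow if cast to integers.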
Addendum: if you want all of an element's attributes, it may be quicker to use the equivalent code with SimpleXMLElement:

$sxe = new SimpleXMLElement( $xml_source );
$sxe->registerXPathNamespace("c", 'http://www.localhost.com/platform');

# e.g. get the header data
foreach ($sxe->xpath("/c:Instructions/c:Header") as $i) {
    # iterate through all the element attributes
    foreach ($i->attributes() as $name => $value) {
        print "header attribute $name is $value" . PHP_EOL;
    }
}
Output:

header attribute uio is a881-aa05-1231391408a2
header attribute batchId is c7-8ef6-eb81b345e736
header attribute creationDate is 2014-08-10T00:00:00.000Z
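The same SimpleXMLElement approach can be applied to the body records. As a rough sketch (again using a shortened, made-up sample rather than the full posted file), this collects every attribute of each c:Instructions child into an array. The (string) cast matters because attributes() yields SimpleXMLElement objects, not plain strings.

```php
<?php
# Hypothetical, shortened sample with the same structure as the posted document.
$xml_source = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<c:Instructions xmlns:c="http://www.localhost.com/platform">
  <c:Header uio="u-1" batchId="b-1" creationDate="2014-08-10T00:00:00.000Z" />
  <c:Instructions accountToken="0001" Id="id-1" setId="100" Amount="27.00" />
  <c:Instructions accountToken="0002" Id="id-2" setId="200" Amount="12.24" />
  <c:Footer batchCount="2" totalAmount="39" />
</c:Instructions>
XML;

$sxe = new SimpleXMLElement($xml_source);
$sxe->registerXPathNamespace("c", 'http://www.localhost.com/platform');

$rows = array();
foreach ($sxe->xpath("/c:Instructions/c:Instructions") as $i) {
    $row = array();
    # copy every attribute, whatever it is called, as a plain string
    foreach ($i->attributes() as $name => $value) {
        $row[$name] = (string) $value;
    }
    $rows[] = $row;
}

print_r($rows);
```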

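That's a strange XML file; child elements with the same name as the parent but a different function?
What does print_r($rows) show before the foreach loop?
@miken32 print_r($rows) won't show anything, because there are no nodes named Instruction in the document.
This doesn't answer the question (although the observation is correct!).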