Problem creating nested XML from flat XML


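I am trying to use XSLT to create nested XML from flat XML, but I find that it creates only one nest and ignores the rest of the records in the source XML.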

My XML input looks like this:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<!-- Data -->
<table name="ecatalogue">
  <!-- Row 1 -->
  <tuple>
    <atom name="irn">2470</atom>
    <atom name="EADUnitID">da.01</atom>
    <atom name="EADUnitTitle">Some title</atom>
    <tuple name="AssParentObjectRef" />
  </tuple>
    <!-- Row 2 -->
  <tuple>
    <atom name="irn">5416</atom>
    <atom name="EADUnitID">da.01.01</atom>
    <atom name="EADUnitTitle">Child of Some title</atom>
    <tuple name="AssParentObjectRef">
    <atom name="EADUnitTitle">Some Title</atom>
    <atom name="irn">2470</atom>
    </tuple>
  </tuple>
    <!-- Row 3 -->
  <tuple>
    <atom name="irn">6</atom>
    <atom name="EADUnitID">da.01.02</atom>
    <atom name="EADUnitTitle">Child of Some title 2</atom>
    <tuple name="AssParentObjectRef">
    <atom name="EADUnitTitle">Some Title</atom>
    <atom name="irn">2470</atom>
    </tuple>
  </tuple>
    <!-- Row 4 -->
  <tuple>
    <atom name="irn">8</atom>
    <atom name="EADUnitID">da.01.02.01</atom>
    <atom name="EADUnitTitle">3rd Generation</atom>
    <tuple name="AssParentObjectRef">
    <atom name="EADUnitTitle">Child of Some Title 2</atom>
    <atom name="irn">6</atom>
    </tuple>
  </tuple>
    <!-- Row 5 -->
  <tuple>
    <atom name="irn">1130</atom>
    <atom name="EADUnitID">da.02</atom>
    <atom name="EADUnitTitle">Another title</atom>
    <tuple name="AssParentObjectRef" />
  </tuple>
    <!-- Row 6 -->
  <tuple>
    <atom name="irn">54</atom>
    <atom name="EADUnitID">da.02.01</atom>
    <atom name="EADUnitTitle">Child of Another title</atom>
    <tuple name="AssParentObjectRef">
    <atom name="EADUnitTitle">Another Title</atom>
    <atom name="irn">1130</atom>
    </tuple>
  </tuple>
    <!-- Row 7 -->
  <tuple>
    <atom name="irn">16</atom>
    <atom name="EADUnitID">da.02.02</atom>
    <atom name="EADUnitTitle">Child of Another Title 2</atom>
    <tuple name="AssParentObjectRef">
    <atom name="EADUnitTitle">Another Title</atom>
    <atom name="irn">1130</atom>
    </tuple>
  </tuple>
    <!-- Row 8 -->
  <tuple>
    <atom name="irn">22</atom>
    <atom name="EADUnitID">da.02.02.01</atom>
    <atom name="EADUnitTitle">3rd Generation</atom>
    <tuple name="AssParentObjectRef">
    <atom name="EADUnitTitle">Child of Another Title 2</atom>
    <atom name="irn">16</atom>
    </tuple>
  </tuple>
</table>

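The XSLT should identify the top-level records and then nest the child records under them. For a top-level record, it should copy its irn and EADUnitTitle as TopID and TopTitle respectively. For each child, it should include the immediate ParentID and ParentTitle as well as the TopID and TopTitle. The output should look like this: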

<?xml version="1.0" encoding="UTF-8"?>
<table name="ecatalogue">
   <collection>
      <tuple>
         <atom name="irn">2470</atom>
         <atom name="EADUnitID">da.01</atom>
         <atom name="EADUnitTitle">Some title</atom>
         <atom name="TopTitle">Some title</atom>
         <atom name="TopID">2470</atom>
         <tuple name="children">
            <tuple>
               <atom name="irn">5416</atom>
               <atom name="EADUnitID">da.01.01</atom>
               <atom name="EADUnitTitle">Child of Some title</atom>
               <atom name="ParentTitle">Some title</atom>
               <atom name="ParentID">2470</atom>
               <atom name="TopTitle">Some title</atom>
               <atom name="TopID">2470</atom>
            </tuple>
            <tuple>
                <atom name="irn">6</atom>
               <atom name="EADUnitID">da.01.02</atom>
               <atom name="EADUnitTitle">Child of Some title 2</atom>
               <atom name="ParentTitle">Some title</atom>
               <atom name="ParentID">2470</atom>
               <atom name="TopTitle">Some title</atom>
               <atom name="TopID">2470</atom>
               <tuple name="children">
                  <tuple>
                    <atom name="irn">8</atom>
                    <atom name="EADUnitID">da.01.02.01</atom>
                    <atom name="EADUnitTitle">3rd Generation</atom>
                    <atom name="ParentTitle">Child of Some title 2</atom>
                    <atom name="ParentID">6</atom>
                    <atom name="TopTitle">Some title</atom>
                    <atom name="TopID">2470</atom>
                  </tuple>
               </tuple>
            </tuple>
         </tuple>
      </tuple>
   </collection>
   <collection>
      <tuple>
         <atom name="irn">1130</atom>
         <atom name="EADUnitID">da.02</atom>
         <atom name="EADUnitTitle">Another title</atom>
         <atom name="TopTitle">Another title</atom>
         <atom name="TopID">1130</atom>
         <tuple name="children">
            <tuple>
               <atom name="irn">54</atom>
               <atom name="EADUnitID">da.02.01</atom>
               <atom name="EADUnitTitle">Child of Another title</atom>
               <atom name="ParentTitle">Another title</atom>
               <atom name="ParentID">1130</atom>
               <atom name="TopTitle">Another title</atom>
               <atom name="TopID">1130</atom>
            </tuple>
            <tuple>
                <atom name="irn">16</atom>
               <atom name="EADUnitID">da.02.02</atom>
               <atom name="EADUnitTitle">Child of Another title 2</atom>
               <atom name="ParentTitle">Another title</atom>
               <atom name="ParentID">1130</atom>
               <atom name="TopTitle">Another title</atom>
               <atom name="TopID">1130</atom>
               <tuple name="children">
                  <tuple>
                    <atom name="irn">22</atom>
                    <atom name="EADUnitID">da.02.02.01</atom>
                    <atom name="EADUnitTitle">3rd Generation</atom>
                    <atom name="ParentTitle">Child of Another title 2</atom>
                    <atom name="ParentID">16</atom>
                    <atom name="TopTitle">Another title</atom>
                    <atom name="TopID">1130</atom>
                  </tuple>
               </tuple>
            </tuple>
         </tuple>
      </tuple>

....

   </collection>
</table>

The XSLT I have is:

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:key name="child" match="tuple" use="tuple[@name='AssParentObjectRef']/atom[@name='irn']" />

<xsl:template match="/table">
    <table name="ecatalogue">
        <collection>
            <xsl:apply-templates select="tuple[not(tuple[@name='AssParentObjectRef']/atom[@name='irn'])]"/>
        </collection>
    </table>
</xsl:template>

<xsl:template match="tuple">
    <tuple>
        <xsl:copy-of select="atom"/>
        <xsl:if test="key('child', atom[@name='irn'])">
            <tuple name="children">
                <xsl:apply-templates select="key('child', atom[@name='irn'])"/>
             </tuple>
        </xsl:if>
    </tuple>
</xsl:template>

</xsl:stylesheet>

While this groups the records, the output contains only one of these collections. From a file of 3524 records, I get a single collection of 24 records.

I tried replacing this part of the XSLT:

<xsl:template match="/table">
    <table name="ecatalogue">
        <collection>
            <xsl:apply-templates select="tuple[not(tuple[@name='AssParentObjectRef']/atom[@name='irn'])]"/>
        </collection>
    </table>
</xsl:template>

with:


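While this returns all the nested structures, it also copies the records inside the nests, so that they become collections in their own right.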

Any idea where I am going wrong?

Edit, 6 June 2017

When I use:

<xsl:template match="node()|@*">
       <xsl:copy>
         <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
    </xsl:template>

I get duplicates (note: the "id" in the example below was added for illustration):


Is there a way to remove the duplicates so that only the nested records remain?

Edit: problem tuples

 <!-- Row 3378 -->
  <tuple>
    <atom name="irn">115024</atom>
    <atom name="ObjectType">Archives</atom>
    <atom name="EADLevelAttribute">Series</atom>
    <atom name="EADUnitID">D42.PL.05</atom>
    <atom name="EADUnitTitle">Correspondence and Company Administration: Box Files</atom>
    <atom name="EADScopeAndContent">Box files of Port Line official company correspondence and administrative papers. These papers were collected towards historical research and include correspondence from earlier periods c.1890 although the bulk of the papers relate to the two periods 1937-1939 and 1949-1951.</atom>
    <atom name="EADBiographyOrHistory"></atom>
    <tuple name="AssParentObjectRef">
    </tuple>
    <atom name="EADArrangement">The papers in this series have been retained in the original order as stored by Port Line Ltd. The contents of each box file are listed as a typescript paper and have been listed in this catalogue. Box file titles have been listed in the title field of each item in this series.</atom>
    <atom name="EADUnitDate">1890-1952</atom>
    <table name="EADExtent_tab">
      <tuple>
        <atom name="EADExtent">7 boxes.</atom>
      </tuple>
    </table>
    <atom name="EADAccruals"></atom>
    <atom name="EADOtherFindingAid"></atom>
    <atom name="EADRelatedMaterial"></atom>
    <tuple name="EADAcquisitionInformationRef">
    </tuple>
    <atom name="EADAppraisalInformation"></atom>
    <atom name="EADSeparatedMaterial"></atom>
    <atom name="EADTitleProper"></atom>
    <atom name="EADPublicationStatement"></atom>
    <atom name="EADCustodialHistory"></atom>
    <atom name="EADSource"></atom>
    <atom name="EADNote"></atom>
    <atom name="EADAccessRestrictions">Some items in this series are closed access.</atom>
    <atom name="EADUseRestrictions"></atom>
  </tuple>

  <!-- Row 3379 -->
  <tuple>
    <atom name="irn">115025</atom>
    <atom name="ObjectType">Archives</atom>
    <atom name="EADLevelAttribute">Item</atom>
    <atom name="EADUnitID">D42.PL.05.01</atom>
    <atom name="EADUnitTitle">File: Australian Homeward Trade</atom>
    <atom name="EADScopeAndContent">Various papers relating to Australian Homeward Trade and includes the following:For proof copies of the Australian Homeward Agreement see D42/PL5/6.</atom>
    <atom name="EADBiographyOrHistory"></atom>
    <tuple name="AssParentObjectRef">
      <atom name="EADUnitTitle">Correspondence and Company Administration: Box Files</atom>
      <atom name="irn">115024</atom>
    </tuple>
    <atom name="EADArrangement"></atom>
    <atom name="EADUnitDate">1920-1936</atom>
    <table name="EADExtent_tab">
      <tuple>
        <atom name="EADExtent">1 file.</atom>
      </tuple>
    </table>
    <atom name="EADAccruals"></atom>
    <atom name="EADOtherFindingAid"></atom>
    <atom name="EADRelatedMaterial"></atom>
    <tuple name="EADAcquisitionInformationRef">
    </tuple>
    <atom name="EADAppraisalInformation"></atom>
    <atom name="EADSeparatedMaterial"></atom>
    <atom name="EADTitleProper"></atom>
    <atom name="EADPublicationStatement"></atom>
    <atom name="EADCustodialHistory"></atom>
    <atom name="EADSource"></atom>
    <atom name="EADNote"></atom>
    <atom name="EADAccessRestrictions"></atom>
    <atom name="EADUseRestrictions"></atom>
  </tuple>


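If you want to create a collection element for each top-level parent tuple, I think all you need to do is add an xsl:for-each that selects the parents, and move the creation of the collection element inside it: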

<xsl:template match="/table">
    <table name="ecatalogue">
        <xsl:for-each select="tuple[not(tuple[@name='AssParentObjectRef']/atom[@name='irn'])]">
            <collection>
                <xsl:apply-templates select="." />
            </collection>
        </xsl:for-each>
    </table>
</xsl:template>

This is a bit long; I tried to address all the related issues that caught my attention.

Existing templates

First, ...
<xsl:template match="/table">
    <table name="ecatalogue">
        <collection>
            <xsl:apply-templates select="tuple[not(tuple[@name='AssParentObjectRef']/atom[@name='irn'])]"/>
        </collection>
    </table>
</xsl:template>

select="tuple[not(tuple[@name='AssParentObjectRef']/atom[@name='irn'])]"

select="tuple[not(tuple/*)]"

<xsl:key name="child" match="tuple" use="tuple[@name='AssParentObjectRef']/atom[@name='irn']" />

<xsl:template match="tuple">
    <tuple>
        <xsl:copy-of select="atom"/>

        <!-- If this `tuple` is a parent (i.e. if it's included in
             the list of parent IDs in the key), then we add a
             wrapper for the children and process the children.  -->
        <xsl:if test="key('child', atom[@name='irn'])">
            <tuple name="children">
                <!-- Now we apply templates to the `tuple`s 
                     in the key -->
                <xsl:apply-templates select="key('child', atom[@name='irn'])"/>
            </tuple>
        </xsl:if>
    </tuple>
</xsl:template>

<xsl:template match="node()|@*">
    <xsl:copy>
        <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="table">
    <xsl:copy>
        <xsl:copy-of select="@*"/>
        <!-- Target only parent-level `tuple`s.  This excludes
            child-level tuples, helping to prevent duplicates.-->
        <xsl:apply-templates select="tuple[not(tuple/*)]" mode="top"/>
    </xsl:copy>
</xsl:template>

<!-- Add the `collection` wrapper only to top-level tuples -->
<xsl:template match="tuple" mode="top">
    <collection>
        <!-- Pass on this tuple to the main `tuple` template -->
        <xsl:apply-templates select="."/>
    </collection>
</xsl:template>

<!-- This is the main template for processing `tuple` elements.
    Most of the changes needed are common to all `tuple`s, so it
    makes sense to keep all the logic in one place. -->
<xsl:template match="tuple">
    <tuple>
        <!-- Copy each existing `atom` child -->
        <xsl:copy-of select="atom"/>
        <!-- Add in metadata about parent and top-level ancestor titles and IDs -->
            <xsl:choose>
                <!-- If this is a top-level item, just use its own values -->
                <xsl:when test="not(tuple/*)">
                    <atom name="TopTitle"><xsl:value-of select="atom[@name='EADUnitTitle']"/></atom>
                    <atom name="TopID"><xsl:value-of select="atom[@name='irn']"/></atom>
                </xsl:when>
                <!-- If this is a descendant, we need to find its parent and its top-level ancestor -->
                <xsl:when test="tuple/*">
                    <atom name="ParentTitle"><xsl:value-of select="tuple/atom[@name='EADUnitTitle']"/></atom>
                    <atom name="ParentID"><xsl:value-of select="tuple/atom[@name='irn']"/></atom>
                <!-- For convenience, grab the top-level ancestor `tuple` and stuff it in a variable.
                    This is vaguely analogous to your use of `key`. -->

                <!-- Finding the top-level `tuple` is complicated by the fact that the ID values in
                    `<atom name="EADUnitID">` do not have a standardized format, other than that the whole
                    strings appear to consist of atomic values separated by single periods, with
                    descendant `EADUnitID` values appending to the ancestor values.  Examples:
                      Top:        `da.04`
                      Descendant: `da.04.11.02`
                      Top:        `D42.PL.05`
                      Descendant: `D42.PL.05.01`
                    So chunking the ID values is a problematic approach, since we don't know how many
                    chunks comprise the initial non-numeric portion: `da`, or `D42.PL`, or ... ???.
                    Top-level elements *do* also have empty `<tuple name="AssParentObjectRef">` elements.
                    So we _can_ find all the top-level elements, and then look in those for the one that
                    has an `EADUnitID` value that matches the start of the `EADUnitID` value of this
                    current `tuple`. -->

                <!-- The first predicate below grabs all the `tuple`s that have an empty
                    `tuple[@name='AssParentObjectRef']`. The second predicate keeps the one whose
                    `EADUnitID`, followed by a period, matches the start of the `EADUnitID` of the
                    current `tuple`. The period is appended so that `da.01` does not also match
                    an ID such as `da.010.02`. -->
                <xsl:variable name="top" select="/table/tuple[tuple[@name='AssParentObjectRef'][not(*)]]
                    [starts-with(current()/atom[@name='EADUnitID'], concat(atom[@name='EADUnitID'], '.'))]"/>
                    <!-- Now we can reference that variable to get the top-level ancestor values -->
                    <atom name="TopTitle"><xsl:value-of select="$top/atom[@name='EADUnitTitle']"/></atom>
                    <atom name="TopID"><xsl:value-of select="$top/atom[@name='irn']"/></atom>
                </xsl:when>
            </xsl:choose>
        <!-- Process any children of this tuple, based on `irn` values.
            Basically, we look for any other `tuple`s in the `table`
            that point to this current `tuple`'s `irn` value. -->
        <xsl:if test="/table/tuple[tuple/atom[@name='irn'] = current()/atom[@name='irn']]">
            <tuple name="children">
                <xsl:apply-templates select="/table/tuple[tuple/atom[@name='irn'] = current()/atom[@name='irn']]"></xsl:apply-templates>
            </tuple>
        </xsl:if>
    </tuple>
</xsl:template>
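
As an alternative to matching on the EADUnitID prefix, the top-level values could be handed down the recursion as template parameters, with the asker's original key used to find children. The following is only a sketch under stated assumptions, not a tested solution for the full 3524-record file: the parameter names topId and topTitle are made up for illustration, every AssParentObjectRef irn is assumed to point at a record that exists in the table, and the parent references are assumed to contain no cycles. Note that copy-of select="atom" copies only the direct atom children, so nested elements such as the EADExtent_tab table would need separate handling.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Index each record by the irn of the parent it refers to -->
  <xsl:key name="child" match="/table/tuple"
           use="tuple[@name='AssParentObjectRef']/atom[@name='irn']"/>

  <xsl:template match="/table">
    <table name="{@name}">
      <!-- One collection per record that has no parent irn -->
      <xsl:for-each select="tuple[not(tuple[@name='AssParentObjectRef']/atom[@name='irn'])]">
        <collection>
          <xsl:apply-templates select=".">
            <!-- Hypothetical parameter names: a top-level record is its own top -->
            <xsl:with-param name="topId" select="atom[@name='irn']"/>
            <xsl:with-param name="topTitle" select="atom[@name='EADUnitTitle']"/>
          </xsl:apply-templates>
        </collection>
      </xsl:for-each>
    </table>
  </xsl:template>

  <xsl:template match="tuple">
    <xsl:param name="topId"/>
    <xsl:param name="topTitle"/>
    <xsl:variable name="parent" select="tuple[@name='AssParentObjectRef']"/>
    <xsl:variable name="children" select="key('child', atom[@name='irn'])"/>
    <tuple>
      <xsl:copy-of select="atom"/>
      <!-- Parent metadata only for records that actually reference a parent -->
      <xsl:if test="$parent/atom[@name='irn']">
        <atom name="ParentTitle"><xsl:value-of select="$parent/atom[@name='EADUnitTitle']"/></atom>
        <atom name="ParentID"><xsl:value-of select="$parent/atom[@name='irn']"/></atom>
      </xsl:if>
      <atom name="TopTitle"><xsl:value-of select="$topTitle"/></atom>
      <atom name="TopID"><xsl:value-of select="$topId"/></atom>
      <xsl:if test="$children">
        <tuple name="children">
          <!-- Pass the same top-level values down to every descendant -->
          <xsl:apply-templates select="$children">
            <xsl:with-param name="topId" select="$topId"/>
            <xsl:with-param name="topTitle" select="$topTitle"/>
          </xsl:apply-templates>
        </tuple>
      </xsl:if>
    </tuple>
  </xsl:template>
</xsl:stylesheet>
```

In this sketch ParentTitle is taken from the text stored in AssParentObjectRef, so it reproduces whatever spelling or capitalisation the reference holds rather than the parent record's own EADUnitTitle. Because each record is reached only through its parent, a child should not be emitted a second time as its own collection.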
<table name="ecatalogue">
    <tuple>
        <atom name="irn">2470</atom>
        <atom name="EADUnitID">da.01</atom>
        <atom name="EADUnitTitle">Some title</atom>
        <tuple>
            <atom name="irn">5416</atom>
            <atom name="EADUnitID">da.01.01</atom>
            <atom name="EADUnitTitle">Child of Some title</atom>
        </tuple>
        <tuple>
            <atom name="irn">6</atom>
            <atom name="EADUnitID">da.01.02</atom>
            <atom name="EADUnitTitle">Child of Some title 2</atom>
            <tuple>
                <atom name="irn">8</atom>
                <atom name="EADUnitID">da.01.02.01</atom>
                <atom name="EADUnitTitle">3rd Generation</atom>
            </tuple>
        </tuple>
    </tuple>
</table>