How to group elements by their content (XSLT 2.0)?


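--Revised question--

Thanks to everyone who offered potential solutions, but they all match what I had already tried, so I think I should have been clearer. I have extended the XML a bit to make the problem more transparent.

The XML is actually a compilation of various files containing translated content. The aim is a single unified document that contains only unique English strings, each with one translation (after manual review and cleanup), so that it can be used as a translation memory. That is why it is currently one big file with a lot of redundant information.

Each Para line contains the English master (which may be repeated dozens of times within the file) and the translation variants. In quite a few cases this is easy, because all the translated versions are identical and I would end up with a single line, but in other cases it can be more complicated.

So, suppose that today I have 10 Para lines containing the same English content (#1), with 2 different German variants, 3 different French variants, and only one variant for each of the remaining locales. I need to get:

1 Para with: 1 EN / 2 DE (v1 and v2) / 3 FR (v1, v2 and v3) /

This is repeated for every group of unique English values in my list.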
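To pin down the target semantics independently of XSLT, here is a minimal Python sketch of the two-level grouping described above: first by the EN text, then by distinct (locale, translation) pairs. The function name group_translations and the shortened SAMPLE document are made up for illustration; this is not one of the proposed stylesheets, only a way to check what the expected output should contain.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample in the same shape as the question's XML.
SAMPLE = """<Books>
  <Para><EN>English Content #1</EN><DE>German Trans of #1 v1</DE><FR>French Trans of #1 v1</FR></Para>
  <Para><EN>English Content #1</EN><DE>German Trans of #1 v2</DE><FR>French Trans of #1 v1</FR></Para>
  <Para><EN>English Content #2</EN><DE>German Trans of #2 v1</DE><FR>French Trans of #2 v1</FR></Para>
</Books>"""


def group_translations(xml_text):
    """Return {english: {locale: [distinct translations, first-seen order]}}."""
    grouped = {}
    for para in ET.fromstring(xml_text).findall("Para"):
        english = para.findtext("EN")
        # Outer group: one entry per unique English string.
        locales = grouped.setdefault(english, {})
        for child in para:
            if child.tag == "EN":
                continue
            # Inner group: keep each translation once per locale.
            variants = locales.setdefault(child.tag, [])
            if child.text not in variants:
                variants.append(child.text)
    return grouped


result = group_translations(SAMPLE)
```

The nested dictionary mirrors the desired output: one Para per outer key, and one element per entry in each inner list.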

Revised XML:

<Books>
<!--First English String (#1) with number of potential translations -->
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <FR>French Trans of #1 v1</FR>
    <!-- More locales here -->
</Para>
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v2</DE>
    <FR>French Trans of #1 v1</FR>
    <!-- More locales here -->
</Para>
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <FR>French Trans of #1 v2</FR>
    <!-- More locales here -->
</Para>
<!--Second English String (#2) with number of potential translations -->
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v1</DE>
    <FR>French Trans of #2 v1</FR>
    <!-- More locales here -->
</Para>
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v3</DE>
    <FR>French Trans of #2 v1</FR>
    <!-- More locales here -->
</Para>
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v2</DE>
    <FR>French Trans of #2 v1</FR>
    <!-- More locales here -->
</Para>
<!--Loads of additional English Strings (#3 ~ #n) with number of potential translations -->
</Books>



The current solution gives me the following output:

<Books>
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <DE>German Trans of #1 v2</DE>
    <DE>German Trans of #2 v1</DE>
    <DE>German Trans of #2 v3</DE>
    <DE>German Trans of #2 v2</DE>
    <FR>French Trans of #1 v1</FR>
    <FR>French Trans of #1 v1</FR>
    <FR>French Trans of #1 v2</FR>
    <FR>French Trans of #2 v1</FR>
</Para>
</Books>



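So only the first EN tag is taken, and all the other tags are then grouped together regardless of differences in the English master string. What I am aiming for instead is the following: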

<Books>
<!-- First Grouped EN string and linked grouped translations -->
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <DE>German Trans of #1 v2</DE>
    <FR>French Trans of #1 v1</FR>
    <FR>French Trans of #1 v2</FR>
</Para>
<!-- Second Grouped EN string and linked grouped translations -->
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v1</DE>
    <DE>German Trans of #2 v3</DE>
    <DE>German Trans of #2 v2</DE>
    <FR>French Trans of #2 v1</FR>
</Para>
<!-- 3d to n Grouped EN string and linked grouped translations -->
</Books>



Extending the XSLT 2.0 answer to cover the update in the question:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="Books">
        <xsl:copy>
            <xsl:for-each-group select="*" 
                group-by="EN">
                <xsl:copy>
                   <xsl:copy-of select="EN"/>
                   <xsl:for-each-group select="current-group()/*[not(local-name()='EN')]"
                        group-by=".">
                        <xsl:sort select="local-name()"/>
                        <xsl:copy-of select="."/>
                    </xsl:for-each-group>
                </xsl:copy>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

The same result (and a shorter transformation) using XSLT 2.0 xsl:for-each-group:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="Books">
        <xsl:copy>
            <Para>
                <xsl:copy-of select="Para[1]/EN"/>
                <xsl:for-each-group select="Para/*[not(local-name()='EN')]" 
                            group-by=".">
                    <xsl:sort select="local-name()"/>
                    <xsl:copy>
                        <xsl:value-of select="."/>
                    </xsl:copy>
                </xsl:for-each-group>
            </Para>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

This transformation:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:key name="kLangByValAndText" match="Para/*[not(self::EN)]"
        use="concat(name(), '+++', .)"/>

    <xsl:template match="/">
        <Books>
            <Para>
                <xsl:copy-of select="/*/Para[1]/EN"/>
                <xsl:for-each select=
                    "/*/*/*[generate-id()
                           = generate-id(key('kLangByValAndText',
                                             concat(name(), '+++', .))[1])]">
                    <xsl:sort select="name()"/>
                    <xsl:copy-of select="."/>
                </xsl:for-each>
            </Para>
        </Books>
    </xsl:template>

</xsl:stylesheet>


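when applied to an extended version of the XML document (made more interesting by adding locale elements such as EN-US and Australian whose text is identical to the English string), produces the wanted, correct result: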


<Books>
<Para>
    <EN>Some English Content</EN>
    <Australian>Some English Content</Australian>
    <DE>German Trans v1</DE>
    <EN-US>Some English Content</EN-US>
    <FR>French Trans v1</FR>
    <FR>French Trans v2</FR>
</Para>
</Books>

Explanation: Muenchian grouping on a composite (two-part) key.
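For readers unfamiliar with the idiom, the following Python sketch mimics what Muenchian grouping does: index every node under its key, then keep only the node that is the first entry for its key. The helper name muenchian_distinct and the tiny SAMPLE document are assumptions for illustration; the '+++' separator mirrors the composite key used in the stylesheet.

```python
import xml.etree.ElementTree as ET

# Tiny, hypothetical input; not taken from the question.
SAMPLE = """<Books>
  <Para><EN>Hello</EN><DE>Hallo</DE><FR>Salut</FR></Para>
  <Para><EN>Hello</EN><DE>Hallo</DE><FR>Bonjour</FR></Para>
</Books>"""


def composite_key(node):
    # Counterpart of concat(name(), '+++', .) in the xsl:key definition.
    return node.tag + "+++" + (node.text or "")


def muenchian_distinct(xml_text):
    root = ET.fromstring(xml_text)
    nodes = [c for para in root.findall("Para") for c in para if c.tag != "EN"]

    # Counterpart of <xsl:key>: every node is indexed under its key.
    index = {}
    for node in nodes:
        index.setdefault(composite_key(node), []).append(node)

    # Counterpart of generate-id(.) = generate-id(key(...)[1]):
    # keep a node only if it is the first one stored under its key.
    firsts = [n for n in nodes if index[composite_key(n)][0] is n]

    # Counterpart of <xsl:sort select="name()"/>; sorted() is stable,
    # so document order is kept within one element name.
    return [(n.tag, n.text) for n in sorted(firsts, key=lambda n: n.tag)]


distinct = muenchian_distinct(SAMPLE)
```

The identity test (is) plays the role of comparing generate-id() values: it asks whether the current node is the very node that heads its key group, not merely an equal one.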
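Do note: grouping only on the translation text (as another answer to this question does) loses the Australian translation. Applying @empo's solution to the same document gives the following result (the Australian element is lost!):

<Books>
<Para>
    <EN>Some English Content</EN>
    <DE>German Trans v1</DE>
    <EN-US>Some English Content</EN-US>
    <FR>French Trans v1</FR>
    <FR>French Trans v2</FR>
</Para>
</Books>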
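The difference between the two keys can be shown in a few lines of Python. This is only a sketch: the helper distinct_by and the list of (element name, text) pairs are made up for illustration, with two locales deliberately sharing the same text.

```python
def distinct_by(items, key):
    """Keep the first item seen for each key value, preserving order."""
    seen = set()
    kept = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            kept.append(item)
    return kept


# Hypothetical (element name, text) pairs in document order.
translations = [
    ("DE", "German Trans v1"),
    ("FR", "French Trans v1"),
    ("EN-US", "Some English Content"),
    ("Australian", "Some English Content"),
    ("FR", "French Trans v2"),
]

# Key on the text alone: the second locale with identical text is dropped.
by_text_only = distinct_by(translations, key=lambda t: t[1])

# Key on (name, text): both locales survive.
by_name_and_text = distinct_by(translations, key=lambda t: t)
```

With the text-only key, whichever locale comes later in document order disappears, which is why the composite key includes the element name.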
Another Muenchian grouping, with a composite key for the sub-level:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" />

    <xsl:key name="english" match="EN" use="." />
    <xsl:key name="others" match="Para/*[not(self::EN)]"
        use="concat(../EN, ' ', ., ' ', name())" />

    <xsl:template match="/Books">
        <Books>
            <xsl:for-each select="Para/EN[generate-id() = generate-id(key('english', .)[1])]">
                <Para>
                    <xsl:copy-of select=".|key('english', .)/../*[not(self::EN)]
                        [generate-id() = generate-id(key('others',
                            concat(current(), ' ', ., ' ', name()))[1])]" />
                </Para>
            </xsl:for-each>
        </Books>
    </xsl:template>

</xsl:stylesheet>

Using Saxon 9, when I apply the stylesheet

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:strip-space elements="*"/>
    <xsl:output indent="yes"/>

    <xsl:template match="Books">
        <xsl:copy>
            <xsl:for-each-group select="Para" group-by="EN">
                <xsl:apply-templates select="."/>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="Para">
        <xsl:copy>
            <xsl:copy-of select="EN"/>
            <xsl:for-each-group select="current-group()/(* except EN)" group-by="node-name(.)">
                <xsl:for-each-group select="current-group()" group-by=".">
                    <xsl:copy-of select="."/>
                </xsl:for-each-group>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

to the input

<Books>
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <FR>French Trans of #1 v1</FR>
</Para>
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v2</DE>
    <FR>French Trans of #1 v1</FR>
</Para>
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <FR>French Trans of #1 v2</FR>
</Para>
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v1</DE>
    <FR>French Trans of #2 v1</FR>
</Para>
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v3</DE>
    <FR>French Trans of #2 v1</FR>
</Para>
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v2</DE>
    <FR>French Trans of #2 v1</FR>
</Para>
</Books>

I get the result

<Books>
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <DE>German Trans of #1 v2</DE>
    <FR>French Trans of #1 v1</FR>
    <FR>French Trans of #1 v2</FR>
</Para>
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v1</DE>
    <DE>German Trans of #2 v3</DE>
    <DE>German Trans of #2 v2</DE>
    <FR>French Trans of #2 v1</FR>
</Para>
</Books>

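Your example is confusing, especially because of the duplicated values. Could you show your first attempt at the XSLT, so we can see your existing logic?

+1 for a good question about grouping elements by content.

Good question, +1. See my answer for a solution that works correctly even for closely related languages that have exactly the same translation :)

When you ask a question about XSLT grouping, the answers will be completely different depending on whether you are using XSLT 1.0 or XSLT 2.0, so you really need to use XSLT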