XSLT: counting co-occurrences of specific child elements

Tags: xml, xslt

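I am trying to count how often specific people co-occur in the event records of an XML document. My source document consists of event elements, which contain prose in p elements and bibliographic records in bibl elements; both contain references to people. I want to be able to count how often two people appear together in an event across the whole document. I have been using XSLT 2.0 but can switch to 3.0.

For example, how can I get the answer 3 for the number of times Nancy Drew and Dick Tracy appear together in the events below? Or 1 for Dick Tracy and Sam Spade?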

<listEvent>
        <event xml:id="e1">
           <p>pretium eget erat eu cursus. Duis pulvinar lectus sed quam vehicula tincidunt in
              vel nunc. Cras convallis elementum diam. Sed nec viverra magna. Then <name
                 SameAs="detectives.xml#ND">Nancy Drew</name> solved the case. A consequat
              tortor molestie ut. Praesent lobortis ipsum sit amet bibendum consequat. </p>

           <bibl><name SameAs="detectives.xml#DT">Tracy, Dick</name>. The Mysterious Case of the
              Orange Fish. Penguin Publishing. </bibl>
           <bibl><name SameAs="detectives.xml#SH">Holmes, Sherlock</name>. The Case of the Blue
              Carbuncle Penguin Publishing. </bibl>

        </event>
        <event xml:id="e2">
           <p> facilisis turpis eu, gravida enim. Mauris adipiscing magna consequat dolor
              auctor, sit amet tincidunt felis auctor. <name SameAs="detectives.xml#ND">Nancy
                 Drew</name> and <name SameAs="detectives.xml#DT">Dick Tracy</name> went into
              business together. Aliquam pharetra semper erat, at viverra tellus vestibulum
              quis. Sed facilisis convallis justo, suscipit fermentum lorem egestas nec.
              Phasellus in aliquam eros, vitae fringilla augue </p>

           <bibl><name SameAs="detectives.xml#TH">Hardy, Tom</name>. Growing Up Is Hard to Do:
              The Story of a Boy Detective. Knopf Press. </bibl>
           <bibl><name SameAs="detectives.xml#SH">Holmes, Sherlock</name>. The Case of the Blue
              Carbuncle. Penguin Publishing. </bibl>
           <bibl><name SameAs="detectives.xml#SH">Holmes, Sherlock</name>. The Hound of the
              Baskervilles. Arsenal Press. </bibl>

        </event>
        <event xml:id="e3">
           <p> Curabitur dapibus eu ligula sed elementum. Curabitur sit amet nisi dictum. <name
                 SameAs="detectives.xml#SS">Sam Spade</name> was the only detective in town.
              Donec cursus diam sem, astor. </p>

           <bibl><name SameAs="detectives.xml#TH">Hardy, Tom</name>. Growing Up Is Hard to Do:
              The Story of a Boy Detective. Knopf Press. </bibl>
           <bibl><name SameAs="detectives.xml#SS">Spade, Sam</name>. My Friends' Business
              Ventures. Knopf Press. </bibl>
           <bibl><name SameAs="detectives.xml#DN">Drew, Nancy</name>. Blonde and Curious.
              Arsenal Press.</bibl>

        </event>
        <event xml:id="e4">
           <p> Duis pulvinar lectus sed quam vehicula tincidunt in vel nunc. <name
                 SameAs="detectives.xml#ND">Nancy Drew</name> and <name
                 SameAs="detectives.xml#DT">Dick Tracy</name> made 110% profit that year. Cras
              convallis elementum diam. Sed nec viverra magna. A consequat tortor molestie ut.
              Praesent lobortis ipsum sit amet bibendum consequat. </p>

           <bibl><name SameAs="detectives.xml#SS">Spade, Sam</name>. My Friends' Business
              Ventures. Knopf Press. </bibl>
           <bibl><name SameAs="detectives.xml#MH">Holmes, Mycroft</name>. Sons and Brothers.
              Knopf Press. </bibl>
        </event>
     </listEvent>
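... where the @weight values are what I am having trouble calculating.

I have managed to assign each person a node @id. The node @ids then make up the @source and @target values. The first is Sam Spade and Dick Tracy, the second is Sam Spade and Nancy Drew, and @weight should be the number of times they appear together in one document. I simplified my example, which may have been annoying. In my actual source document there are a bunch of other attributes and values in every element, including an @n for each person's name, so populating @id, @source and @target with select values is a simple process.

@Tim, no worries: @SameAs points to an authority list, so however an individual's name is spelled in the text, it resolves to one person. For example, Lucy, Miss Graham and Mrs. L. Foster may all be names for the same woman (as a girl, then before and after her marriage), and a name may be inverted as in the bibliographic entries.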

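"No worries, @SameAs points to an authority list"

OK, but XSLT depends on what is in the XML source document, so the differing @SameAs values would have to be reconciled before the counting required here.

"In my actual source document there are a bunch of other attributes and values in every element, including an @n for each person's name"

OK, since that is not shown here, I am using the @SameAs attribute as if it were a distinct id. The following is actually an XSLT 1.0 stylesheet, augmented with the EXSLT set:distinct function. It is only a sketch, with some scaffolding left in, so that we can see whether it is heading in the right direction.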

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:set="http://exslt.org/sets"
extension-element-prefixes="set">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>

<!-- index every event by the @SameAs value of each name it contains -->
<xsl:key name="eventByID" match="event" use=".//name/@SameAs" />

<!-- one @SameAs attribute per distinct value (EXSLT set:distinct) -->
<xsl:variable name="distinct_nodes" select="set:distinct(/listEvent/event//name/@SameAs)" />
<!-- scaffolding: not used below -->
<xsl:variable name="root" select="/" />

<xsl:template match="/">
<graph>
    <nodes>
        <xsl:for-each select="$distinct_nodes">
            <node id="{.}"/>
        </xsl:for-each>
    </nodes>
    <edges>
        <!-- pair each id with every id that follows it, so each pair is visited once -->
        <xsl:for-each select="$distinct_nodes[not(position()=last())]">
            <xsl:variable name="source" select="." />
            <xsl:variable name="pos" select="position()" />
            <xsl:for-each select="$distinct_nodes[position()>$pos]">
                <xsl:variable name="target" select="." />
                <!-- events that mention the source and also mention the target -->
                <xsl:variable name="common_events" select="key('eventByID', $source)[@xml:id=key('eventByID', $target)/@xml:id]" />
                <xsl:if test="$common_events">
                    <edge source="{$source}" target="{$target}" weight="{count($common_events)}">
                    <!-- use this for test purposes -->
                        <!-- 
                        <xsl:for-each select="$common_events">
                            <event id="{@xml:id}"/>
                        </xsl:for-each>
                         -->
                    </edge>
                </xsl:if>
            </xsl:for-each>
        </xsl:for-each>
    </edges>
</graph>
</xsl:template>
</xsl:stylesheet>
Applied to the example XML, the result is:

<?xml version="1.0" encoding="utf-8"?>
<graph>
   <nodes>
      <node id="detectives.xml#ND"/>
      <node id="detectives.xml#DT"/>
      <node id="detectives.xml#SH"/>
      <node id="detectives.xml#TH"/>
      <node id="detectives.xml#SS"/>
      <node id="detectives.xml#DN"/>
      <node id="detectives.xml#MH"/>
   </nodes>
   <edges>
      <edge source="detectives.xml#ND" target="detectives.xml#DT" weight="3"/>
      <edge source="detectives.xml#ND" target="detectives.xml#SH" weight="2"/>
      <edge source="detectives.xml#ND" target="detectives.xml#TH" weight="1"/>
      <edge source="detectives.xml#ND" target="detectives.xml#SS" weight="1"/>
      <edge source="detectives.xml#ND" target="detectives.xml#MH" weight="1"/>
      <edge source="detectives.xml#DT" target="detectives.xml#SH" weight="2"/>
      <edge source="detectives.xml#DT" target="detectives.xml#TH" weight="1"/>
      <edge source="detectives.xml#DT" target="detectives.xml#SS" weight="1"/>
      <edge source="detectives.xml#DT" target="detectives.xml#MH" weight="1"/>
      <edge source="detectives.xml#SH" target="detectives.xml#TH" weight="1"/>
      <edge source="detectives.xml#TH" target="detectives.xml#SS" weight="1"/>
      <edge source="detectives.xml#TH" target="detectives.xml#DN" weight="1"/>
      <edge source="detectives.xml#SS" target="detectives.xml#DN" weight="1"/>
      <edge source="detectives.xml#SS" target="detectives.xml#MH" weight="1"/>
   </edges>
</graph>
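
A single weight can be spot-checked on its own by counting the events that contain both ids. The following is only a minimal sketch, assuming the sample document with listEvent as its root element; the parameter names a and b are hypothetical and not part of the stylesheet above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- hypothetical parameters: the two @SameAs values to compare -->
  <xsl:param name="a" select="'detectives.xml#ND'"/>
  <xsl:param name="b" select="'detectives.xml#DT'"/>

  <xsl:template match="/">
    <!-- keep only events that hold a name with id $a AND a name with id $b,
         whether the name sits in p or in bibl, then count those events -->
    <xsl:value-of select="count(/listEvent/event[.//name/@SameAs = $a][.//name/@SameAs = $b])"/>
  </xsl:template>
</xsl:stylesheet>
```

With the default parameter values and the sample document this should give 3 (events e1, e2 and e4), which matches the first edge in the result. Event e3 is not counted because its "Drew, Nancy" entry carries the different id detectives.xml#DN.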

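Do you mean you want to check every possible combination of two names? Perhaps you should post an example of what the output should look like, code-wise.

Your XML has a "Nancy Drew" and a "Drew, Nancy". Do you expect these to be treated as different names? Their @SameAs attributes are different too.

@michael.hor257k I like your thinking. The output should look like this: …/gefx>

I have never tried EXSLT before. @michael.hor257k Your result is exactly what I am looking for, but when I try the stylesheet I get a fatal error from Saxon EE 9.5.1.3, XTDE1425: cannot find a matching 1-argument function named {}distinct, even though the set namespace looks perfect. I tried Saxon 6.5.5 and Xalan, which complete the transformation but return empty nodes and graph elements. Is my diagnosis correct? Am I using the wrong processor?

@CLKC It works fine in both ... and .... In XSLT 2.0, I suppose you would have to write your own: ...

I think so; there may be some simplifications possible if you use an XSLT 2.0 processor.

Thanks for the pointer on XSLT 2.0. I found my mistake, and thank you very much for the stylesheet. I am too new here to be allowed to upvote the answer, but I will find someone who can, because your answer is correct.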
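
Following up on the XSLT 2.0 remarks in the comments, here is a minimal sketch of how the same approach might look without EXSLT. It is an assumption about one possible rewrite, not the answerer's code: it swaps set:distinct for the built-in distinct-values() function and compares the two event sets with the intersect operator. The variable names doc, ids and common are illustrative only.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <!-- same key as in the 1.0 stylesheet: events indexed by the ids they mention -->
  <xsl:key name="eventByID" match="event" use=".//name/@SameAs"/>

  <xsl:template match="/">
    <!-- keep the document node: inside a loop over atomic values there is
         no context node, so key() needs it as an explicit third argument -->
    <xsl:variable name="doc" select="."/>
    <!-- distinct ids as atomic values rather than attribute nodes -->
    <xsl:variable name="ids" select="distinct-values(/listEvent/event//name/@SameAs)"/>
    <graph>
      <nodes>
        <xsl:for-each select="$ids">
          <node id="{.}"/>
        </xsl:for-each>
      </nodes>
      <edges>
        <xsl:for-each select="$ids">
          <xsl:variable name="source" select="."/>
          <xsl:variable name="pos" select="position()"/>
          <!-- pair each id only with the ids after it, so each pair appears once -->
          <xsl:for-each select="$ids[position() gt $pos]">
            <xsl:variable name="target" select="."/>
            <!-- events that mention both ids -->
            <xsl:variable name="common"
                select="key('eventByID', $source, $doc) intersect key('eventByID', $target, $doc)"/>
            <xsl:if test="exists($common)">
              <edge source="{$source}" target="{$target}" weight="{count($common)}"/>
            </xsl:if>
          </xsl:for-each>
        </xsl:for-each>
      </edges>
    </graph>
  </xsl:template>
</xsl:stylesheet>
```

One caveat: the order of the values returned by distinct-values() is implementation-dependent, so the order of the node and edge elements, and which member of a pair ends up as source, may differ between processors. The weights themselves should not be affected.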