Comparing multiple values with SPARQL (recipe ingredients)


sparql rdf


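We have quite a large number of medieval recipes whose structured data is stored as RDF in a triple store (Blazegraph). The ingredients are Wikidata items, and each recipe is part of a recipe collection: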

<https://example.com/b2.1> schema:recipeIngredient <http://www.wikidata.org/entity/Q10987>,
    <http://www.wikidata.org/entity/Q15046077>,
    <http://www.wikidata.org/entity/Q15622897>,
    <http://www.wikidata.org/entity/Q42527>.
<https://example.com/b2.1> ex:isPartOfCollection <https://example.com/b2>.


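Now I want to query for recipes that have exactly the same ingredients and get, for each collection, the number of intersecting recipes (i.e. recipes whose ingredients exactly match those of another recipe) and the number of unique recipes (i.e. recipes for which no recipe with the same ingredients is found in any other collection). I wrote a very expensive query: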

prefix corema: <https://gams.uni-graz.at/o:corema.ontology#>
prefix schema: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT (?coll as ?collectionURI) (count(distinct ?r2) as ?NoOfrecipesWithExcatSameIngredients) (group_concat(?r1;separator=", ") as ?UriOfrecipesWithExcatSameIngredients)
WHERE {
    {
        SELECT (group_concat(?ing1;
                separator=", ") as ?ingr1) ?r1 WHERE 
        {
            ?r1 a schema:Recipe;
                schema:recipeIngredient ?ing1.
        }
        GROUP BY ?r1
        ORDER BY ?r1
    }
    {
        SELECT (group_concat(?ing2;
                separator=", ") as ?ingr2) ?r2 ?coll WHERE 
        {
            ?r2 a schema:Recipe;
                schema:recipeIngredient ?ing2;
                corema:isPartOfCollection ?coll.
        }
        GROUP BY ?r2 ?coll
        ORDER BY ?r2
    }
    FILTER(?ingr1=?ingr2)
    FILTER(?r1!=?r2)
}
GROUP BY ?coll
ORDER BY ?collectionURI
LIMIT 10000


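It works, but 1) I don't know how to get the unique recipes, and 2) there is the problem of unreliable ordering in group_concat (GraphDB seems to stay stable, whereas Blazegraph gives me different numbers every time I run the query).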

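The output should be (for example):

collection | number of intersecting recipes | number of unique recipes | intersects with (recipe URIs) | unique recipe URIs
b2 | 12 | 4 | ex:b4.2, ex:b5.6, ex:gr1.7 | ex:b2.2, ex:b2.6, ex:b2.1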
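The "number of unique recipes" column could be approached roughly as follows. This is only a sketch, not tested against the real data: it assumes the corema:isPartOfCollection property and schema:Recipe typing used in the query above, and the variable names (?other, ?c1, ?c2, ?uniqueRecipes) are made up for illustration. It first finds every recipe that has a counterpart with exactly the same ingredient set in a different collection (two recipes match when neither has an ingredient the other lacks), then subtracts those recipes with MINUS and counts what is left per collection.

```sparql
PREFIX corema: <https://gams.uni-graz.at/o:corema.ontology#>
PREFIX schema: <http://schema.org/>

SELECT ?coll (COUNT(DISTINCT ?r) AS ?uniqueRecipes)
WHERE {
  ?r a schema:Recipe ;
     corema:isPartOfCollection ?coll .

  # Remove every recipe that has an exact ingredient match elsewhere.
  MINUS {
    SELECT DISTINCT ?r
    WHERE {
      ?r     a schema:Recipe ; corema:isPartOfCollection ?c1 .
      ?other a schema:Recipe ; corema:isPartOfCollection ?c2 .
      FILTER (?r != ?other && ?c1 != ?c2)

      # ?r has no ingredient that ?other lacks ...
      FILTER NOT EXISTS {
        ?r schema:recipeIngredient ?ing .
        FILTER NOT EXISTS { ?other schema:recipeIngredient ?ing }
      }
      # ... and ?other has no ingredient that ?r lacks.
      FILTER NOT EXISTS {
        ?other schema:recipeIngredient ?ing .
        FILTER NOT EXISTS { ?r schema:recipeIngredient ?ing }
      }
    }
  }
}
GROUP BY ?coll
ORDER BY ?coll
```

A collection in which every recipe has a match produces no row at all rather than a count of 0, and recipes without any ingredient would vacuously match each other, so both cases may need separate handling. The pairwise comparison is still quadratic in the number of recipes.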

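There is a limitation in SPARQL: the group_concat function has no notion of ordering, so the resulting string could be "i1, i2" for one recipe and "i2, i1" for another, and the two would compare as different even though the ingredients are the same. Even though you used ORDER BY in both subqueries, there is no guarantee that the order is preserved; you may simply be lucky, depending on the triple store and its implementation.

You say "it works", but what does the result look like? The performance is certainly poor, since the Cartesian product of the two subqueries can be expensive. You could, however, use a Blazegraph feature that avoids computing the same subquery twice.

The performance problem is gone now. But it really is a big problem that group_concat cannot be ordered :/ So at the moment I get the collection, then the (supposed) number of identical recipes, and what should be the URIs of the recipes with the same ingredients.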
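One way to sidestep the ordering problem is to avoid building strings altogether and compare the ingredient sets directly with a double negation. The following is a minimal sketch in standard SPARQL 1.1, assuming the schema:Recipe and schema:recipeIngredient modelling from the question; it has not been run on Blazegraph, and the variable names are illustrative. Two recipes are reported as a pair when there is no ingredient of the first that the second lacks, and no ingredient of the second that the first lacks.

```sparql
PREFIX schema: <http://schema.org/>

# Pairs of distinct recipes whose ingredient sets are identical,
# found without concatenating the ingredients into a string.
SELECT ?r1 ?r2
WHERE {
  ?r1 a schema:Recipe .
  ?r2 a schema:Recipe .
  FILTER (?r1 != ?r2)

  # No ingredient of ?r1 is missing from ?r2 ...
  FILTER NOT EXISTS {
    ?r1 schema:recipeIngredient ?ing .
    FILTER NOT EXISTS { ?r2 schema:recipeIngredient ?ing }
  }
  # ... and no ingredient of ?r2 is missing from ?r1.
  FILTER NOT EXISTS {
    ?r2 schema:recipeIngredient ?ing .
    FILTER NOT EXISTS { ?r1 schema:recipeIngredient ?ing }
  }
}
ORDER BY ?r1 ?r2
```

Because the comparison works on sets of triples, the result does not depend on any ordering inside the store. Each matching pair appears twice (once in each direction), and the query still compares every recipe with every other recipe, so it may remain expensive on a large dataset.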