Tree: Extract all parents of a given node
I'm trying to extract all parent nodes of each given GO Id (node) with a SPARQL query. I based my query on a similar question. The following two examples illustrate the problem:
Example 1:
In this example, I use the following query:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
SELECT (count(?mid) as ?depth)
       (group_concat(distinct ?midId ; separator = " / ") AS ?treePath)
FROM <http://rdf.ebi.ac.uk/dataset/go>
WHERE {
  obo:GO_0032259 rdfs:subClassOf* ?mid .
  ?mid rdfs:subClassOf* ?class .
  ?mid <http://www.geneontology.org/formats/oboInOwl#id> ?midId .
}
GROUP BY ?treePath
ORDER BY ?depth
However, when a term sits in more than one branch (GO:0007267, for example), this approach no longer works:
Example 2:
The result is:
c | treePath
--|---------------------------------------------------------------
15| GO:0007154 / GO:0007267 / GO:0008150 / GO:0009987 / GO:0023052
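For comparison, the per-term depth query shown further down returns this for GO:0032259 (Example 1):
depth | midId
------|------------
1     | GO:0008150
2     | GO:0008152
3     | GO:0032259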
What I want to get for GO:0007267 is:
GO:0008150 / GO:0009987 / GO:0007154 / GO:0007267
GO:0008150 / GO:0023052 / GO:0007267
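To make the target concrete, here is a minimal Python sketch of what "one line per branch" means. It is not part of the original query approach: the `PARENTS` map is hand-written from the two desired paths listed above, and `root_paths` is a hypothetical helper that assumes the hierarchy has no cycles.

```python
from typing import Dict, List

# Direct rdfs:subClassOf edges (child -> parents), written by hand from the
# two desired paths above. A stand-in for real data, not fetched from anywhere.
PARENTS: Dict[str, List[str]] = {
    "GO:0007267": ["GO:0007154", "GO:0023052"],
    "GO:0007154": ["GO:0009987"],
    "GO:0009987": ["GO:0008150"],
    "GO:0023052": ["GO:0008150"],
    "GO:0008150": [],
}


def root_paths(term: str, parents: Dict[str, List[str]]) -> List[List[str]]:
    """Return every root-to-term path, one list per branch (assumes no cycles)."""
    direct = parents.get(term, [])
    if not direct:
        # A term without parents is a root: the path is just the term itself.
        return [[term]]
    paths: List[List[str]] = []
    for parent in direct:
        for path in root_paths(parent, parents):
            paths.append(path + [term])
    return paths


for path in root_paths("GO:0007267", PARENTS):
    print(" / ".join(path))
```

With this toy map the two printed lines are the two paths listed above. The point is that each branch is followed separately, which a single grouped concatenation cannot express.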
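As I understand it, under the hood I'm computing the depth of each level and using it to build the path, which works fine when a term belongs to a single branch: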
SELECT (count(?mid) as ?depth) ?midId
FROM <http://rdf.ebi.ac.uk/dataset/go>
WHERE {
  obo:GO_0032259 rdfs:subClassOf* ?mid .
  ?mid rdfs:subClassOf* ?class .
  ?mid <http://www.geneontology.org/formats/oboInOwl#id> ?midId .
}
GROUP BY ?midId
ORDER BY ?depth
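In the second example the result comes out wrong and I don't understand why. I'm fairly sure part of the problem is that several terms share the same depth/level, but I don't know how to solve it: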
depth | midId
------|------------
2 | GO:0008150
2 | GO:0009987
2 | GO:0023052
3 | GO:0007154
6 | GO:0007267
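One way to see where these numbers might come from is a toy model in Python. It rests on an assumption of mine, not on anything stated about the endpoint: that every distinct route through the hierarchy produces its own match for `rdfs:subClassOf*`. The edge map is hand-written from the terms in this post, and `walks_up` and `depth_counts` are hypothetical helpers.

```python
from collections import Counter
from typing import Dict, Iterator, List

# Direct subClassOf edges (child -> parents), hand-written from this post.
PARENTS: Dict[str, List[str]] = {
    "GO:0007267": ["GO:0007154", "GO:0023052"],
    "GO:0007154": ["GO:0009987"],
    "GO:0009987": ["GO:0008150"],
    "GO:0023052": ["GO:0008150"],
    "GO:0032259": ["GO:0008152"],
    "GO:0008152": ["GO:0008150"],
    "GO:0008150": [],
}


def walks_up(term: str, parents: Dict[str, List[str]]) -> Iterator[str]:
    """Yield the end node of every upward walk from `term`, itself included.

    A node reachable by two routes is yielded twice (the modelling assumption).
    """
    yield term
    for parent in parents.get(term, []):
        yield from walks_up(parent, parents)


def depth_counts(start: str, parents: Dict[str, List[str]]) -> Counter:
    """Mimic count(?mid) grouped by ?mid for the two chained path patterns."""
    counts: Counter = Counter()
    for mid in walks_up(start, parents):        # start rdfs:subClassOf* ?mid
        for _cls in walks_up(mid, parents):     # ?mid rdfs:subClassOf* ?class
            counts[mid] += 1
    return counts


for term, depth in sorted(depth_counts("GO:0007267", PARENTS).items(), key=lambda kv: kv[1]):
    print(depth, term)
```

Under that assumption the model gives 1, 2, 3 for the single-branch term GO:0032259 and 2, 2, 2, 3, 6 for GO:0007267. In other words, the count is really the number of ancestors times the number of routes, so it only equals the depth when there is exactly one branch.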
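Thanks to @AKSW, I found a decent solution based on a GraphQL interface for querying and serving linked data on the Web. I'll leave the detailed answer here in case it helps someone.
The config.json file: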
{
  "name": "ebi-hgql",
  "schema": "ebischema.graphql",
  "server": {
    "port": 8081,
    "graphql": "/graphql",
    "graphiql": "/graphiql"
  },
  "services": [
    {
      "id": "ebi-sparql",
      "type": "SPARQLEndpointService",
      "url": "http://www.ebi.ac.uk/rdf/services/sparql",
      "graph": "http://rdf.ebi.ac.uk/dataset/go",
      "user": "",
      "password": ""
    }
  ]
}
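My ebischema.graphql file, reproduced further down, only declares what I need (Class, id, label and subClassOf). With it in place, I ran the following query: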
{
  Class_GET_BY_ID(uris:[
    "http://purl.obolibrary.org/obo/GO_0032259",
    "http://purl.obolibrary.org/obo/GO_0007267"]) {
    id
    label
    subClassOf {
      id
      label
      subClassOf {
        id
        label
      }
    }
  }
}
I got some interesting results:
{
  "extensions": {},
  "data": {
    "@context": {
      "_type": "@type",
      "_id": "@id",
      "id": "http://www.geneontology.org/formats/oboInOwl#id",
      "label": "http://www.w3.org/2000/01/rdf-schema#label",
      "Class_GET_BY_ID": "http://hypergraphql.org/query/Class_GET_BY_ID",
      "subClassOf": "http://www.w3.org/2000/01/rdf-schema#subClassOf"
    },
    "Class_GET_BY_ID": [
      {
        "id": ["GO:0032259"],
        "label": ["methylation"],
        "subClassOf": [
          {
            "id": ["GO:0008152"],
            "label": ["metabolic process"],
            "subClassOf": [
              {
                "id": ["GO:0008150"],
                "label": ["biological_process"]
              }
            ]
          }
        ]
      },
      {
        "id": ["GO:0007267"],
        "label": ["cell-cell signaling"],
        "subClassOf": [
          {
            "id": ["GO:0007154"],
            "label": ["cell communication"],
            "subClassOf": [
              {
                "id": ["GO:0009987"],
                "label": ["cellular process"]
              }
            ]
          },
          {
            "id": ["GO:0023052"],
            "label": ["signaling"],
            "subClassOf": [
              {
                "id": ["GO:0008150"],
                "label": ["biological_process"]
              }
            ]
          }
        ]
      }
    ]
  },
  "errors": []
}
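The nested response already keeps the branches apart, so turning it into path strings is a short recursive walk. The sketch below is an illustration only: `entry` is a trimmed, hand-copied piece of the response above with the labels left out, and `branches` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Trimmed copy of the GO:0007267 entry from the response above (labels omitted).
entry: Dict[str, Any] = {
    "id": ["GO:0007267"],
    "subClassOf": [
        {"id": ["GO:0007154"], "subClassOf": [{"id": ["GO:0009987"]}]},
        {"id": ["GO:0023052"], "subClassOf": [{"id": ["GO:0008150"]}]},
    ],
}


def branches(node: Dict[str, Any]) -> List[List[str]]:
    """Return one ancestor-first path per branch of a nested result node."""
    term = node["id"][0]
    parents = node.get("subClassOf", [])
    if not parents:
        # Deepest level the query asked for; not necessarily the real root.
        return [[term]]
    return [path + [term] for parent in parents for path in branches(parent)]


for path in branches(entry):
    print(" / ".join(path))
```

Note that the first path stops at GO:0009987 rather than at the root, because the query only asked for two nested subClassOf levels.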
To reach further ancestors, another nested subClassOf level has to be added by hand:
{
  Class_GET_BY_ID(uris:[
    "http://purl.obolibrary.org/obo/GO_0032259",
    "http://purl.obolibrary.org/obo/GO_0007267"]) {
    id
    label
    subClassOf {
      id
      label
      subClassOf {
        id
        label
        subClassOf { # <--- 4th sublevel
          id
          label
        }
      }
    }
  }
}
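Since the query grows by one subClassOf block per level, it can be generated instead of typed. The following is a rough sketch under my own assumptions: `build_query` is a hypothetical helper, the field names are the ones used in the queries above, and the URIs are assumed to contain no double quotes.

```python
from typing import List


def build_query(uris: List[str], levels: int) -> str:
    """Build a Class_GET_BY_ID query with `levels` nested subClassOf blocks."""

    def fields(remaining: int, indent: int) -> List[str]:
        pad = "  " * indent
        lines = [f"{pad}id", f"{pad}label"]
        if remaining > 0:
            lines.append(f"{pad}subClassOf {{")
            lines.extend(fields(remaining - 1, indent + 1))
            lines.append(f"{pad}}}")
        return lines

    # Assumes the URIs need no escaping inside a GraphQL string literal.
    uri_list = ", ".join(f'"{u}"' for u in uris)
    body = "\n".join(fields(levels, 2))
    return f"{{\n  Class_GET_BY_ID(uris: [{uri_list}]) {{\n{body}\n  }}\n}}"


print(build_query(["http://purl.obolibrary.org/obo/GO_0007267"], 3))
```

The depth still has to be chosen up front, so this only removes the typing, not the need to know how deep the deepest branch is.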
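The ebischema.graphql file mentioned above:
type __Context {
  Class: _@href(iri: "http://www.w3.org/2002/07/owl#Class")
  id: _@href(iri: "http://www.geneontology.org/formats/oboInOwl#id")
  label: _@href(iri: "http://www.w3.org/2000/01/rdf-schema#label")
  subClassOf: _@href(iri: "http://www.w3.org/2000/01/rdf-schema#subClassOf")
}
type Class @service(id:"ebi-sparql") {
  id: [String] @service(id:"ebi-sparql")
  label: [String] @service(id:"ebi-sparql")
  subClassOf: [Class] @service(id:"ebi-sparql")
}

Comments:

Not possible with SPARQL. It doesn't care about branches in the tree, only about pattern matching. Branches can't be distinguished in the query, so there is no way to walk each branch. Note that this is not what SPARQL is for; it is not a graph traversal language. GraphQL, Gremlin, etc. would be better suited to this use case. Note also that I'm only talking about a solution with a single SPARQL query. You can of course traverse the paths in the tree on the client side, with queries executed over several iterations.

@AKSW Thanks for your quick answer. I wrote a recursive function (using the Ontology Lookup Service API) that walks the tree and fetches all parents, but it took a long time to run, and I'm afraid I'd hit the same problem with the multiple queries you suggest (I'm iterating over a file with 50k+ GO IDs). I'll have a look at GraphQL and Gremlin.

Since your question is specific to Virtuoso, I'd suggest taking it somewhere Virtuoso developers can help more quickly. Virtuoso doesn't support GraphQL or Gremlin, but there may be other ways to reach your goal. It's also worth noting that EMBL-EBI is still running version 7.2.4.2 (07.20.3217) and should be encouraged to upgrade to 7.2.5.1 (07.20.3229, as of August 2018) or later.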
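On the client-side traversal suggested in the comments and the worry about 50k+ GO IDs: many of those IDs share ancestors, so caching each term's ancestors may cut the number of remote calls considerably. The sketch below only shows the caching idea. `make_ancestor_lookup` is a hypothetical helper, `fake_fetch` stands in for whatever remote call returns a term's direct parents, and the hierarchy is assumed to have no cycles.

```python
from typing import Callable, Dict, Iterable, List, Set


def make_ancestor_lookup(fetch_parents: Callable[[str], Iterable[str]]):
    """Wrap `fetch_parents` with a cache so each term is fetched at most once."""
    cache: Dict[str, Set[str]] = {}

    def ancestors(term: str) -> Set[str]:
        if term in cache:
            return cache[term]
        result: Set[str] = set()
        for parent in fetch_parents(term):
            result.add(parent)
            result |= ancestors(parent)  # assumes an acyclic hierarchy
        cache[term] = result
        return result

    return ancestors


# Stand-in for a remote lookup; edges are hand-written from this post.
EDGES: Dict[str, List[str]] = {
    "GO:0007267": ["GO:0007154", "GO:0023052"],
    "GO:0007154": ["GO:0009987"],
    "GO:0009987": ["GO:0008150"],
    "GO:0023052": ["GO:0008150"],
    "GO:0032259": ["GO:0008152"],
    "GO:0008152": ["GO:0008150"],
}
calls: List[str] = []


def fake_fetch(term: str) -> List[str]:
    calls.append(term)  # record every simulated remote call
    return EDGES.get(term, [])


ancestors = make_ancestor_lookup(fake_fetch)
for go_id in ["GO:0007267", "GO:0007154", "GO:0032259"]:
    print(go_id, sorted(ancestors(go_id)))
print("remote calls:", len(calls))
```

In this toy run the number of simulated calls equals the number of distinct terms touched, not the number of input IDs times their depth. Whether that is enough for a real 50k-ID file depends on how much the inputs overlap.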