SPARQL benchmark queries are slow in Neo4j
I am trying to implement SPARQL queries using Neo4j. I have created a Neo4j graph from triples. To summarize the data loading, my graph has the following structure:
Subject => Node
Predicate => Relationship
Object => Node
If the predicate has a primitive value (date, string, integer), a property is created instead of a relationship and stored on the node.
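The loading rule above can be sketched in plain Python. This is only an illustration of the mapping, not the loader actually used: the `Uri` marker class, the `load_triples` function, and the short predicate name `"type"` are all hypothetical, and the sketch assumes a type triple only adds a label without creating a separate node for the type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uri:
    """Hypothetical marker for objects that are resources rather than literals."""
    value: str


def load_triples(triples):
    """Map (subject, predicate, object) triples onto nodes, relationships and properties."""
    nodes = {}  # uri -> {"labels": set of labels, "props": dict of properties}
    rels = []   # (subject uri, predicate, object uri)

    def node(uri):
        return nodes.setdefault(uri, {"labels": set(), "props": {}})

    for subject, predicate, obj in triples:
        source = node(subject)
        if predicate == "type":
            # Assumption: a type triple becomes a label on the subject node.
            source["labels"].add(obj.value)
        elif isinstance(obj, Uri):
            # Resource object: create the target node and a relationship to it.
            node(obj.value)
            rels.append((subject, predicate, obj.value))
        else:
            # Primitive object (date, string, integer): store it as a property.
            source["props"][predicate] = obj
    return nodes, rels
```

With this mapping, an offer's price ends up as a property on the offer node, which is why the Cypher queries read `offer.price` directly.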
Now I am trying the following queries, which are very slow in Neo4j.
Query 4: Feature with the highest ratio between price with that feature and price without that feature.
The corresponding SPARQL query for this:
prefix bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
prefix bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
Select ?feature ((?sumF*(?countTotal-?countF))/(?countF*(?sumTotal-?sumF)) As ?priceRatio)
{
{ Select (count(?price) As ?countTotal) (sum(xsd:float(str(?price))) As ?sumTotal)
{
?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/ProductType294> .
?offer bsbm:product ?product ;
bsbm:price ?price .
}
}
{ Select ?feature (count(?price2) As ?countF) (sum(xsd:float(str(?price2))) As ?sumF)
{
?product2 a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/ProductType294> ;
bsbm:productFeature ?feature .
?offer2 bsbm:product ?product2 ;
bsbm:price ?price2 .
}
Group By ?feature
}
}
Order By desc(?priceRatio) ?feature
Limit 100
The Cypher query I created for this:
MATCH p1 = (offer1:Offer)-[r1:`product`]->(products1:ProductType294)
MATCH p2 = (offer2:Offer)-[r2:`product`]->(products2:ProductType294)-[:`productFeature`]->(features)
return (sum( DISTINCT offer2.price) * ( count( DISTINCT offer1.price) - count( DISTINCT offer2.price)) /(count(DISTINCT offer2.price)*(sum( DISTINCT offer1.price) - sum(DISTINCT offer2.price)))) AS cnt,features.__URI__ AS frui
ORDER BY cnt DESC,frui
This query is really slow. Please let me know if I have formulated it the wrong way.
Another query is Query 5: Show the most popular products of a specific product type for each country, by review count.
prefix bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
prefix bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
prefix rev: <http://purl.org/stuff/rev#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
Select ?country ?product ?nrOfReviews ?avgPrice
{
{ Select ?country (max(?nrOfReviews) As ?maxReviews)
{
{ Select ?country ?product (count(?review) As ?nrOfReviews)
{
?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/ProductType403> .
?review bsbm:reviewFor ?product ;
rev:reviewer ?reviewer .
?reviewer bsbm:country ?country .
}
Group By ?country ?product
}
}
Group By ?country
}
{ Select ?product (avg(xsd:float(str(?price))) As ?avgPrice)
{
?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/ProductType403> .
?offer bsbm:product ?product .
?offer bsbm:price ?price .
}
Group By ?product
}
{ Select ?country ?product (count(?review) As ?nrOfReviews)
{
?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/ProductType403> .
?review bsbm:reviewFor ?product .
?review rev:reviewer ?reviewer .
?reviewer bsbm:country ?country .
}
Group By ?country ?product
}
FILTER(?nrOfReviews=?maxReviews)
}
Order By desc(?nrOfReviews) ?country ?product
The Cypher query I created for this is the following:
MATCH (products2:ProductType403)<-[:`reviewFor`]-(reviews:Review)-[:`reviewer`]->(rvrs)-[:`country`]->(countries)
with count(reviews) AS reviewcount,products2.__URI__ AS pruis, countries.__URI__ AS cntrs
MATCH (products1:ProductType403)<-[:`product`]-(offer:Offer)
with AVG(offer.price) AS avgPrice, MAX(reviewcount) AS maxrevs, cntrs
MATCH (products2:ProductType403)<-[:`reviewFor`]-(reviews:Review)-[:`reviewer`]->(rvrs)-[:`country`]->(countries)
with avgPrice, maxrevs,countries, count(reviews) AS rvs, countries.__URI__ AS curis, products2.__URI__ AS puris
where maxrevs=rvs
RETURN curis,puris,rvs,avgPrice
Even this query is very slow. Am I formulating the queries the right way?
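As a reference point for checking the Cypher against the SPARQL, here is a plain-Python sketch of what the SPARQL for Query 5 computes. It is not Neo4j code. The function name `most_reviewed_products` and the input shapes (one `(product, country)` pair per review, one `(product, price)` pair per offer, all for the chosen product type) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def most_reviewed_products(reviews, offers):
    """Return (country, product, review_count, avg_price) rows, ordered like Query 5."""
    # Subquery 3: number of reviews per (country, product).
    review_counts = Counter((country, prod) for prod, country in reviews)

    # Subquery 1: highest review count per country.
    max_per_country = defaultdict(int)
    for (country, _), n in review_counts.items():
        max_per_country[country] = max(max_per_country[country], n)

    # Subquery 2: average offer price per product.
    prices = defaultdict(list)
    for prod, price in offers:
        prices[prod].append(price)
    avg_price = {prod: sum(p) / len(p) for prod, p in prices.items()}

    # Join on product and keep rows where nrOfReviews = maxReviews.
    rows = [
        (country, prod, n, avg_price[prod])
        for (country, prod), n in review_counts.items()
        if n == max_per_country[country] and prod in avg_price
    ]
    # Order By desc(?nrOfReviews) ?country ?product
    rows.sort(key=lambda r: (-r[2], r[0], r[1]))
    return rows
```

The point of the sketch is that the three aggregations are computed independently and only then joined on product and country, so the average price is always tied to a specific product.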
I have a Berlin benchmark dataset of 10 million triples.
Every type predicate is converted to a label.
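For Query 4, what I am trying to get is the feature with the highest ratio between the price with that feature and the price without that feature. Is this the right way to formulate the query?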
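To make the ratio concrete: the expression in the SPARQL query, `sumF*(countTotal-countF) / (countF*(sumTotal-sumF))`, is the average price of offers with the feature divided by the average price of offers without it. A minimal plain-Python sketch follows, assuming each offer is given as a hypothetical `(price, features)` pair; the function name `price_ratios` is made up for illustration.

```python
from collections import defaultdict


def price_ratios(offers):
    """Return {feature: avg price with the feature / avg price without it}."""
    count_total = 0
    sum_total = 0.0
    count_f = defaultdict(int)
    sum_f = defaultdict(float)

    # One pass: totals over all offers, plus per-feature counts and sums.
    for price, features in offers:
        count_total += 1
        sum_total += price
        for feature in set(features):
            count_f[feature] += 1
            sum_f[feature] += price

    ratios = {}
    for feature in count_f:
        count_without = count_total - count_f[feature]
        sum_without = sum_total - sum_f[feature]
        if count_without == 0 or sum_without == 0:
            continue  # ratio is undefined when no priced offer lacks the feature
        ratios[feature] = (sum_f[feature] * count_without) / (count_f[feature] * sum_without)
    return ratios
```

Note that the totals are computed once over all offers and the per-feature figures are grouped separately; no offer is ever paired with another offer.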
For Query 4, I do get the correct result from this query.
If I leave out the sum and count calculations, the query executes quickly.
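Thanks in advance. The SPARQL queries and related information can be found at:
To me these look like global graph queries.
What is the size of the dataset?
Are you creating a cartesian product between the two paths? Shouldn't the two paths be connected in some way?
Shouldn't there be a type property on the ProductType label, e.g. ProductType {type: 294}? If you had that, you would have an index on :ProductType(type) and :OrderNo.
I really don't understand this calculation: the delta of the distinct price counts, multiplied by the sum of the distinct offer2 prices, divided by the count of the distinct offer2 prices multiplied by the delta of the two price sums.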
MATCH (offer1:Offer)-[r1:`product`]->(products1:ProductType294)
MATCH (offer2:Offer)-[r2:`product`]->(products2:ProductType294)-[:`productFeature`]->(features)
RETURN (sum( DISTINCT offer2.price) *
( count( DISTINCT offer1.price) - count( DISTINCT offer2.price))
/ (count(DISTINCT offer2.price)*
(sum( DISTINCT offer1.price) - sum(DISTINCT offer2.price))))
AS cnt,features.__URI__ AS frui
ORDER BY cnt DESC,frui
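To illustrate the cartesian product concern with this query, here is a small plain-Python sketch using made-up prices (none of these numbers come from the dataset). Two patterns that share no variable pair every row of one with every row of the other, and aggregating with DISTINCT afterwards hides the duplication but also merges different offers that happen to have the same price.

```python
from itertools import product

# Hypothetical prices of all offers for the product type (first pattern).
all_offer_prices = [10, 20, 20, 30]
# Hypothetical (feature, offer price) rows from the second pattern.
feature_rows = [("f1", 10), ("f1", 20), ("f2", 20), ("f2", 20), ("f2", 30)]

# Unconnected patterns: every first-pattern row joins every second-pattern row.
joined = [(p1, feature, p2) for p1, (feature, p2) in product(all_offer_prices, feature_rows)]
row_count = len(joined)  # 4 * 5 rows

# Prices of offers with feature f2, before and after the join.
true_f2_prices = [p2 for feature, p2 in feature_rows if feature == "f2"]
joined_f2_prices = [p2 for _, feature, p2 in joined if feature == "f2"]

true_sum = sum(true_f2_prices)              # sum over the three real offers
inflated_sum = sum(joined_f2_prices)        # each offer counted once per first-pattern row
distinct_sum = sum(set(joined_f2_prices))   # DISTINCT drops one of the two offers priced 20
```

On real data the joined row count is the product of the two pattern sizes, which would be consistent with the query being fast without aggregation over the rows and slow with it. Aggregating the totals first and the per-feature values second avoids the join entirely.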
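Perhaps you could share your imported database and add a picture of the graph model you are using. Then we can try to help you. From the description alone, I don't really understand what you are doing, except that your queries appear to be graph-global queries.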