Neo4J collaborative filtering slower than expected

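I'm implementing a recommendation system on a Neo4J graph. I've just started working on the query I plan to use, but it runs much slower than I expected.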

Statistics

Neo4J Version: 2.3.1
Nodes: 820K
Relationships: 7.6M
Indexes
ON :LIKES(created_at)     ONLINE  
ON :Product(id)           ONLINE  
ON :Product(created_at)   ONLINE  
ON :User(id)              ONLINE  
ON :User(date_joined)     ONLINE

No constraints
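I've done a lot of research on query optimization, and as far as I can see I'm not making any of the common mistakes in the query structure (though I'm no expert).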

Here is a developer console with a test dataset:

Query

MATCH (u1:User {id: {user_id}})-[l1:LIKES]->(p1:Product)
WITH u1, l1, p1
ORDER BY p1.created_at DESC
LIMIT 10

MATCH (p1)<-[:LIKES]-(u2:User)
WHERE NOT u1=u2
WITH u1, l1, p1, u2, COUNT(u2) as rating
ORDER BY rating DESC
LIMIT 50

MATCH (u2)-[l2:LIKES]->(recommendation:Product)
WHERE NOT (p1)=(recommendation)
WITH recommendation, COUNT(recommendation) as weight
RETURN recommendation.id as id
ORDER BY weight DESC
LIMIT {limit}

Query profile output (run against a copy of our production dataset)

+-------------------+----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| Operator          | Estimated Rows | Rows   | DB Hits | Identifiers                                | Other                                                   |
+-------------------+----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +ProduceResults   |              7 |    100 |       0 | id                                         | id                                                      |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +Projection       |              7 |    100 |       0 | anon[382], id, recommendation, weight      | anon[382]                                               |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +Top              |              7 |    100 |       0 | anon[382], recommendation, weight          | Literal(100); weight                                    |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +Projection       |              7 | 129342 |  129342 | anon[382], recommendation, weight          | recommendation.id; weight                               |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +EagerAggregation |              7 | 129342 |       0 | recommendation, weight                     | recommendation                                          |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +Filter           |             44 | 442432 |  471953 | l1, l2, p1, rating, recommendation, u1, u2 | Ands(NOT(p1 == recommendation), recommendation:Product) |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +Expand(All)      |             44 | 472039 |  472089 | l1, l2, p1, rating, recommendation, u1, u2 | (u2)-[l2:LIKES]->(recommendation)                       |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +Top              |             10 |     50 |       0 | l1, p1, rating, u1, u2                     | Literal(50); rating                                     |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +EagerAggregation |             10 |    527 |       0 | l1, p1, rating, u1, u2                     | u1, l1, p1, u2                                          |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +Filter           |             92 |    563 |     563 | anon[82], anon[119], l1, p1, u1, u2        | Ands(NOT(u1 == u2), u2:User)                            |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +Expand(All)      |             92 |    574 |     584 | anon[82], anon[119], l1, p1, u1, u2        | (p1)<-[:LIKES]-(u2)                                     |
| |                 +----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+
| +NodeIndexSeek    |              1 |      1 |       2 | u1                                         | :User(id)                                               |
+-------------------+----------------+--------+---------+--------------------------------------------+---------------------------------------------------------+

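I've seen several case studies where people use Neo4j for real-time collaborative filtering, so I think it must be possible to run this kind of query on a dataset of this size. Am I being unrealistic? We're running this on an Amazon EC2 compute-optimized node (c4.large), so I assumed performance would be reasonably good.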

I'm scratching my head here and would really appreciate any input.

Cheers, David.

[Aside: when the developer console is reopened, the indexes are not recreated, so they have to be recreated manually.]
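
Recreating the indexes by hand could look like the sketch below. It assumes the Neo4j 2.x `CREATE INDEX ON` syntax and takes the label and property names from the index listing in the question. Only the three indexes most relevant to this query are shown, and each statement is meant to be run on its own.

```cypher
// Index used to find the starting user by id.
CREATE INDEX ON :User(id)

// Indexes on the Product properties the query reads (id and created_at).
CREATE INDEX ON :Product(id)

CREATE INDEX ON :Product(created_at)
```

Whether the console accepts several statements at once is not something the question states, so running them one at a time is the cautious choice.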

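I don't know if this is good enough for you, but you can eliminate about 44% of the DB hits in the profile results simply by not specifying labels for most of the nodes in the query (p1, u2, and recommendation):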

MATCH (u1:User {id: {user_id}})-[l1:LIKES]->(p1)
WITH u1, l1, p1
ORDER BY p1.created_at DESC
LIMIT 10

MATCH (p1)<-[:LIKES]-(u2)
WHERE NOT u1=u2
WITH u1, l1, p1, u2, COUNT(u2) as rating
ORDER BY rating DESC
LIMIT 50

MATCH (u2)-[l2:LIKES]->(recommendation)
WHERE NOT (p1)=(recommendation)
WITH recommendation, COUNT(recommendation) as weight
RETURN recommendation.id as id
ORDER BY weight DESC
LIMIT {limit}
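
Dropping the labels only returns the same recommendations if every LIKES relationship runs from a User node to a Product node; otherwise the label filters were doing real work. A quick sanity check, sketched here under that assumption, counts the relationships that break the pattern. The alias `unexpected_likes` is just an illustrative name.

```cypher
// Count LIKES relationships that do not go from a User to a Product.
// A result of 0 suggests the label checks in the original query were redundant.
MATCH (a)-[:LIKES]->(b)
WHERE NOT a:User OR NOT b:Product
RETURN count(*) AS unexpected_likes
```

This check visits every LIKES relationship, so it is better run once against a copy of the data than alongside the recommendation query. Prefixing either version of the recommendation query with `PROFILE` shows the DB hits per operator, which is how the two variants can be compared.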