Neo4j: Why does a Cypher query take 10 times longer when using count()?

I started with the following query:

PROFILE
MATCH Base = (SBase:Snapshot {timestamp:1454983481.304583})-[:contains]->()
MATCH Prime = (:Snapshot {timestamp:1454983521.642284})-[PContains:contains]->(SPrimePackage)
WHERE NOT (SBase)-[:contains]->(SPrimePackage)
RETURN PContains
LIMIT 10
I get "5834 total db hits in 119 ms". The graph correctly shows 9 nodes and the 8 edges connecting them. I then ran a nearly identical query, except that it returns count(distinct()).

That gave "1382270 total db hits in 1771 ms". The result is correct: 8. But why is count(distinct()) so much slower and so much more expensive? Should I be doing this some other way?

I am running Neo4j 2.3.1.

Edit 1

To make sure I am comparing apples to apples, and to highlight the problem, here is a pair of similar queries and their results:

MATCH Base = (SBase:Snapshot {timestamp:1454983481.304583})-[:contains]->()
MATCH Prime = (:Snapshot {timestamp:1454983521.642284})-[PContains:contains]->(SPrimePackage)
WHERE NOT (SBase)-[:contains]->(SPrimePackage)
RETURN SPrimePackage
LIMIT 10
Note that it returns "SPrimePackage" rather than the original "PContains". The result is "5834 total db hits in 740 ms".

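I then ran exactly the same query with "count()".

The result: "1382270 total db hits in 2731 ms". Note that the only difference is "count()". Intuitively, I would expect "count()" to add a single counting step, but it clearly does far more than that. Why does "count()" trigger all this extra work?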

[UPDATED]

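If you compare the PROFILE output of the two (edited) queries, you will probably find that the only significant difference is that the COUNT() version of the query contains an EagerAggregation operation. Aggregation functions use EagerAggregation to collect in memory all of the data being aggregated before the aggregation function (in this case COUNT()) is actually performed. That extra work is not needed when no aggregation function is used.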
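The effect described above can be sketched with a small Python simulation. This is only an analogy, not Neo4j's actual execution engine: `matched_rows` is a hypothetical stand-in for the row stream produced by the MATCH clauses, and the figures (1000 rows, 8 distinct values) are made up for illustration. It shows how a lazily evaluated LIMIT 10 can stop after 10 rows, while an aggregate has to consume every row before it can emit its single result.

```python
from itertools import islice

TOTAL_ROWS = 1000  # hypothetical number of rows the MATCH clauses could produce


def matched_rows(total, counter):
    """Hypothetical stand-in for the MATCH pipeline.

    Yields rows lazily and records how many rows were actually produced.
    """
    for i in range(total):
        counter["produced"] += 1
        yield i % 8  # only 8 distinct values in this toy stream


# RETURN x LIMIT 10: the consumer stops pulling rows after the tenth one.
limit_counter = {"produced": 0}
limited = list(islice(matched_rows(TOTAL_ROWS, limit_counter), 10))

# RETURN count(DISTINCT x): every row must be consumed before the result exists.
count_counter = {"produced": 0}
distinct_count = len(set(matched_rows(TOTAL_ROWS, count_counter)))

print(limit_counter["produced"], count_counter["produced"], distinct_count)
```

In this sketch the LIMIT variant produces 10 rows and the counting variant produces all 1000, which is the same shape of gap as the db-hit totals reported in the question.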

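The following query still uses COUNT() to get the count, but it greatly reduces the amount of data that has to be aggregated, and therefore the amount of work needed in the EagerAggregation step: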

PROFILE
MATCH (SBase:Snapshot { timestamp:1454983481.304583 })
USING INDEX SBase:Snapshot(timestamp)
WHERE (SBase)-[:contains]->()
MATCH (s:Snapshot { timestamp:1454983521.642284 })-[:contains]->(SPrimePackage)
USING INDEX s:Snapshot(timestamp)
WHERE NOT (SBase)-[:contains]->(SPrimePackage)
RETURN COUNT(DISTINCT SPrimePackage)
LIMIT 10;
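The query above assumes that you have already created an index on :Snapshot(timestamp), which greatly speeds up the search for the 2 :Snapshot nodes:

CREATE INDEX ON :Snapshot(timestamp);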

With some simple data, the profile I get is:

+-------------------+----------------+------+---------+--------------------------------------+--------------------------------------+
| Operator          | Estimated Rows | Rows | DB Hits | Variables                            | Other                                |
+-------------------+----------------+------+---------+--------------------------------------+--------------------------------------+
| +ProduceResults   |              1 |    1 |       0 | COUNT(DISTINCT SPrimePackage)        | COUNT(DISTINCT SPrimePackage)        |
| |                 +----------------+------+---------+--------------------------------------+--------------------------------------+
| +Limit            |              1 |    1 |       0 | COUNT(DISTINCT SPrimePackage)        | Literal(10)                          |
| |                 +----------------+------+---------+--------------------------------------+--------------------------------------+
| +EagerAggregation |              1 |    1 |       0 | COUNT(DISTINCT SPrimePackage)        |                                      |
| |                 +----------------+------+---------+--------------------------------------+--------------------------------------+
| +AntiSemiApply    |              1 |    7 |       0 | anon[180], s -- SBase, SPrimePackage |                                      |
| |\                +----------------+------+---------+--------------------------------------+--------------------------------------+
| | +Expand(Into)   |              1 |    0 |      34 | anon[266] -- SBase, SPrimePackage    | (SBase)-[:contains]->(SPrimePackage) |
| | |               +----------------+------+---------+--------------------------------------+--------------------------------------+
| | +Argument       |              4 |    8 |       0 | SBase, SPrimePackage                 |                                      |
| |                 +----------------+------+---------+--------------------------------------+--------------------------------------+
| +CartesianProduct |              4 |    8 |       0 | SBase -- anon[180], SPrimePackage, s |                                      |
| |\                +----------------+------+---------+--------------------------------------+--------------------------------------+
| | +Expand(All)    |              4 |    8 |      10 | anon[180], SPrimePackage -- s        | (s)-[:contains]->(SPrimePackage)     |
| | |               +----------------+------+---------+--------------------------------------+--------------------------------------+
| | +NodeIndexSeek  |              2 |    2 |       4 | s                                    | :Snapshot(timestamp)                 |
| |                 +----------------+------+---------+--------------------------------------+--------------------------------------+
| +SemiApply        |              1 |    2 |       0 | SBase                                |                                      |
| |\                +----------------+------+---------+--------------------------------------+--------------------------------------+
| | +Expand(All)    |              4 |    0 |       2 | anon[112], anon[126] -- SBase        | (SBase)-[:contains]->()              |
| | |               +----------------+------+---------+--------------------------------------+--------------------------------------+
| | +Argument       |              2 |    2 |       0 | SBase                                |                                      |
| |                 +----------------+------+---------+--------------------------------------+--------------------------------------+
| +NodeIndexSeek    |              2 |    2 |       3 | SBase                                | :Snapshot(timestamp)                 |
+-------------------+----------------+------+---------+--------------------------------------+--------------------------------------+
Besides using the index, the query above:

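  • Does not bother to find all of the nodes that SBase contains, because we only need to find a single contained node in order to identify a matching SBase node. The SemiApply operation completes as soon as one (SBase)-[:contains]->() match is found, so the first MATCH clause produces 1 row per SBase instead of N rows. Based on the information in your question, I am guessing that N is about 8.
  • Has a Cartesian product that should be very fast, because both "legs" of the product should have low cardinality.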
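The row-count difference described in the first point can be illustrated with a toy Python model. This is a sketch under stated assumptions rather than a description of the real planner: the `contains` dictionary, the snapshot names, and the package ids are all hypothetical, and the two functions only mimic "expand every relationship" versus "check that at least one relationship exists".

```python
# Toy graph: snapshot name -> set of contained package ids (all hypothetical).
contains = {
    "base": {"a", "b", "c"},
    "prime": {"b", "c", "d", "e"},
}


def rows_full_expand(base, prime):
    """One row per (base relationship, prime-only package) pair.

    Mimics a first MATCH that expands every (SBase)-[:contains]->() relationship.
    """
    return [
        (b, p)
        for b in contains[base]
        for p in contains[prime]
        if p not in contains[base]
    ]


def rows_semi_apply(base, prime):
    """One row per prime-only package.

    Mimics an existence check: the base snapshot is kept once if it contains
    anything at all, and any() stops at the first contained node it sees.
    """
    if not any(True for _ in contains[base]):
        return []
    return [(base, p) for p in contains[prime] if p not in contains[base]]


full = rows_full_expand("base", "prime")
semi = rows_semi_apply("base", "prime")

# Rows to aggregate with full expansion, rows with the existence check,
# and the distinct package count (identical in both cases).
print(len(full), len(semi), len({p for _, p in full}))
```

Both variants give the same distinct count, but the full expansion hands N times as many rows to the aggregation step, where N is the number of nodes the base snapshot contains.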

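  • Count is not the problem; the difference is because the query spends its time on a large number of hits. I think the solution you are looking for is a path-based query, but I am not good at those, so I hope someone else can provide what you need.
  • Hello @Supamiu, thanks for your comment. If I run the query with count() instead of count(distinct()), I get "1382270 total db hits in 1454 ms". However, my count is then no longer what I want: 3296. Interestingly, 3296 divides evenly by 8, which is the answer I was looking for.
  • Oh, I was hoping it would be better without distinct... Try adding an index on the timestamp property.
  • This is definitely a helpful answer. Once I changed "MATCH Base=(SBase:Snapshot{timestamp:…})-[:contains]->()" to "MATCH(SBase:Snapshot{timestamp:…})", the result was 3363 total db hits in 1290 ms. That is a better approach, thank you. Strictly speaking, however, the question was "Why is there a difference between SPrimePackage and count(SPrimePackage)?" If you can revise your answer to highlight the answer to that question, I will accept it.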